> On May 11, 2015, 10:45 p.m., Andreas Hansson wrote: > > src/mem/packet_queue.cc, line 117 > > <http://reviews.gem5.org/r/2776/diff/1/?file=45138#file45138line117> > > > > I strongly disagree with this change. This buffer should not exist, and > > even 100 packets is a stretch. Any module hitting this needs to recondiser > > how it deals with flow control > > Brad Beckmann wrote: > I would agree with you that the buffer should not exist, but removing the > buffer goes well beyond this patch. The worst part about the buffer is the > size is not visible to the sender. It is a really bad design, but for now we > need to get around the immediate problem. Later we can discuss how and who > will fix this right. > > We've discussed the proposal in the past to increase the size, but > unfortunately we have not came to a resolution. We absolutely need to > resolve this now. We cannot use the ruby tester with our upcoming GPU model > without this change. > > The important thing to keep in mind is that the RubyTester is designed to > stress the protocol logic and it creates contention scenarios that would not > exist in a real system. The RubyTester finds bugs in the matter of seconds > that may not be encountered in hours of real workload simulation. It does > this by sending large bursts of racey requests. Bottomline: the RubyTester > needs this large value. > > Andreas Hansson wrote: > I would argue that if you need flow control, you should not use the > QueuedPorts, and rather use a "bare-metal" MasterPort, and deal with the flow > control. There is no point in adding flow control (or buffer visibility) to > the QueuePort in my opinion. > > I'd suggest to switch to a MasterPort either as part of this patch, or a > patch before it. That way you have no implicit buffer, and no need to create > kilobytes of invisible buffering in the system. > > Brad Beckmann wrote: > The RubyPort and the Ruby Tester predate MasterPorts and QueuedPorts. > Your suggestion goes far beyond this patch. Reimplementing RubyPort's m5 > ports to inherit from a different base port is a very signficant change with > many, many implications beyond the Ruby Tester. That is a pretty > unreasonable request. > > Please keep in mind that I was not the one who decided RubyPort to use > QueuedPorts. That decision was made back in 2012 with patches from your group > 8914:8c3bd7bea667, and 8922:17f037ad8918. > > Jason Power wrote: > Is the queued port really supposed to be a high-fidelity model of > anything in the "real" world? I was under the impression that it was a > courtesy wrapper around the port so that we don't have 10 different > implementations of the same general thing. > > IMO, if you are trying to model an on-chip network at high-fidelity you > need something more than what is provided by this model anyway. It seems to > me that in this case, the queued port is a gem5 interface decision, not a > hardware system design decision. Therefore, infinite buffering is not a major > problem. > > What if this was turned into a "warn" or a "warn_once"? It seems like a > good idea to let the user know that the queue is growing wildly as that might > be an indication of a deadlock. > > (Note: IIRC with our high-bandwidth GPU model we've run into this 100 > packet limit as well.) > > Andreas Hansson wrote: > I think: > > 1) The 100 packet sanity check should definitely not be changed as part > of this patch. If we do want to change it, it should be a patch on its own. > > 2) If there are somehow 10000 responses queued up, then something is > truly _fundamentally_ broken, and having a panic telling the user it is > broken is helpful imho. Do you not agree? > > The QueuedPort is an evolution of the SimpleTimingPort that was used > earlier, and I believe the RubyPort has always had this issue (it borrowed a > lot of not-so-great design decisions from the Bus for example). The > QueuedPort should really only be used as a "lazy" way of implementing > non-performance critical devices. I've been meaning to remove it from the > classic Caches as well, just not gotten around to it. > > In the case of the RubyPort, I do not see the big issue of moving away > from QueuedPorts. We have already done so in the Bridge, SimpleMemory etc. > The only implication is that we need to deal with the port flow control > (retry), and pass that downstream/upstream. Since the RubyPort really does > not correspond to any real physical object, with any buffers, I think the > case is clear: It should not do any buffering behind the user's back. Do you > not think this is reasonable? > > Brad Beckmann wrote: > Yes, if there is 10000 responses queued up, then there might be something > broken. However, 100 responses can be queued up rather easily when running > with the Ruby Tester or with a GPU. That is why this patch changes the > sanity check from 100 to 10000. > > RubyPort is used by a lot of users, with many different > scenarios/protocols. Moving it away from QueuedPorts will fundamentally > change its behavior. That should be done carefully and I don't want to delay > this bug fix with a project that large. > > Would you be happy if we introduced a patch that modifies the > QueuedSlavePort, RespPacketQueue, and PacketQueue constructors so that the > RubyPort can set the transmitList threshold to 10000 and the default can > still be set to 100? > > Andreas Hansson wrote: > Can we simply leave this change out of this patch? I think the right > solution here is to fix the dodgy assumptions on implicit buffering, and I'm > happy to have a go myself (given that I've done something dimilar to a range > of other modules). > > Brad Beckmann wrote: > The RubyTester and our soon-to-be-posed HSA-compatible GPU will not work > without this change. I believe many Ruby users, especially Ruby users > evaluating GPUs, have had this type of change in their personal patch queues > for a very long time. We need to get this problem out of our personal patch > queues and into the main line. > > I appreciate your offer to redesign the RubyPort yourself, but even if > that were to happen, I would prefer to get this change happen first. A > redesign of the RubyPort will require a lot of downstream work to verify > things still work as expected. > > Why are you so against changing this somewhat arbritrary value, or at > least allowing someone to change it? As Jason pointed out, the queue port is > an infinite buffer and doesn't represent a real hardware structure. Why so > much resistance? This small change really benefits Ruby users and if we make > it configurable, it will have no impact on Classic users. > > Andreas Hansson wrote: > I fundamentally object to the change since it effectively adds an > implicit infinite buffer (in quite some places), thus changing the very > behaviour of the system and potentially masking problems. > > The RubyPort should really just be a glue-logic adaptor. Agreed? In a > system architecture, there is no such thing as a RubyPort, and there should > not be kilobytes of "magic" buffering imposed on the user. If this buffer is > needed, then we should instantiate a buffer, cache, etc. I do not understand > the statement that this is needed to make things work. If you need an > infinite buffer to make things work, then the solution is broken. Am I > missing something? > > I would propose to simply remove the QueuedPort, make it inherit from the > Master/SlavePort directly, remove the implicit buffer altogether, and simply > relay the retry to/from Ruby. Surely Ruby understands flow control...does it > not?
Getting rid of implicit infinite buffering is a noble and appropriate long-term goal, I agree. However, in the near term, we already have implicit infinite buffering, and the only thing we're debating here is at what arbitrary point do we warn the user that the use of that infinite buffer is potentially getting abusive. To me it seems obvious that the fundamental near-term problem is that we have this totally arbitrary constant embedded in the code and no way to adjust it based on different circumstances in which different values would be appropriate. I think Brad's suggestion that we make this threshold configurable and then find some way to propagate the value downward seems like the right way to go for now; we can leave the threshold at 100 for CPU-based systems, and crank it up for GPU-based systems, with maybe a third even higher thershold specifically for when we run RubyTester. - Steve ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: http://reviews.gem5.org/r/2776/#review6152 ----------------------------------------------------------- On May 11, 2015, 3:28 p.m., Tony Gutierrez wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > http://reviews.gem5.org/r/2776/ > ----------------------------------------------------------- > > (Updated May 11, 2015, 3:28 p.m.) > > > Review request for Default. > > > Repository: gem5 > > > Description > ------- > > Changeset 10833:e624796bae17 > --------------------------- > ruby: cleaner ruby tester support > > This patch allows the ruby random tester to use ruby ports that may only > support instr or data requests. This patch is similar to a previous changeset > (8932:1b2c17565ac8) that was unfortunately broken by subsequent changesets. > This current patch implements the support in a more straight-forward way. > The patch also includes better DPRINTFs and generalizes the retry behavior > needed by the ruby tester so that other testers/cpu models can use it as well. > > > Diffs > ----- > > configs/example/ruby_random_test.py > fbdaa08aaa426b9f4660c366f934ccb670d954ec > configs/ruby/MESI_Three_Level.py fbdaa08aaa426b9f4660c366f934ccb670d954ec > configs/ruby/MESI_Two_Level.py fbdaa08aaa426b9f4660c366f934ccb670d954ec > configs/ruby/MI_example.py fbdaa08aaa426b9f4660c366f934ccb670d954ec > configs/ruby/MOESI_CMP_directory.py > fbdaa08aaa426b9f4660c366f934ccb670d954ec > configs/ruby/MOESI_CMP_token.py fbdaa08aaa426b9f4660c366f934ccb670d954ec > configs/ruby/MOESI_hammer.py fbdaa08aaa426b9f4660c366f934ccb670d954ec > src/cpu/testers/rubytest/Check.cc fbdaa08aaa426b9f4660c366f934ccb670d954ec > src/cpu/testers/rubytest/CheckTable.cc > fbdaa08aaa426b9f4660c366f934ccb670d954ec > src/cpu/testers/rubytest/RubyTester.hh > fbdaa08aaa426b9f4660c366f934ccb670d954ec > src/cpu/testers/rubytest/RubyTester.cc > fbdaa08aaa426b9f4660c366f934ccb670d954ec > src/cpu/testers/rubytest/RubyTester.py > fbdaa08aaa426b9f4660c366f934ccb670d954ec > src/mem/packet_queue.cc fbdaa08aaa426b9f4660c366f934ccb670d954ec > src/mem/ruby/system/RubyPort.hh fbdaa08aaa426b9f4660c366f934ccb670d954ec > src/mem/ruby/system/RubyPort.cc fbdaa08aaa426b9f4660c366f934ccb670d954ec > src/mem/ruby/system/Sequencer.py fbdaa08aaa426b9f4660c366f934ccb670d954ec > > Diff: http://reviews.gem5.org/r/2776/diff/ > > > Testing > ------- > > > Thanks, > > Tony Gutierrez > > _______________________________________________ gem5-dev mailing list [email protected] http://m5sim.org/mailman/listinfo/gem5-dev
