> On May 12, 2015, 5:45 a.m., Andreas Hansson wrote:
> > src/mem/packet_queue.cc, line 117
> > <http://reviews.gem5.org/r/2776/diff/1/?file=45138#file45138line117>
> >
> > I strongly disagree with this change. This buffer should not exist, and
> > even 100 packets is a stretch. Any module hitting this needs to
> > reconsider how it deals with flow control.
>
> Brad Beckmann wrote:
>     I would agree with you that the buffer should not exist, but removing
>     the buffer goes well beyond this patch. The worst part about the
>     buffer is that its size is not visible to the sender. It is a really
>     bad design, but for now we need to get around the immediate problem.
>     Later we can discuss how and who will fix this right.
>
>     We've discussed the proposal to increase the size in the past, but
>     unfortunately we have not come to a resolution. We absolutely need to
>     resolve this now. We cannot use the ruby tester with our upcoming GPU
>     model without this change.
>
>     The important thing to keep in mind is that the RubyTester is designed
>     to stress the protocol logic, and it creates contention scenarios that
>     would not exist in a real system. The RubyTester finds bugs in a
>     matter of seconds that may not be encountered in hours of real
>     workload simulation. It does this by sending large bursts of racy
>     requests. Bottom line: the RubyTester needs this large value.
>
> Andreas Hansson wrote:
>     I would argue that if you need flow control, you should not use the
>     QueuedPorts, and rather use a "bare-metal" MasterPort and deal with
>     the flow control yourself. There is no point in adding flow control
>     (or buffer visibility) to the QueuedPort in my opinion.
>
>     I'd suggest switching to a MasterPort either as part of this patch, or
>     in a patch before it. That way you have no implicit buffer, and no
>     need to create kilobytes of invisible buffering in the system.
>
> Brad Beckmann wrote:
>     The RubyPort and the Ruby Tester predate MasterPorts and QueuedPorts.
>     Your suggestion goes far beyond this patch. Reimplementing RubyPort's
>     m5 ports to inherit from a different base port is a very significant
>     change with many, many implications beyond the Ruby Tester. That is a
>     pretty unreasonable request.
>
>     Please keep in mind that I was not the one who decided that RubyPort
>     should use QueuedPorts. That decision was made back in 2012 with
>     patches from your group: 8914:8c3bd7bea667 and 8922:17f037ad8918.
>
> Jason Power wrote:
>     Is the queued port really supposed to be a high-fidelity model of
>     anything in the "real" world? I was under the impression that it was a
>     courtesy wrapper around the port so that we don't have 10 different
>     implementations of the same general thing.
>
>     IMO, if you are trying to model an on-chip network at high fidelity,
>     you need something more than what is provided by this model anyway. It
>     seems to me that in this case, the queued port is a gem5 interface
>     decision, not a hardware system design decision. Therefore, infinite
>     buffering is not a major problem.
>
>     What if this was turned into a "warn" or a "warn_once"? It seems like
>     a good idea to let the user know that the queue is growing wildly, as
>     that might be an indication of a deadlock.
>
>     (Note: IIRC, with our high-bandwidth GPU model we've run into this
>     100-packet limit as well.)
>
> Andreas Hansson wrote:
>     I think:
>
>     1) The 100-packet sanity check should definitely not be changed as
>     part of this patch. If we do want to change it, it should be a patch
>     on its own.
>
>     2) If there are somehow 10000 responses queued up, then something is
>     truly _fundamentally_ broken, and having a panic telling the user it
>     is broken is helpful, IMHO. Do you not agree?
>
>     The QueuedPort is an evolution of the SimpleTimingPort that was used
>     earlier, and I believe the RubyPort has always had this issue (it
>     borrowed a lot of not-so-great design decisions from the Bus, for
>     example).
>     The QueuedPort should really only be used as a "lazy" way of
>     implementing non-performance-critical devices. I've been meaning to
>     remove it from the classic Caches as well, just not gotten around to
>     it.
>
>     In the case of the RubyPort, I do not see the big issue with moving
>     away from QueuedPorts. We have already done so in the Bridge,
>     SimpleMemory, etc. The only implication is that we need to deal with
>     the port flow control (retry), and pass that downstream/upstream.
>     Since the RubyPort really does not correspond to any real physical
>     object, with any buffers, I think the case is clear: it should not do
>     any buffering behind the user's back. Do you not think this is
>     reasonable?
>
> Brad Beckmann wrote:
>     Yes, if there are 10000 responses queued up, then there might be
>     something broken. However, 100 responses can be queued up rather
>     easily when running with the Ruby Tester or with a GPU. That is why
>     this patch changes the sanity check from 100 to 10000.
>
>     RubyPort is used by a lot of users, with many different
>     scenarios/protocols. Moving it away from QueuedPorts will
>     fundamentally change its behavior. That should be done carefully, and
>     I don't want to delay this bug fix with a project that large.
>
>     Would you be happy if we introduced a patch that modifies the
>     QueuedSlavePort, RespPacketQueue, and PacketQueue constructors so that
>     the RubyPort can set the transmitList threshold to 10000 while the
>     default remains 100?
>
> Andreas Hansson wrote:
>     Can we simply leave this change out of this patch? I think the right
>     solution here is to fix the dodgy assumptions about implicit
>     buffering, and I'm happy to have a go myself (given that I've done
>     something similar for a range of other modules).
>
> Brad Beckmann wrote:
>     The RubyTester and our soon-to-be-posted HSA-compatible GPU will not
>     work without this change.
>     I believe many Ruby users, especially Ruby users evaluating GPUs, have
>     had this type of change in their personal patch queues for a very long
>     time. We need to get this problem out of our personal patch queues and
>     into the mainline.
>
>     I appreciate your offer to redesign the RubyPort yourself, but even if
>     that were to happen, I would prefer to get this change in first. A
>     redesign of the RubyPort will require a lot of downstream work to
>     verify things still work as expected.
>
>     Why are you so against changing this somewhat arbitrary value, or at
>     least allowing someone to change it? As Jason pointed out, the queued
>     port is an infinite buffer and doesn't represent a real hardware
>     structure. Why so much resistance? This small change really benefits
>     Ruby users, and if we make it configurable, it will have no impact on
>     Classic users.
>
> Andreas Hansson wrote:
>     I fundamentally object to the change since it effectively adds an
>     implicit infinite buffer (in quite a few places), thus changing the
>     very behaviour of the system and potentially masking problems.
>
>     The RubyPort should really just be a glue-logic adaptor. Agreed? In a
>     system architecture, there is no such thing as a RubyPort, and there
>     should not be kilobytes of "magic" buffering imposed on the user. If
>     this buffer is needed, then we should instantiate a buffer, cache,
>     etc. I do not understand the statement that this is needed to make
>     things work. If you need an infinite buffer to make things work, then
>     the solution is broken. Am I missing something?
>
>     I would propose to simply remove the QueuedPort, make it inherit from
>     the Master/SlavePort directly, remove the implicit buffer altogether,
>     and simply relay the retry to/from Ruby. Surely Ruby understands flow
>     control... does it not?
>
> Steve Reinhardt wrote:
>     Getting rid of implicit infinite buffering is a noble and appropriate
>     long-term goal, I agree.
>     However, in the near term, we already have implicit infinite
>     buffering, and the only thing we're debating here is at what arbitrary
>     point we warn the user that the use of that infinite buffer is
>     potentially getting abusive. To me it seems obvious that the
>     fundamental near-term problem is that we have this totally arbitrary
>     constant embedded in the code and no way to adjust it based on the
>     different circumstances in which different values would be
>     appropriate. I think Brad's suggestion that we make this threshold
>     configurable and then find some way to propagate the value downward
>     seems like the right way to go for now; we can leave the threshold at
>     100 for CPU-based systems and crank it up for GPU-based systems, with
>     maybe a third, even higher threshold specifically for when we run
>     RubyTester.
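[The configurable-threshold proposal discussed above could be sketched roughly as below. This is a minimal, hypothetical illustration rather than gem5's actual interface: `SimplePacketQueue`, `schedSend`, and `maxQueuedPkts` are made-up names standing in for the real `PacketQueue`/`RespPacketQueue` classes and their `transmitList`. The sanity check becomes a constructor parameter defaulting to the historical 100, so a Ruby configuration could pass a larger value while Classic users see no change; following Jason's suggestion, crossing the threshold warns instead of panicking.]

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <iostream>

// Stand-in for gem5's Packet; only an id is needed for the sketch.
struct Packet { int id; };

// Hypothetical queue whose sanity-check threshold is configurable
// instead of being a hard-coded constant.
class SimplePacketQueue {
  public:
    // Default of 100 matches the historical hard-coded check; a
    // RubyPort-style wrapper could pass e.g. 10000 for tester runs.
    explicit SimplePacketQueue(std::size_t maxQueuedPkts = 100)
        : maxQueuedPkts_(maxQueuedPkts) {}

    // Refuses the packet (and warns) once the threshold is crossed,
    // leaving it to the caller to decide whether that is fatal.
    bool schedSend(const Packet &pkt)
    {
        if (transmitList_.size() >= maxQueuedPkts_) {
            std::cerr << "warn: packet queue exceeded "
                      << maxQueuedPkts_ << " packets\n";
            return false;
        }
        transmitList_.push_back(pkt);
        return true;
    }

    std::size_t size() const { return transmitList_.size(); }

  private:
    std::deque<Packet> transmitList_;  // the implicit buffer in question
    std::size_t maxQueuedPkts_;        // configurable sanity threshold
};
```

With this shape, the value could be plumbed from a Python config parameter down through the port constructors, so CPU systems keep 100 and GPU/RubyTester systems raise it.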
I'd rather have one arbitrary constant than three :-), and the lower the better.

Can we just proceed with this patch without this change? At least then I can perhaps get an appreciation for the issue (if you can point me to an example that fails). Right now I really do not understand how this could ever be a problem in any sensible setup.


- Andreas


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2776/#review6152
-----------------------------------------------------------


On May 11, 2015, 10:28 p.m., Tony Gutierrez wrote:

> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/2776/
> -----------------------------------------------------------
>
> (Updated May 11, 2015, 10:28 p.m.)
>
>
> Review request for Default.
>
>
> Repository: gem5
>
>
> Description
> -------
>
> Changeset 10833:e624796bae17
> ---------------------------
> ruby: cleaner ruby tester support
>
> This patch allows the ruby random tester to use ruby ports that may only
> support instr or data requests. This patch is similar to a previous
> changeset (8932:1b2c17565ac8) that was unfortunately broken by subsequent
> changesets. This current patch implements the support in a more
> straightforward way. The patch also includes better DPRINTFs and
> generalizes the retry behavior needed by the ruby tester so that other
> testers/CPU models can use it as well.
>
>
> Diffs
> -----
>
>   configs/example/ruby_random_test.py fbdaa08aaa426b9f4660c366f934ccb670d954ec
>   configs/ruby/MESI_Three_Level.py fbdaa08aaa426b9f4660c366f934ccb670d954ec
>   configs/ruby/MESI_Two_Level.py fbdaa08aaa426b9f4660c366f934ccb670d954ec
>   configs/ruby/MI_example.py fbdaa08aaa426b9f4660c366f934ccb670d954ec
>   configs/ruby/MOESI_CMP_directory.py fbdaa08aaa426b9f4660c366f934ccb670d954ec
>   configs/ruby/MOESI_CMP_token.py fbdaa08aaa426b9f4660c366f934ccb670d954ec
>   configs/ruby/MOESI_hammer.py fbdaa08aaa426b9f4660c366f934ccb670d954ec
>   src/cpu/testers/rubytest/Check.cc fbdaa08aaa426b9f4660c366f934ccb670d954ec
>   src/cpu/testers/rubytest/CheckTable.cc fbdaa08aaa426b9f4660c366f934ccb670d954ec
>   src/cpu/testers/rubytest/RubyTester.hh fbdaa08aaa426b9f4660c366f934ccb670d954ec
>   src/cpu/testers/rubytest/RubyTester.cc fbdaa08aaa426b9f4660c366f934ccb670d954ec
>   src/cpu/testers/rubytest/RubyTester.py fbdaa08aaa426b9f4660c366f934ccb670d954ec
>   src/mem/packet_queue.cc fbdaa08aaa426b9f4660c366f934ccb670d954ec
>   src/mem/ruby/system/RubyPort.hh fbdaa08aaa426b9f4660c366f934ccb670d954ec
>   src/mem/ruby/system/RubyPort.cc fbdaa08aaa426b9f4660c366f934ccb670d954ec
>   src/mem/ruby/system/Sequencer.py fbdaa08aaa426b9f4660c366f934ccb670d954ec
>
> Diff: http://reviews.gem5.org/r/2776/diff/
>
>
> Testing
> -------
>
>
> Thanks,
>
> Tony Gutierrez
>

_______________________________________________
gem5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/gem5-dev
