> On May 11, 2015, 10:45 p.m., Andreas Hansson wrote:
> > src/mem/packet_queue.cc, line 117
> > <http://reviews.gem5.org/r/2776/diff/1/?file=45138#file45138line117>
> >
> >     I strongly disagree with this change. This buffer should not exist, and 
> > even 100 packets is a stretch. Any module hitting this needs to recondiser 
> > how it deals with flow control
> 
> Brad Beckmann wrote:
>     I would agree with you that the buffer should not exist, but removing the 
> buffer goes well beyond this patch.  The worst part about the buffer is the 
> size is not visible to the sender.  It is a really bad design, but for now we 
> need to get around the immediate problem.  Later we can discuss how and who 
> will fix this right.
>     
>     We've discussed the proposal in the past to increase the size, but 
> unfortunately we have not came to a resolution.  We absolutely need to 
> resolve this now.  We cannot use the ruby tester with our upcoming GPU model 
> without this change.
>     
>     The important thing to keep in mind is that the RubyTester is designed to 
> stress the protocol logic and it creates contention scenarios that would not 
> exist in a real system.  The RubyTester finds bugs in the matter of seconds 
> that may not be encountered in hours of real workload simulation.  It does 
> this by sending large bursts of racey requests.  Bottomline: the RubyTester 
> needs this large value.
> 
> Andreas Hansson wrote:
>     I would argue that if you need flow control, you should not use the 
> QueuedPorts, and rather use a "bare-metal" MasterPort, and deal with the flow 
> control. There is no point in adding flow control (or buffer visibility) to 
> the QueuePort in my opinion.
>     
>     I'd suggest to switch to a MasterPort either as part of this patch, or a 
> patch before it. That way you have no implicit buffer, and no need to create 
> kilobytes of invisible buffering in the system.
> 
> Brad Beckmann wrote:
>     The RubyPort and the Ruby Tester predate MasterPorts and QueuedPorts.  
> Your suggestion goes far beyond this patch.  Reimplementing RubyPort's m5 
> ports to inherit from a different base port is a very signficant change with 
> many, many implications beyond the Ruby Tester.  That is a pretty 
> unreasonable request.
>     
>     Please keep in mind that I was not the one who decided RubyPort to use 
> QueuedPorts. That decision was made back in 2012 with patches from your group 
> 8914:8c3bd7bea667, and 8922:17f037ad8918.
> 
> Jason Power wrote:
>     Is the queued port really supposed to be a high-fidelity model of 
> anything in the "real" world? I was under the impression that it was a 
> courtesy wrapper around the port so that we don't have 10 different 
> implementations of the same general thing.
>     
>     IMO, if you are trying to model an on-chip network at high-fidelity you 
> need something more than what is provided by this model anyway. It seems to 
> me that in this case, the queued port is a gem5 interface decision, not a 
> hardware system design decision. Therefore, infinite buffering is not a major 
> problem.
>     
>     What if this was turned into a "warn" or a "warn_once"? It seems like a 
> good idea to let the user know that the queue is growing wildly as that might 
> be an indication of a deadlock.
>     
>     (Note: IIRC with our high-bandwidth GPU model we've run into this 100 
> packet limit as well.)
> 
> Andreas Hansson wrote:
>     I think:
>     
>     1) The 100 packet sanity check should definitely not be changed as part 
> of this patch. If we do want to change it, it should be a patch on its own.
>     
>     2) If there are somehow 10000 responses queued up, then something is 
> truly _fundamentally_ broken, and having a panic telling the user it is 
> broken is helpful imho. Do you not agree?
>     
>     The QueuedPort is an evolution of the SimpleTimingPort that was used 
> earlier, and I believe the RubyPort has always had this issue (it borrowed a 
> lot of not-so-great design decisions from the Bus for example). The 
> QueuedPort should really only be used as a "lazy" way of implementing 
> non-performance critical devices. I've been meaning to remove it from the 
> classic Caches as well, just not gotten around to it.
>     
>     In the case of the RubyPort, I do not see the big issue of moving away 
> from QueuedPorts. We have already done so in the Bridge, SimpleMemory etc. 
> The only implication is that we need to deal with the port flow control 
> (retry), and pass that downstream/upstream. Since the RubyPort really does 
> not correspond to any real physical object, with any buffers, I think the 
> case is clear: It should not do any buffering behind the user's back. Do you 
> not think this is reasonable?
> 
> Brad Beckmann wrote:
>     Yes, if there is 10000 responses queued up, then there might be something 
> broken.  However, 100 responses can be queued up rather easily when running 
> with the Ruby Tester or with a GPU.  That is why this patch changes the 
> sanity check from 100 to 10000.
>     
>     RubyPort is used by a lot of users, with many different 
> scenarios/protocols.  Moving it away from QueuedPorts will fundamentally 
> change its behavior.  That should be done carefully and I don't want to delay 
> this bug fix with a project that large.
>     
>     Would you be happy if we introduced a patch that modifies the 
> QueuedSlavePort, RespPacketQueue, and PacketQueue constructors so that the 
> RubyPort can set the transmitList threshold to 10000 and the default can 
> still be set to 100?
> 
> Andreas Hansson wrote:
>     Can we simply leave this change out of this patch? I think the right 
> solution here is to fix the dodgy assumptions on implicit buffering, and I'm 
> happy to have a go myself (given that I've done something dimilar to a range 
> of other modules).
> 
> Brad Beckmann wrote:
>     The RubyTester and our soon-to-be-posed HSA-compatible GPU will not work 
> without this change.  I believe many Ruby users, especially Ruby users 
> evaluating GPUs, have had this type of change in their personal patch queues 
> for a very long time.  We need to get this problem out of our personal patch 
> queues and into the main line.
>     
>     I appreciate your offer to redesign the RubyPort yourself, but even if 
> that were to happen, I would prefer to get this change happen first.  A 
> redesign of the RubyPort will require a lot of downstream work to verify 
> things still work as expected.
>     
>     Why are you so against changing this somewhat arbritrary value, or at 
> least allowing someone to change it?  As Jason pointed out, the queue port is 
> an infinite buffer and doesn't represent a real hardware structure.  Why so 
> much resistance?  This small change really benefits Ruby users and if we make 
> it configurable, it will have no impact on Classic users.
> 
> Andreas Hansson wrote:
>     I fundamentally object to the change since it effectively adds an 
> implicit infinite buffer (in quite some places), thus changing the very 
> behaviour of the system and potentially masking problems.
>     
>     The RubyPort should really just be a glue-logic adaptor. Agreed? In a 
> system architecture, there is no such thing as a RubyPort, and there should 
> not be kilobytes of "magic" buffering imposed on the user. If this buffer is 
> needed, then we should instantiate a buffer, cache, etc. I do not understand 
> the statement that this is needed to make things work. If you need an 
> infinite buffer to make things work, then the solution is broken. Am I 
> missing something?
>     
>     I would propose to simply remove the QueuedPort, make it inherit from the 
> Master/SlavePort directly, remove the implicit buffer altogether, and simply 
> relay the retry to/from Ruby. Surely Ruby understands flow control...does it 
> not?
> 
> Steve Reinhardt wrote:
>     Getting rid of implicit infinite buffering is a noble and appropriate 
> long-term goal, I agree.  However, in the near term, we already have implicit 
> infinite buffering, and the only thing we're debating here is at what 
> arbitrary point do we warn the user that the use of that infinite buffer is 
> potentially getting abusive.  To me it seems obvious that the fundamental 
> near-term problem is that we have this totally arbitrary constant embedded in 
> the code and no way to adjust it based on different circumstances in which 
> different values would be appropriate.  I think Brad's suggestion that we 
> make this threshold configurable and then find some way to propagate the 
> value downward seems like the right way to go for now; we can leave the 
> threshold at 100 for CPU-based systems, and crank it up for GPU-based 
> systems, with maybe a third even higher thershold specifically for when we 
> run RubyTester.
> 
> Andreas Hansson wrote:
>     I'd rather have one arbitrary constant than three :-), and the lower the 
> better.
>     
>     Can we just proceed with this patch without this change? At least then I 
> can perhaps get an appreciation for the issue (if you can point me to an 
> example that fails). Right now I really do not understand how this could ever 
> be a problem in any sensible setup.
> 
> Joel Hestness wrote:
>     I agree with Andreas on this one. I don't feel that an "arbitrary" 
> constant should be changed to another "arbitrary" constant in a patch that is 
> trying to fix up the RubyTester. There must be configurations of the 
> RubyTester that will work without increasing the threshold to 10000, so the 
> RubyTester changes will still work for some configurations. These are largely 
> orthogonal issues.
>     
>     Split the threshold change into a separate patch where we can focus on 
> debating the merits of changes between so-called "arbitrary" buffering limits 
> (e.g. gem5-gpu doesn't require any changes to the RubyPort or QueuedPort 
> buffering thresholds).

We can certainly split this change into a separate patch.  It's true that 
10,000 is just another arbitrary value; it may make more sense to just add a 
parameter that lets us turn off this threshold check altogether rather than 
create another arbitrary constant.  Note that the RubyTester is not trying to 
be a "sensible setup", it's a stress tester that is stressing the protocol 
implementation beyond the limits of what a "sensible" configuration might be 
able to do.


- Steve


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2776/#review6152
-----------------------------------------------------------


On May 11, 2015, 3:28 p.m., Tony Gutierrez wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/2776/
> -----------------------------------------------------------
> 
> (Updated May 11, 2015, 3:28 p.m.)
> 
> 
> Review request for Default.
> 
> 
> Repository: gem5
> 
> 
> Description
> -------
> 
> Changeset 10833:e624796bae17
> ---------------------------
> ruby: cleaner ruby tester support
> 
> This patch allows the ruby random tester to use ruby ports that may only
> support instr or data requests.  This patch is similar to a previous changeset
> (8932:1b2c17565ac8) that was unfortunately broken by subsequent changesets.
> This current patch implements the support in a more straight-forward way.
> The patch also includes better DPRINTFs and generalizes the retry behavior
> needed by the ruby tester so that other testers/cpu models can use it as well.
> 
> 
> Diffs
> -----
> 
>   configs/example/ruby_random_test.py 
> fbdaa08aaa426b9f4660c366f934ccb670d954ec 
>   configs/ruby/MESI_Three_Level.py fbdaa08aaa426b9f4660c366f934ccb670d954ec 
>   configs/ruby/MESI_Two_Level.py fbdaa08aaa426b9f4660c366f934ccb670d954ec 
>   configs/ruby/MI_example.py fbdaa08aaa426b9f4660c366f934ccb670d954ec 
>   configs/ruby/MOESI_CMP_directory.py 
> fbdaa08aaa426b9f4660c366f934ccb670d954ec 
>   configs/ruby/MOESI_CMP_token.py fbdaa08aaa426b9f4660c366f934ccb670d954ec 
>   configs/ruby/MOESI_hammer.py fbdaa08aaa426b9f4660c366f934ccb670d954ec 
>   src/cpu/testers/rubytest/Check.cc fbdaa08aaa426b9f4660c366f934ccb670d954ec 
>   src/cpu/testers/rubytest/CheckTable.cc 
> fbdaa08aaa426b9f4660c366f934ccb670d954ec 
>   src/cpu/testers/rubytest/RubyTester.hh 
> fbdaa08aaa426b9f4660c366f934ccb670d954ec 
>   src/cpu/testers/rubytest/RubyTester.cc 
> fbdaa08aaa426b9f4660c366f934ccb670d954ec 
>   src/cpu/testers/rubytest/RubyTester.py 
> fbdaa08aaa426b9f4660c366f934ccb670d954ec 
>   src/mem/packet_queue.cc fbdaa08aaa426b9f4660c366f934ccb670d954ec 
>   src/mem/ruby/system/RubyPort.hh fbdaa08aaa426b9f4660c366f934ccb670d954ec 
>   src/mem/ruby/system/RubyPort.cc fbdaa08aaa426b9f4660c366f934ccb670d954ec 
>   src/mem/ruby/system/Sequencer.py fbdaa08aaa426b9f4660c366f934ccb670d954ec 
> 
> Diff: http://reviews.gem5.org/r/2776/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Tony Gutierrez
> 
>

_______________________________________________
gem5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/gem5-dev

Reply via email to