Hi guys,
  I just posted a draft of my DRAMCtrl flow-control patch so you can take a look here: http://reviews.gem5.org/r/3315/
NOTE: I have a separate patch that changes Ruby's QueuedMasterPort from directories to memory controllers into a MasterPort, and it places a MessageBuffer in front of the MasterPort, so that the user can make all buffering finite within a Ruby memory hierarchy. I still need to merge this patch with gem5 before I can share it. Let me know if you'd like to see the draft there also.

@Joe:

> I'd be curious to see a patch of what you're proposing, as I'm not sure I really follow what you're doing. The reason I ask is because I have been discussing an implementation with Brad and would like to see how similar it is to what you have. Namely, it's an idea similar to what is commonly used in hardware, where senders have tokens that correspond to slots in the receiver queue, so the reservation happens at startup. The only communication that goes from a receiving port back to a sender is token return. The port and queue would still be coupled, and the device which owns the Queued*Port would manage removal from the PacketQueue. In my experience, this is a very effective mechanism for flow control and addresses your point about transparency of the queue and its state. The tokens remove the need for unblock callbacks, but it's the responsibility of the receiver not to send when the queue is full or when it has a conflicting request. There's no implementation yet, but the simplicity and similarity to hardware techniques may prove useful. Anyway, could you post something so I can better understand what you've described?

My implementation effectively does what you're describing: the DRAMCtrl now has a finite number of buffers (i.e. tokens), and it allocates a buffer slot when a request is received (senders spend a token when the DRAMCtrl accepts a request).
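For readers unfamiliar with the scheme, the token (credit) flow control described above can be sketched as a standalone toy model. This is only an illustration under my own naming (Sender, Receiver, trySend, etc. are all made up), not gem5 code:

```cpp
#include <cassert>
#include <cstddef>
#include <queue>

// Toy model of token (credit) based flow control: the sender is granted
// one token per slot in the receiver's queue at startup, spends a token
// on every send, and gets the token back when the receiver dequeues.
class Receiver {
  public:
    explicit Receiver(std::size_t slots) : slots_(slots) {}

    std::size_t slots() const { return slots_; }

    // The receiver never refuses: the token protocol guarantees the
    // sender only transmits when a slot is free.
    void receive(int pkt) {
        assert(queue_.size() < slots_);
        queue_.push(pkt);
    }

    // Dequeue one packet; returning true models the token going back.
    bool dequeue() {
        if (queue_.empty()) return false;
        queue_.pop();
        return true;  // token returned to the sender
    }

  private:
    std::size_t slots_;
    std::queue<int> queue_;
};

class Sender {
  public:
    explicit Sender(Receiver &r) : recv_(r), tokens_(r.slots()) {}

    // It is the *sender's* responsibility not to send without a token.
    bool trySend(int pkt) {
        if (tokens_ == 0) return false;  // would overflow the receiver
        --tokens_;
        recv_.receive(pkt);
        return true;
    }

    void tokenReturned() { ++tokens_; }
    std::size_t tokens() const { return tokens_; }

  private:
    Receiver &recv_;
    std::size_t tokens_;
};
```

Note the key property Joe mentions: the only receiver-to-sender communication is the token return, so no unblock callback machinery is needed.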
The only real difference is that the DRAMCtrl now implements a SlavePort with flow control consistent with the rest of gem5, so if there are no buffer slots available, the request is nacked and a retry must be sent (i.e. a token is returned).

> Please don't get rid of the Queued*Ports, as I think there is a simple way
> to improve them to do efficient flow control.

Heh... not sure I have the time/motivation to remove the Queued*Ports myself. I've just been swapping out the Queued*Ports that break when trying to implement finite buffering in a Ruby memory hierarchy. I'll leave the Queued*Ports for later fixing or removal, as appropriate.

  Joel

________________________________________
> From: gem5-dev <[email protected]> on behalf of Joel Hestness <[email protected]>
> Sent: Friday, February 5, 2016 12:03 PM
> To: Andreas Hansson
> Cc: gem5 Developer List
> Subject: Re: [gem5-dev] Follow-up: Removing QueuedSlavePort from DRAMCtrl
>
> Hi guys,
>   Quick updates on this:
>
> 1) I have a finite response buffer implementation working. I removed the QueuedSlavePort and added a response queue with reservation (Andreas' underlying suggestion). I have a question about this solution: the QueuedSlavePort prioritized responses based on their scheduled response time. However, since writes have a shorter pipeline from request to response, this architecture prioritized write requests ahead of read requests received earlier, and it performs ~1-8% worse than a strict queue (what I've implemented at this point). I can make the response queue a priority queue if we want the same structure as previously, but I'm wondering if we might prefer to just keep the better-performing strict queue.
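The nack/retry handshake described at the top of this message (refuse the request when no buffer slot is free, send a retry once one opens up) can also be sketched as a toy model. The class and method names below are illustrative stand-ins, not the actual gem5 port interface:

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>

// Toy model of gem5-style nack/retry flow control: the slave returns
// false when its finite buffer is full, remembers that a master is
// waiting, and issues a retry callback once a slot frees up.
class SlaveModel {
  public:
    explicit SlaveModel(std::size_t slots) : slots_(slots) {}

    // Returns false (a nack) when no buffer slot is available; the
    // caller's retry callback is saved so it can be woken later.
    bool recvTimingReq(int pkt, std::function<void()> retryCb) {
        if (buffer_.size() >= slots_) {
            retryCb_ = retryCb;  // remember who to wake up
            return false;
        }
        buffer_.push(pkt);
        return true;
    }

    // Draining a slot "returns the token": if a master was nacked,
    // tell it to retry now that space exists.
    void drainOne() {
        if (buffer_.empty()) return;
        buffer_.pop();
        if (retryCb_) {
            auto cb = retryCb_;
            retryCb_ = nullptr;
            cb();  // the sendRetry equivalent
        }
    }

    std::size_t occupancy() const { return buffer_.size(); }

  private:
    std::size_t slots_;
    std::queue<int> buffer_;
    std::function<void()> retryCb_;
};
```

The equivalence to the token scheme is visible here: a nack plus a later retry carries the same information as withholding and then returning a token, just with the bookkeeping on the receiver's side instead of the sender's.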
> 2) To reflect on Andreas' specific suggestion of using unblock callbacks from the PacketQueue: Modifying the QueuedSlavePort with callbacks is ugly when trying to call the callback: the call needs to originate from PacketQueue::sendDeferredPacket(), but PacketQueue doesn't have a pointer to the owner component. The SlavePort has the pointer, so the PacketQueue would need to first call back to the port, which would then call the owner component's callback.
>
> The exercise of getting this to work has solidified my opinion that the Queued*Ports should probably be removed from the codebase: queues and ports are separate subcomponents of simulated components, and only the component knows how they should interact. Including a Queued*Port inside a component requires the component to manage the flow control into the Queued*Port just as it would need to manage a standard port anyway, and hiding the queue in the port obfuscates how it is managed.
>
> Thanks!
> Joel
>
>
> On Thu, Feb 4, 2016 at 10:06 AM, Joel Hestness <[email protected]> wrote:
>
> > Hi Andreas,
> >   Thanks for the input. I had tried adding front- and back-end queues within the DRAMCtrl, but it became very difficult to propagate the flow control back through the component due to the complicated implementation of timing across different accessAndRespond() calls. I had to put this solution on hold.
> >
> >   I think your proposed solution should simplify the flow control issue, and should have the derivative effect of making the Queued*Ports capable of flow control. I'm a little concerned that your solution would make the buffering very fluid, and I'm not sufficiently familiar with memory controller microarchitecture to know if that would be realistic. I wonder if you might have a way to do performance validation after I work through either of these implementations.
> >
> >   Thanks!
> >   Joel
> >
> >
> > On Wed, Feb 3, 2016 at 11:29 AM, Andreas Hansson <[email protected]> wrote:
> >
> >> Hi Joel,
> >>
> >> I would suggest to keep the queued ports, but add methods to reserve resources, query if it has free space, and a way to register callbacks so that the MemObject is made aware when packets are sent. That way we can use the queue in the cache, memory controller, etc., without having all the issues of the “naked” port interface, but still enforcing a bounded queue.
> >>
> >> When a packet arrives at the module we call reserve on the output port. Then when we actually add the packet we know that there is space. When request packets arrive we check if the queue is full, and if so we block any new requests. Then through the callback we can unblock the DRAM controller in this case.
> >>
> >> What do you think?
> >>
> >> Andreas
> >>
> >> From: Joel Hestness <[email protected]>
> >> Date: Tuesday, 2 February 2016 at 00:24
> >> To: Andreas Hansson <[email protected]>
> >> Cc: gem5 Developer List <[email protected]>
> >> Subject: Follow-up: Removing QueuedSlavePort from DRAMCtrl
> >>
> >> Hi Andreas,
> >>   I'd like to circle back on the thread about removing the QueuedSlavePort response queue from DRAMCtrl. I've been working to shift over to DRAMCtrl from the RubyMemoryController, but nearly all of my simulations now crash on the DRAMCtrl's response queue. Since I need the DRAMCtrl to work, I'll be looking into this now. However, based on my inspection of the code, it looks pretty non-trivial to remove the QueuedSlavePort, so I'm hoping you can at least help me work through the changes.
> >>
> >>   To reproduce the issue, I've put together a slim gem5 patch (attached) to use the memtest.py script to generate accesses.
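Andreas' reserve/query/callback suggestion quoted above could look roughly like the following toy model. All names here (BoundedRespQueue, reserve, sendHead, etc.) are hypothetical, not the real gem5 PacketQueue API:

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>

// Sketch of a bounded response queue with reservation: the owner
// reserves a slot when it accepts a request (so the eventual response
// is guaranteed to fit), can query for free space, and receives a
// callback whenever a packet actually leaves the queue.
class BoundedRespQueue {
  public:
    BoundedRespQueue(std::size_t cap, std::function<void()> onSent)
        : cap_(cap), onSent_(onSent) {}

    // Free space accounts for both queued packets and reservations.
    bool hasFreeSpace() const { return reserved_ + queue_.size() < cap_; }

    // Called when a request arrives; claims a slot for its response.
    // A false return means the owner should block new requests.
    bool reserve() {
        if (!hasFreeSpace()) return false;
        ++reserved_;
        return true;
    }

    // Called when the response is ready; its slot was already reserved.
    void push(int pkt) {
        assert(reserved_ > 0);
        --reserved_;
        queue_.push(pkt);
    }

    // Called when the head packet is sent; fires the owner's callback
    // so it can unblock (e.g. the DRAM controller accepting requests).
    void sendHead() {
        if (queue_.empty()) return;
        queue_.pop();
        if (onSent_) onSent_();
    }

  private:
    std::size_t cap_;
    std::size_t reserved_ = 0;
    std::queue<int> queue_;
    std::function<void()> onSent_;
};
```

The reserve-on-request step is what distinguishes this from a plain bounded queue: the bound is enforced at request acceptance time, so a response can never be produced without a place to put it.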
> >> Here's the command line I used:
> >>
> >>   % build/X86/gem5.opt --debug-flag=DRAM --outdir=$outdir configs/example/memtest.py -u 100
> >>
> >>   If you're still willing to take a stab at it, let me know if/how I can help. Otherwise, I'll start working on it. It seems the trickiest thing is going to be modeling the arbitrary frontendLatency and backendLatency while still counting all of the accesses that are in the controller when it needs to block back to the input queue. These latencies are currently assessed with scheduling in the port response queue. Any suggestions you could give would be appreciated.
> >>
> >>   Thanks!
> >>   Joel
> >>
> >>
> >> Below is our conversation from the email thread "[gem5-dev] Review Request 3116: ruby: RubyMemoryControl delete requests":
> >>
> >> On Wed, Sep 23, 2015 at 3:51 PM, Andreas Hansson <[email protected]> wrote:
> >>
> >>> Great. Thanks Joel.
> >>>
> >>> If anything pops up on our side I’ll let you know.
> >>>
> >>> Andreas
> >>>
> >>> From: Joel Hestness <[email protected]>
> >>> Date: Wednesday, 23 September 2015 20:29
> >>> To: Andreas Hansson <[email protected]>
> >>> Cc: gem5 Developer List <[email protected]>
> >>> Subject: Re: [gem5-dev] Review Request 3116: ruby: RubyMemoryControl delete requests
> >>>
> >>>> I don’t think there is any big difference in our expectations, quite the contrary :-). GPUs are very important to us (and so is throughput computing in general), and we run plenty of simulations with lots of memory-level parallelism from non-CPU components. Still, we haven’t run into the issue.
> >>>
> >>> Ok, cool. Thanks for the context.
> >>>
> >>>> If you have practical examples that run into problems let me know, and we’ll get it fixed.
> >>>
> >>> I'm having trouble assembling a practical example (with or without using gem5-gpu).
> >>> I'll keep you posted if I find something reasonable.
> >>>
> >>> Thanks!
> >>> Joel
> >>>
> >>>> From: Joel Hestness <[email protected]>
> >>>> Date: Tuesday, 22 September 2015 19:58
> >>>> To: Andreas Hansson <[email protected]>
> >>>> Cc: gem5 Developer List <[email protected]>
> >>>> Subject: Re: [gem5-dev] Review Request 3116: ruby: RubyMemoryControl delete requests
> >>>>
> >>>> Hi Andreas,
> >>>>
> >>>>> If it is a real problem affecting end users I am indeed volunteering to fix the DRAMCtrl use of QueuedSlavePort. In the classic memory system there are enough points of regulation (LSQs, MSHR limits, crossbar layers, etc.) that having a single memory channel with >100 queued-up responses waiting to be sent is extremely unlikely. Hence, until now the added complexity has not been needed. If there is regulation on the number of requests in Ruby, then I would argue that it is equally unlikely there…I could be wrong.
> >>>>
> >>>> Ok. I think a big part of the difference between our expectations is just the cores that we're modeling. AMD and gem5-gpu can model aggressive GPU cores with potential to expose, perhaps, 4-32x more memory-level parallel requests than a comparable number of multithreaded CPU cores. I feel that this difference warrants different handling of accesses in the memory controller.
> >>>>
> >>>> Joel
> >>>>
> >>>>> From: Joel Hestness <[email protected]>
> >>>>> Date: Tuesday, 22 September 2015 17:48
> >>>>> To: Andreas Hansson <[email protected]>
> >>>>> Cc: gem5 Developer List <[email protected]>
> >>>>> Subject: Re: [gem5-dev] Review Request 3116: ruby: RubyMemoryControl delete requests
> >>>>>
> >>>>> Hi Andreas,
> >>>>>
> >>>>> Thanks for the "ship it!"
> >>>>>
> >>>>>> Do we really need to remove the use of QueuedSlavePort in DRAMCtrl?
> >>>>>> It will make the controller more complex, and I don’t want to do it “just in case”.
> >>>>>
> >>>>> Sorry, I misread your email as offering to change the DRAMCtrl. I'm not sure who should make that change, but I think it should get done. The memory access response path starts at the DRAMCtrl and ends at the RubyPort. If we add flow control to the RubyPort, packets will probably back up more quickly on the response path back to where there are open buffers. I expect the DRAMCtrl QueuedPort problem becomes more prevalent as Ruby adds flow control, unless we add a limitation on outstanding requests to memory from directory controllers.
> >>>>>
> >>>>> How does the classic memory model deal with this?
> >>>>>
> >>>>> Joel
> >>>>>
> >>>>>> From: Joel Hestness <[email protected]>
> >>>>>> Date: Tuesday, 22 September 2015 17:30
> >>>>>> To: Andreas Hansson <[email protected]>
> >>>>>> Cc: gem5 Developer List <[email protected]>
> >>>>>> Subject: Re: [gem5-dev] Review Request 3116: ruby: RubyMemoryControl delete requests
> >>>>>>
> >>>>>> Hi guys,
> >>>>>>   Thanks for the discussion here. I had quickly tested other memory controllers, but hadn't connected the dots that this might be the same problem Brad/AMD are running into.
> >>>>>>
> >>>>>>   My preference would be that we remove the QueuedSlavePort from the DRAMCtrls. That would at least eliminate DRAMCtrls as a potential source of the QueuedSlavePort packet overflows, and would allow us to more closely focus on the RubyPort problem when we get to it.
> >>>>>>
> >>>>>>   Can we reach resolution on this patch, though? Are we okay with actually fixing the memory leak in mainline?
> >>>>>> Joel
> >>>>>>
> >>>>>> On Tue, Sep 22, 2015 at 11:19 AM, Andreas Hansson <[email protected]> wrote:
> >>>>>>
> >>>>>>> Hi Brad,
> >>>>>>>
> >>>>>>> We can remove the use of QueuedSlavePort in the memory controller and simply not accept requests if the response queue is full. Is this needed? If so we’ll make sure someone gets this in place. The only reason we haven’t done it is because it hasn’t been needed.
> >>>>>>>
> >>>>>>> The use of QueuedPorts in the Ruby adapters is a whole different story. I think most of these can be removed and actually use flow control. I’m happy to code it up, but there is such flux at the moment that I didn’t want to post yet another patch changing the Ruby port. I really do think we should avoid having implicit buffers for 1000’s of kilobytes to the largest extent possible. If we really need a constructor parameter to make it “infinite” for some quirky Ruby use-case, then let’s do that...
> >>>>>>>
> >>>>>>> Andreas
> >>>>>>>
> >>>>>>> On 22/09/2015 17:14, "gem5-dev on behalf of Beckmann, Brad" <[email protected] on behalf of [email protected]> wrote:
> >>>>>>>
> >>>>>>> >From AMD's perspective, we have deprecated our usage of RubyMemoryControl and we are using the new Memory Controllers with the port interface.
> >>>>>>> >
> >>>>>>> >That being said, I completely agree with Joel that the packet queue finite invisible buffer limit of 100 needs to go! As you know, we tried very hard several months ago to essentially make this an infinite buffer, but Andreas would not allow us to check it in. We are going to post that patch again in a few weeks when we post our GPU model.
> >>>>>>> >Our GPU model will not work unless we increase that limit.
> >>>>>>> >
> >>>>>>> >Andreas, you keep arguing that if you exceed that limit, something is fundamentally broken. Please keep in mind that there are many uses of gem5 beyond what you use it for. Also, this is a research simulator and we should not restrict ourselves to what we think is practical in real hardware. Finally, the fact that the finite limit is invisible to the producer is just bad software engineering.
> >>>>>>> >
> >>>>>>> >I beg you to please allow us to remove this finite invisible limit!
> >>>>>>> >
> >>>>>>> >Brad
> >>>>>>> >
> >>>>>>> >
> >>>>>>> >-----Original Message-----
> >>>>>>> >From: gem5-dev [mailto:[email protected]] On Behalf Of Andreas Hansson
> >>>>>>> >Sent: Tuesday, September 22, 2015 6:35 AM
> >>>>>>> >To: Andreas Hansson; Default; Joel Hestness
> >>>>>>> >Subject: Re: [gem5-dev] Review Request 3116: ruby: RubyMemoryControl delete requests
> >>>>>>> >
> >>>>>>> >> On Sept. 21, 2015, 8:42 a.m., Andreas Hansson wrote:
> >>>>>>> >> > Can we just prune the whole RubyMemoryControl rather? Has it not been deprecated long enough?
> >>>>>>> >>
> >>>>>>> >> Joel Hestness wrote:
> >>>>>>> >>     Unless I'm overlooking something, for Ruby users, I don't see other memory controllers that are guaranteed to work. Besides RubyMemoryControl, all others use a QueuedSlavePort for their input queues. Given that Ruby hasn't added complete flow control, PacketQueue size restrictions can be exceeded (triggering the panic). This occurs infrequently/irregularly with aggressive GPUs in gem5-gpu, and appears difficult to fix in a systematic way.
> >>>>>>> >>
> >>>>>>> >>     Regardless of the fact we've deprecated RubyMemoryControl, this is a necessary fix.
> >>>>>>> >
> >>>>>>> >No memory controller is using QueuedSlavePort for any _input_ queues. The DRAMCtrl class uses it for the response _output_ queue, that's all. If that is really an issue we can move away from it and enforce an upper bound on responses by not accepting new requests. That said, if we hit the limit I would argue something else is fundamentally broken in the system and should be addressed.
> >>>>>>> >
> >>>>>>> >In any case, the discussion whether to remove RubyMemoryControl or not should be completely decoupled.
> >>>>>>> >
> >>>>>>> >- Andreas
> >>
> >> --
> >> Joel Hestness
> >> PhD Candidate, Computer Architecture
> >> Dept. of Computer Science, University of Wisconsin - Madison
> >> http://pages.cs.wisc.edu/~hestness/

--
Joel Hestness
PhD Candidate, Computer Architecture
Dept. of Computer Science, University of Wisconsin - Madison
http://pages.cs.wisc.edu/~hestness/
_______________________________________________
gem5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/gem5-dev
