Hi guys,
  Quick updates on this:

1) I have a finite response buffer implementation working. I removed the QueuedSlavePort and added a response queue with reservation (Andreas' underlying suggestion). I have a question about this solution: the QueuedSlavePort prioritized responses based on their scheduled response time. However, since writes have a shorter pipeline from request to response, that architecture prioritized write requests ahead of read requests received earlier, and it performs ~1-8% worse than a strict FIFO queue (which is what I've implemented at this point). I can make the response queue a priority queue if we want the same structure as before, but I'm wondering if we might prefer to just keep the better-performing strict queue.
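For concreteness, here is a minimal sketch of the reservation scheme described above. This is illustrative only, not the actual gem5 API: the class and method names (`RespQueue`, `reserveSlot()`) are made up, and plain ints stand in for gem5's PacketPtr.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Sketch of a bounded response queue with reservation: a slot is
// reserved when a request is accepted, so the response is guaranteed
// space when it is ready. Responses drain in strict FIFO order (the
// better-performing variant discussed above).
class RespQueue {
  public:
    explicit RespQueue(std::size_t capacity) : capacity(capacity) {}

    // Called when a request is accepted; returns false (i.e. block the
    // requestor) if no response slot can be reserved.
    bool reserveSlot() {
        if (reserved + queue.size() >= capacity)
            return false;
        ++reserved;
        return true;
    }

    // Called when the response is ready; the slot was reserved when the
    // request was accepted, so this cannot overflow.
    void push(int pkt) {
        assert(reserved > 0);
        --reserved;
        queue.push_back(pkt);
    }

    bool empty() const { return queue.empty(); }

    // Strict FIFO drain: oldest response first, regardless of whether
    // it was a read or a write.
    int pop() {
        int pkt = queue.front();
        queue.pop_front();
        return pkt;
    }

  private:
    std::size_t capacity;
    std::size_t reserved = 0;   // slots promised but not yet filled
    std::deque<int> queue;      // responses ready to send
};
```

A priority-queue variant (matching the old QueuedSlavePort behavior) would order entries by scheduled response tick instead of arrival order, which is what lets faster write responses jump ahead of older reads.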
2) To reflect on Andreas' specific suggestion of using unblock callbacks from the PacketQueue: modifying the QueuedSlavePort with callbacks gets ugly at the point where the callback is invoked. The call needs to originate from PacketQueue::sendDeferredPacket(), but PacketQueue doesn't have a pointer to the owner component; the SlavePort has that pointer, so the PacketQueue would first need to call back to the port, which would then call the owner component's callback.

  The exercise of getting this to work has solidified my opinion that the Queued*Ports should probably be removed from the codebase: queues and ports are separate subcomponents of simulated components, and only the component knows how they should interact. Including a Queued*Port inside a component requires the component to manage the flow control into the Queued*Port just as it would for a standard port anyway, and hiding the queue in the port obfuscates how it is managed.

  Thanks!
  Joel

On Thu, Feb 4, 2016 at 10:06 AM, Joel Hestness <[email protected]> wrote:

> Hi Andreas,
>   Thanks for the input. I had tried adding front- and back-end queues within the DRAMCtrl, but it became very difficult to propagate the flow control back through the component due to the complicated implementation of timing across different accessAndRespond() calls. I had to put this solution on hold.
>
> I think your proposed solution should simplify the flow control issue, and should have the derivative effect of making the Queued*Ports capable of flow control. I'm a little concerned that your solution would make the buffering very fluid, and I'm not sufficiently familiar with memory controller microarchitecture to know if that would be realistic. I wonder if you might have a way to do performance validation after I work through either of these implementations.
>
> Thanks!
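The awkward two-hop callback chain described in point 2 could be sketched as follows. The class names mirror the gem5 roles but are simplified stand-ins, not the real classes or signatures:

```cpp
#include <cassert>
#include <functional>

// Sketch of the indirection: the queue cannot notify the owner
// directly, so it notifies the port, which forwards to the owner.
struct Owner {
    bool blocked = true;
    void unblock() { blocked = false; }  // e.g. resume accepting requests
};

struct Port {
    Owner *owner;                        // the port knows its owner...
    void packetSent() { owner->unblock(); }  // ...so it relays the callback
};

struct PacketQueue {
    // The queue has no owner pointer, only a hook back to the port.
    std::function<void()> sentCallback;

    void sendDeferredPacket() {
        // ...actually send the deferred packet, then notify via the port:
        if (sentCallback)
            sentCallback();
    }
};
```

The component ends up depending on a notification relayed through a subcomponent it does not directly control, which is part of why managing the queue explicitly in the owning component seems cleaner.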
> Joel
>
> On Wed, Feb 3, 2016 at 11:29 AM, Andreas Hansson <[email protected]> wrote:
>
>> Hi Joel,
>>
>> I would suggest to keep the queued ports, but add methods to reserve resources, query if it has free space, and a way to register callbacks so that the MemObject is made aware when packets are sent. That way we can use the queue in the cache, memory controller etc., without having all the issues of the “naked” port interface, but still enforcing a bounded queue.
>>
>> When a packet arrives at the module we call reserve on the output port. Then when we actually add the packet we know that there is space. When request packets arrive we check if the queue is full, and if so we block any new requests. Then through the callback we can unblock the DRAM controller in this case.
>>
>> What do you think?
>>
>> Andreas
>>
>> From: Joel Hestness <[email protected]>
>> Date: Tuesday, 2 February 2016 at 00:24
>> To: Andreas Hansson <[email protected]>
>> Cc: gem5 Developer List <[email protected]>
>> Subject: Follow-up: Removing QueuedSlavePort from DRAMCtrl
>>
>> Hi Andreas,
>>   I'd like to circle back on the thread about removing the QueuedSlavePort response queue from DRAMCtrl. I've been working to shift over to DRAMCtrl from the RubyMemoryController, but nearly all of my simulations now crash on the DRAMCtrl's response queue. Since I need the DRAMCtrl to work, I'll be looking into this now. However, based on my inspection of the code, it looks pretty non-trivial to remove the QueuedSlavePort, so I'm hoping you can at least help me work through the changes.
>>
>> To reproduce the issue, I've put together a slim gem5 patch (attached) to use the memtest.py script to generate accesses. Here's the command line I used:
>>
>>   % build/X86/gem5.opt --debug-flag=DRAM --outdir=$outdir configs/example/memtest.py -u 100
>>
>> If you're still willing to take a stab at it, let me know if/how I can help.
>> Otherwise, I'll start working on it. It seems the trickiest thing is going to be modeling the arbitrary frontendLatency and backendLatency while still counting all of the accesses that are in the controller when it needs to block back to the input queue. These latencies are currently assessed with scheduling in the port response queue. Any suggestions you could give would be appreciated.
>>
>> Thanks!
>> Joel
>>
>>
>> Below here is our conversation from the email thread "[gem5-dev] Review Request 3116: ruby: RubyMemoryControl delete requests"
>>
>> On Wed, Sep 23, 2015 at 3:51 PM, Andreas Hansson <[email protected]> wrote:
>>
>>> Great. Thanks Joel.
>>>
>>> If anything pops up on our side I’ll let you know.
>>>
>>> Andreas
>>>
>>> From: Joel Hestness <[email protected]>
>>> Date: Wednesday, 23 September 2015 20:29
>>> To: Andreas Hansson <[email protected]>
>>> Cc: gem5 Developer List <[email protected]>
>>> Subject: Re: [gem5-dev] Review Request 3116: ruby: RubyMemoryControl delete requests
>>>
>>>> I don’t think there is any big difference in our expectations, quite the contrary :-). GPUs are very important to us (and so is throughput computing in general), and we run plenty of simulations with lots of memory-level parallelism from non-CPU components. Still, we haven’t run into the issue.
>>>
>>> Ok, cool. Thanks for the context.
>>>
>>>> If you have practical examples that run into problems let me know, and we’ll get it fixed.
>>>
>>> I'm having trouble assembling a practical example (with or without using gem5-gpu). I'll keep you posted if I find something reasonable.
>>>
>>> Thanks!
>>> Joel
>>>
>>>> From: Joel Hestness <[email protected]>
>>>> Date: Tuesday, 22 September 2015 19:58
>>>> To: Andreas Hansson <[email protected]>
>>>> Cc: gem5 Developer List <[email protected]>
>>>> Subject: Re: [gem5-dev] Review Request 3116: ruby: RubyMemoryControl delete requests
>>>>
>>>> Hi Andreas,
>>>>
>>>>> If it is a real problem affecting end users I am indeed volunteering to fix the DRAMCtrl use of QueuedSlavePort. In the classic memory system there are enough points of regulation (LSQs, MSHR limits, crossbar layers etc.) that having a single memory channel with >100 queued up responses waiting to be sent is extremely unlikely. Hence, until now the added complexity has not been needed. If there is regulation on the number of requests in Ruby, then I would argue that it is equally unlikely there… I could be wrong.
>>>>
>>>> Ok. I think a big part of the difference between our expectations is just the cores that we're modeling. AMD and gem5-gpu can model aggressive GPU cores with the potential to expose, perhaps, 4-32x more memory-level parallel requests than a comparable number of multithreaded CPU cores. I feel that this difference warrants different handling of accesses in the memory controller.
>>>>
>>>> Joel
>>>>
>>>>> From: Joel Hestness <[email protected]>
>>>>> Date: Tuesday, 22 September 2015 17:48
>>>>> To: Andreas Hansson <[email protected]>
>>>>> Cc: gem5 Developer List <[email protected]>
>>>>> Subject: Re: [gem5-dev] Review Request 3116: ruby: RubyMemoryControl delete requests
>>>>>
>>>>> Hi Andreas,
>>>>>
>>>>> Thanks for the "ship it!"
>>>>>
>>>>>> Do we really need to remove the use of QueuedSlavePort in DRAMCtrl? It will make the controller more complex, and I don’t want to do it “just in case”.
>>>>>
>>>>> Sorry, I misread your email as offering to change the DRAMCtrl.
>>>>> I'm not sure who should make that change, but I think it should get done. The memory access response path starts at the DRAMCtrl and ends at the RubyPort. If we add flow control to the RubyPort, packets will probably back up more quickly on the response path back to where there are open buffers. I expect the DRAMCtrl QueuedPort problem becomes more prevalent as Ruby adds flow control, unless we add a limitation on outstanding requests to memory from directory controllers.
>>>>>
>>>>> How does the classic memory model deal with this?
>>>>>
>>>>> Joel
>>>>>
>>>>>> From: Joel Hestness <[email protected]>
>>>>>> Date: Tuesday, 22 September 2015 17:30
>>>>>> To: Andreas Hansson <[email protected]>
>>>>>> Cc: gem5 Developer List <[email protected]>
>>>>>> Subject: Re: [gem5-dev] Review Request 3116: ruby: RubyMemoryControl delete requests
>>>>>>
>>>>>> Hi guys,
>>>>>>   Thanks for the discussion here. I had quickly tested other memory controllers, but hadn't connected the dots that this might be the same problem Brad/AMD are running into.
>>>>>>
>>>>>> My preference would be that we remove the QueuedSlavePort from the DRAMCtrls. That would at least eliminate DRAMCtrls as a potential source of the QueuedSlavePort packet overflows, and would allow us to more closely focus on the RubyPort problem when we get to it.
>>>>>>
>>>>>> Can we reach resolution on this patch though? Are we okay with actually fixing the memory leak in mainline?
>>>>>>
>>>>>> Joel
>>>>>>
>>>>>> On Tue, Sep 22, 2015 at 11:19 AM, Andreas Hansson <[email protected]> wrote:
>>>>>>
>>>>>>> Hi Brad,
>>>>>>>
>>>>>>> We can remove the use of QueuedSlavePort in the memory controller and simply not accept requests if the response queue is full. Is this needed? If so we’ll make sure someone gets this in place.
>>>>>>> The only reason we haven’t done it is because it hasn’t been needed.
>>>>>>>
>>>>>>> The use of QueuedPorts in the Ruby adapters is a whole different story. I think most of these can be removed and actually use flow control. I’m happy to code it up, but there is such flux at the moment that I didn’t want to post yet another patch changing the Ruby port. I really do think we should avoid having implicit buffers for 1000’s of kilobytes to the largest extent possible. If we really need a constructor parameter to make it “infinite” for some quirky Ruby use-case, then let’s do that...
>>>>>>>
>>>>>>> Andreas
>>>>>>>
>>>>>>> On 22/09/2015 17:14, "gem5-dev on behalf of Beckmann, Brad" <[email protected] on behalf of [email protected]> wrote:
>>>>>>>
>>>>>>> >From AMD's perspective, we have deprecated our usage of RubyMemoryControl and we are using the new Memory Controllers with the port interface.
>>>>>>> >
>>>>>>> >That being said, I completely agree with Joel that the packet queue's finite invisible buffer limit of 100 needs to go! As you know, we tried very hard several months ago to essentially make this an infinite buffer, but Andreas would not allow us to check it in. We are going to post that patch again in a few weeks when we post our GPU model. Our GPU model will not work unless we increase that limit.
>>>>>>> >
>>>>>>> >Andreas, you keep arguing that if you exceed that limit, something is fundamentally broken. Please keep in mind that there are many uses of gem5 beyond what you use it for. Also, this is a research simulator and we should not restrict ourselves to what we think is practical in real hardware.
>>>>>>> >Finally, the fact that the finite limit is invisible to the producer is just bad software engineering.
>>>>>>> >
>>>>>>> >I beg you to please allow us to remove this finite invisible limit!
>>>>>>> >
>>>>>>> >Brad
>>>>>>> >
>>>>>>> >-----Original Message-----
>>>>>>> >From: gem5-dev [mailto:[email protected]] On Behalf Of Andreas Hansson
>>>>>>> >Sent: Tuesday, September 22, 2015 6:35 AM
>>>>>>> >To: Andreas Hansson; Default; Joel Hestness
>>>>>>> >Subject: Re: [gem5-dev] Review Request 3116: ruby: RubyMemoryControl delete requests
>>>>>>> >
>>>>>>> >> On Sept. 21, 2015, 8:42 a.m., Andreas Hansson wrote:
>>>>>>> >> > Can we just prune the whole RubyMemoryControl rather? Has it not been deprecated long enough?
>>>>>>> >>
>>>>>>> >> Joel Hestness wrote:
>>>>>>> >>     Unless I'm overlooking something, for Ruby users, I don't see other memory controllers that are guaranteed to work. Besides RubyMemoryControl, all others use a QueuedSlavePort for their input queues. Given that Ruby hasn't added complete flow control, PacketQueue size restrictions can be exceeded (triggering the panic). This occurs infrequently/irregularly with aggressive GPUs in gem5-gpu, and appears difficult to fix in a systematic way.
>>>>>>> >>
>>>>>>> >>     Regardless of the fact we've deprecated RubyMemoryControl, this is a necessary fix.
>>>>>>> >
>>>>>>> >No memory controller is using QueuedSlavePort for any _input_ queues. The DRAMCtrl class uses it for the response _output_ queue, that's all. If that is really an issue we can move away from it and enforce an upper bound on responses by not accepting new requests.
>>>>>>> >That said, if we hit the limit I would argue something else is fundamentally broken in the system and should be addressed.
>>>>>>> >
>>>>>>> >In any case, the discussion whether to remove RubyMemoryControl or not should be completely decoupled.
>>>>>>> >
>>>>>>> >- Andreas

--
Joel Hestness
PhD Candidate, Computer Architecture
Dept. of Computer Science, University of Wisconsin - Madison
http://pages.cs.wisc.edu/~hestness/

_______________________________________________
gem5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/gem5-dev
