Hi guys,
  I just posted a draft of my DRAMCtrl flow-control patch so you can take a
look here: http://reviews.gem5.org/r/3315/

  NOTE: I have a separate patch that changes Ruby's QueuedMasterPort from
directories to memory controllers into a plain MasterPort, with a
MessageBuffer placed in front of the MasterPort, so that the user can make
all buffering finite within a Ruby memory hierarchy. I still need to bring
that patch up to date with gem5 before I can share it. Let me know if you'd
like to see a draft of it as well.

@Joe:


> I'd be curious to see a patch of what you're proposing as I'm not sure I
> really follow what you're doing. The reason I ask is because I have been
> discussing an implementation with Brad and would like to see how
> similar it is to what you have. Namely it's an idea similar to what is
> commonly used in hardware, where senders have tokens that correspond to
> slots in the receiver queue so the reservation happens at startup. The only
> communication that goes from a receiving port back to a sender is token
> return. The port and queue would still be coupled and the device which owns
> the Queued*Port would manage removal from the PacketQueue. In my
> experience, this is a very effective mechanism for flow control and
> addresses your point about transparency of the queue and its state.
> The tokens remove the need for unblock callbacks, but it's the
> responsibility of the sender not to send when the queue is full or when
> it has a conflicting request. There's no implementation yet, but the
> simplicity and similarity to hardware techniques may prove useful. Anyway,
> could you post something so I can better understand what you've described?


My implementation effectively does what you're describing: The DRAMCtrl now
has a finite number of buffers (i.e. tokens), and it allocates a buffer
slot when a request is received (senders spend a token when the DRAMCtrl
accepts a request). The only real difference is that the DRAMCtrl now
implements a SlavePort with flow control consistent with the rest of gem5:
if no buffer slot is available, the request is nacked, and a retry is sent
once a slot frees up (i.e. a token is returned).


> Please don't get rid of the Queued*Ports, as I think there is a simple way
> to improve them to do efficient flow control.
>

Heh... not sure I have the time/motivation to remove the Queued*Ports
myself. I've just been swapping out the Queued*Ports that break when trying
to implement finite buffering in a Ruby memory hierarchy. I'll leave
Queued*Ports for later fixing or removal, as appropriate.


  Joel


________________________________________
> From: gem5-dev <[email protected]> on behalf of Joel Hestness <
> [email protected]>
> Sent: Friday, February 5, 2016 12:03 PM
> To: Andreas Hansson
> Cc: gem5 Developer List
> Subject: Re: [gem5-dev] Follow-up: Removing QueuedSlavePort from DRAMCtrl
>
> Hi guys,
>   Quick updates on this:
>    1) I have a finite response buffer implementation working. I removed the
> QueuedSlavePort and added a response queue with reservation (Andreas'
> underlying suggestion). I have a question with this solution: The
> QueuedSlavePort prioritized responses based on their scheduled response
> time.
> However, since writes have a shorter pipeline from request to response,
> this architecture prioritized write requests ahead of read requests
> received earlier, and it performs ~1-8% worse than a strict queue (what
> I've implemented at this point). I can make the response queue a priority
> queue if we want the same structure as previously, but I'm wondering if we
> might prefer to just have the better-performing strict queue.
>
>    2) To reflect on Andreas' specific suggestion of using unblock callbacks
> from the PacketQueue: Modifying the QueuedSlavePort with callbacks is ugly
> when trying to call the callback: The call needs to originate from
> PacketQueue::sendDeferredPacket(), but PacketQueue doesn't have a pointer
> to the owner component; The SlavePort has the pointer, so the PacketQueue
> would need to first callback to the port, which would call the owner
> component callback.
>   The exercise of getting this to work has solidified my opinion that the
> Queued*Ports should probably be removed from the codebase: Queues and ports
> are separate subcomponents of simulated components, and only the component
> knows how they should interact. Including a Queued*Port inside a component
> requires the component to manage the flow-control into the Queued*Port just
> as it would need to manage a standard port anyway, and hiding the queue in
> the port obfuscates how it is managed.
>
>
>   Thanks!
>   Joel
>
>
> On Thu, Feb 4, 2016 at 10:06 AM, Joel Hestness <[email protected]>
> wrote:
>
> > Hi Andreas,
> >   Thanks for the input. I had tried adding front- and back-end queues
> > within the DRAMCtrl, but it became very difficult to propagate the flow
> > control back through the component due to the complicated implementation
> > of timing across different accessAndRespond() calls. I had to put this
> > solution on hold.
> >
> >   I think your proposed solution should simplify the flow control issue,
> > and should have the derivative effect of making the Queued*Ports capable
> > of flow control. I'm a little concerned that your solution would make the
> > buffering very fluid, and I'm not sufficiently familiar with memory
> > controller microarchitecture to know if that would be realistic. I wonder
> > if you might have a way to do performance validation after I work through
> > either of these implementations.
> >
> >   Thanks!
> >   Joel
> >
> >
> >
> > On Wed, Feb 3, 2016 at 11:29 AM, Andreas Hansson <
> [email protected]>
> > wrote:
> >
> >> Hi Joel,
> >>
> >> I would suggest to keep the queued ports, but add methods to reserve
> >> resources, query whether there is free space, and a way to register
> >> callbacks so that the MemObject is made aware when packets are sent.
> >> That way we can use the queue in the cache, memory controller etc,
> >> without having all the issues of the “naked” port interface, but still
> >> enforcing a bounded queue.
> >>
> >> When a packet arrives to the module we call reserve on the output port.
> >> Then when we actually add the packet we know that there is space. When
> >> request packets arrive we check if the queue is full, and if so we block
> >> any new requests. Then through the callback we can unblock the DRAM
> >> controller in this case.
> >>
> >> What do you think?
> >>
> >> Andreas
> >>
> >> From: Joel Hestness <[email protected]>
> >> Date: Tuesday, 2 February 2016 at 00:24
> >> To: Andreas Hansson <[email protected]>
> >> Cc: gem5 Developer List <[email protected]>
> >> Subject: Follow-up: Removing QueuedSlavePort from DRAMCtrl
> >>
> >> Hi Andreas,
> >>   I'd like to circle back on the thread about removing the
> >> QueuedSlavePort response queue from DRAMCtrl. I've been working to shift
> >> over to DRAMCtrl from the RubyMemoryController, but nearly all of my
> >> simulations now crash on the DRAMCtrl's response queue. Since I need the
> >> DRAMCtrl to work, I'll be looking into this now. However, based on my
> >> inspection of the code, it looks pretty non-trivial to remove the
> >> QueueSlavePort, so I'm hoping you can at least help me work through the
> >> changes.
> >>
> >>   To reproduce the issue, I've put together a slim gem5 patch (attached)
> >> to use the memtest.py script to generate accesses. Here's the command
> >> line I used:
> >>
> >> % build/X86/gem5.opt --debug-flag=DRAM --outdir=$outdir
> >> configs/example/memtest.py -u 100
> >>
> >>   If you're still willing to take a stab at it, let me know if/how I can
> >> help. Otherwise, I'll start working on it. It seems the trickiest thing
> >> is going to be modeling the arbitrary frontendLatency and backendLatency
> >> while still counting all of the accesses that are in the controller when
> >> it needs to block back to the input queue. These latencies are currently
> >> assessed with scheduling in the port response queue. Any suggestions you
> >> could give would be appreciated.
> >>
> >>   Thanks!
> >>   Joel
> >>
> >>
> >> Below here is our conversation from the email thread "[gem5-dev] Review
> >> Request 3116: ruby: RubyMemoryControl delete requests"
> >>
> >> On Wed, Sep 23, 2015 at 3:51 PM, Andreas Hansson <
> [email protected]
> >> > wrote:
> >>
> >>> Great. Thanks Joel.
> >>>
> >>> If anything pops up on our side I’ll let you know.
> >>>
> >>> Andreas
> >>>
> >>> From: Joel Hestness <[email protected]>
> >>> Date: Wednesday, 23 September 2015 20:29
> >>>
> >>> To: Andreas Hansson <[email protected]>
> >>> Cc: gem5 Developer List <[email protected]>
> >>> Subject: Re: [gem5-dev] Review Request 3116: ruby: RubyMemoryControl
> >>> delete requests
> >>>
> >>>
> >>>
> >>>> I don’t think there is any big difference in our expectations, quite
> >>>> the contrary :-). GPUs are very important to us (and so is throughput
> >>>> computing in general), and we run plenty of simulations with lots of
> >>>> memory-level parallelism from non-CPU components. Still, we haven’t
> >>>> run into the issue.
> >>>>
> >>>
> >>> Ok, cool. Thanks for the context.
> >>>
> >>>
> >>> If you have practical examples that run into problems let me know, and
> >>>> we’ll get it fixed.
> >>>>
> >>>
> >>> I'm having trouble assembling a practical example (with or without
> >>> using gem5-gpu). I'll keep you posted if I find something reasonable.
> >>>
> >>>   Thanks!
> >>>   Joel
> >>>
> >>>
> >>>
> >>>> From: Joel Hestness <[email protected]>
> >>>> Date: Tuesday, 22 September 2015 19:58
> >>>>
> >>>> To: Andreas Hansson <[email protected]>
> >>>> Cc: gem5 Developer List <[email protected]>
> >>>> Subject: Re: [gem5-dev] Review Request 3116: ruby: RubyMemoryControl
> >>>> delete requests
> >>>>
> >>>> Hi Andreas,
> >>>>
> >>>>
> >>>>> If it is a real problem affecting end users I am indeed volunteering
> >>>>> to fix the DRAMCtrl use of QueuedSlavePort. In the classic memory
> >>>>> system there are enough points of regulation (LSQs, MSHR limits,
> >>>>> crossbar layers etc) that having a single memory channel with >100
> >>>>> queued up responses waiting to be sent is extremely unlikely. Hence,
> >>>>> until now the added complexity has not been needed. If there is
> >>>>> regulation on the number of requests in Ruby, then I would argue that
> >>>>> it is equally unlikely there… I could be wrong.
> >>>>>
> >>>>
> >>>> Ok. I think a big part of the difference between our expectations is
> >>>> just the cores that we're modeling. AMD and gem5-gpu can model
> >>>> aggressive GPU cores with potential to expose, perhaps, 4-32x more
> >>>> memory-level parallel requests than a comparable number of
> >>>> multithreaded CPU cores. I feel that this difference warrants
> >>>> different handling of accesses in the memory controller.
> >>>>
> >>>>   Joel
> >>>>
> >>>>
> >>>>
> >>>> From: Joel Hestness <[email protected]>
> >>>>> Date: Tuesday, 22 September 2015 17:48
> >>>>>
> >>>>> To: Andreas Hansson <[email protected]>
> >>>>> Cc: gem5 Developer List <[email protected]>
> >>>>> Subject: Re: [gem5-dev] Review Request 3116: ruby: RubyMemoryControl
> >>>>> delete requests
> >>>>>
> >>>>> Hi Andreas,
> >>>>>
> >>>>> Thanks for the "ship it!"
> >>>>>
> >>>>>
> >>>>>> Do we really need to remove the use of QueuedSlavePort in DRAMCtrl?
> >>>>>> It will make the controller more complex, and I don’t want to do it
> “just
> >>>>>> in case”.
> >>>>>>
> >>>>>
> >>>>> Sorry, I misread your email as offering to change the DRAMCtrl. I'm
> >>>>> not sure who should make that change, but I think it should get
> >>>>> done. The memory access response path starts at the DRAMCtrl and
> >>>>> ends at the RubyPort. If we add flow control to the RubyPort,
> >>>>> packets will probably back up more quickly on the response path back
> >>>>> to where there are open buffers. I expect the DRAMCtrl QueuedPort
> >>>>> problem becomes more prevalent as Ruby adds flow control, unless we
> >>>>> add a limitation on outstanding requests to memory from directory
> >>>>> controllers.
> >>>>>
> >>>>> How does the classic memory model deal with this?
> >>>>>
> >>>>>   Joel
> >>>>>
> >>>>>
> >>>>>
> >>>>>> From: Joel Hestness <[email protected]>
> >>>>>> Date: Tuesday, 22 September 2015 17:30
> >>>>>> To: Andreas Hansson <[email protected]>
> >>>>>> Cc: gem5 Developer List <[email protected]>
> >>>>>>
> >>>>>> Subject: Re: [gem5-dev] Review Request 3116: ruby: RubyMemoryControl
> >>>>>> delete requests
> >>>>>>
> >>>>>> Hi guys,
> >>>>>>   Thanks for the discussion here. I had quickly tested other memory
> >>>>>> controllers, but hadn't connected the dots that this might be the
> >>>>>> same problem Brad/AMD are running into.
> >>>>>>
> >>>>>>   My preference would be that we remove the QueuedSlavePort from
> >>>>>> the DRAMCtrls. That would at least eliminate DRAMCtrls as a
> >>>>>> potential source of the QueuedSlavePort packet overflows, and would
> >>>>>> allow us to more closely focus on the RubyPort problem when we get
> >>>>>> to it.
> >>>>>>
> >>>>>>   Can we reach resolution on this patch though? Are we okay with
> >>>>>> actually fixing the memory leak in mainline?
> >>>>>>
> >>>>>>   Joel
> >>>>>>
> >>>>>>
> >>>>>> On Tue, Sep 22, 2015 at 11:19 AM, Andreas Hansson <
> >>>>>> [email protected]> wrote:
> >>>>>>
> >>>>>>> Hi Brad,
> >>>>>>>
> >>>>>>> We can remove the use of QueuedSlavePort in the memory controller
> >>>>>>> and simply not accept requests if the response queue is full. Is
> >>>>>>> this needed? If so we’ll make sure someone gets this in place. The
> >>>>>>> only reason we haven’t done it is because it hasn’t been needed.
> >>>>>>>
> >>>>>>> The use of QueuedPorts in the Ruby adapters is a whole different
> >>>>>>> story. I think most of these can be removed and actually use flow
> >>>>>>> control. I’m happy to code it up, but there is such a flux at the
> >>>>>>> moment that I didn’t want to post yet another patch changing the
> >>>>>>> Ruby port. I really do think we should avoid having implicit
> >>>>>>> buffers for 1000’s of kilobytes to the largest extent possible. If
> >>>>>>> we really need a constructor parameter to make it “infinite” for
> >>>>>>> some quirky Ruby use-case, then let’s do that...
> >>>>>>>
> >>>>>>> Andreas
> >>>>>>>
> >>>>>>>
> >>>>>>> On 22/09/2015 17:14, "gem5-dev on behalf of Beckmann, Brad"
> >>>>>>> <[email protected] on behalf of [email protected]>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>> >From AMD's perspective, we have deprecated our usage of
> >>>>>>> >RubyMemoryControl and we are using the new Memory Controllers with
> >>>>>>> >the port interface.
> >>>>>>> >
> >>>>>>> >That being said, I completely agree with Joel that the packet
> >>>>>>> >queue's finite invisible buffer limit of 100 needs to go!  As you
> >>>>>>> >know, we tried very hard several months ago to essentially make
> >>>>>>> >this an infinite buffer, but Andreas would not allow us to check
> >>>>>>> >it in.  We are going to post that patch again in a few weeks when
> >>>>>>> >we post our GPU model.  Our GPU model will not work unless we
> >>>>>>> >increase that limit.
> >>>>>>> >
> >>>>>>> >Andreas, you keep arguing that if you exceed that limit, something
> >>>>>>> >is fundamentally broken.  Please keep in mind that there are many
> >>>>>>> >uses of gem5 beyond what you use it for.  Also, this is a research
> >>>>>>> >simulator and we should not restrict ourselves to what we think is
> >>>>>>> >practical in real hardware.  Finally, the fact that the finite
> >>>>>>> >limit is invisible to the producer is just bad software
> >>>>>>> >engineering.
> >>>>>>> >
> >>>>>>> >I beg you to please allow us to remove this finite invisible
> >>>>>>> >limit!
> >>>>>>> >
> >>>>>>> >Brad
> >>>>>>> >
> >>>>>>> >
> >>>>>>> >
> >>>>>>> >-----Original Message-----
> >>>>>>> >From: gem5-dev [mailto:[email protected]] On Behalf Of
> >>>>>>> Andreas
> >>>>>>> >Hansson
> >>>>>>> >Sent: Tuesday, September 22, 2015 6:35 AM
> >>>>>>> >To: Andreas Hansson; Default; Joel Hestness
> >>>>>>> >Subject: Re: [gem5-dev] Review Request 3116: ruby:
> RubyMemoryControl
> >>>>>>> >delete requests
> >>>>>>> >
> >>>>>>> >
> >>>>>>> >
> >>>>>>> >> On Sept. 21, 2015, 8:42 a.m., Andreas Hansson wrote:
> >>>>>>> >> > Can we just prune the whole RubyMemoryControl rather? Has it
> >>>>>>> >> > not been deprecated long enough?
> >>>>>>> >>
> >>>>>>> >> Joel Hestness wrote:
> >>>>>>> >>     Unless I'm overlooking something, for Ruby users, I don't
> >>>>>>> >> see other memory controllers that are guaranteed to work.
> >>>>>>> >> Besides RubyMemoryControl, all others use a QueuedSlavePort for
> >>>>>>> >> their input queues. Given that Ruby hasn't added complete flow
> >>>>>>> >> control, PacketQueue size restrictions can be exceeded
> >>>>>>> >> (triggering the panic). This occurs infrequently/irregularly
> >>>>>>> >> with aggressive GPUs in gem5-gpu, and appears difficult to fix
> >>>>>>> >> in a systematic way.
> >>>>>>> >>
> >>>>>>> >>     Regardless of the fact we've deprecated RubyMemoryControl,
> >>>>>>> >> this is a necessary fix.
> >>>>>>> >
> >>>>>>> >No memory controller is using QueuedSlavePort for any _input_
> >>>>>>> >queues. The DRAMCtrl class uses it for the response _output_
> >>>>>>> >queue, that's all. If that is really an issue we can move away
> >>>>>>> >from it and enforce an upper bound on responses by not accepting
> >>>>>>> >new requests. That said, if we hit the limit I would argue
> >>>>>>> >something else is fundamentally broken in the system and should
> >>>>>>> >be addressed.
> >>>>>>> >
> >>>>>>> >In any case, the discussion whether to remove RubyMemoryControl
> >>>>>>> >or not should be completely decoupled.
> >>>>>>> >
> >>>>>>> >
> >>>>>>> >- Andreas
> >>>>>>>
> >>>>>>
> >>
> >> --
> >>   Joel Hestness
> >>   PhD Candidate, Computer Architecture
> >>   Dept. of Computer Science, University of Wisconsin - Madison
> >>   http://pages.cs.wisc.edu/~hestness/
> >>
> >
> >
> >
> > --
> >   Joel Hestness
> >   PhD Candidate, Computer Architecture
> >   Dept. of Computer Science, University of Wisconsin - Madison
> >   http://pages.cs.wisc.edu/~hestness/
> >
>
>
>
> --
>   Joel Hestness
>   PhD Candidate, Computer Architecture
>   Dept. of Computer Science, University of Wisconsin - Madison
>   http://pages.cs.wisc.edu/~hestness/
> _______________________________________________
> gem5-dev mailing list
> [email protected]
> http://m5sim.org/mailman/listinfo/gem5-dev
>


-- 
  Joel Hestness
  PhD Candidate, Computer Architecture
  Dept. of Computer Science, University of Wisconsin - Madison
  http://pages.cs.wisc.edu/~hestness/