Hi Joel,

I don’t think there is any big difference in our expectations, quite the contrary :-). GPUs are very important to us (and so is throughput computing in general), and we run plenty of simulations with lots of memory-level parallelism from non-CPU components. Still, we haven’t run into the issue.
If you have practical examples that run into problems let me know, and we’ll get it fixed.

Andreas

From: Joel Hestness <jthestn...@gmail.com>
Date: Tuesday, 22 September 2015 19:58
To: Andreas Hansson <andreas.hans...@arm.com>
Cc: gem5 Developer List <gem5-dev@gem5.org>
Subject: Re: [gem5-dev] Review Request 3116: ruby: RubyMemoryControl delete requests

Hi Andreas,

If it is a real problem affecting end users I am indeed volunteering to fix the DRAMCtrl use of QueuedSlavePort. In the classic memory system there are enough points of regulation (LSQs, MSHR limits, crossbar layers etc) that having a single memory channel with >100 queued up responses waiting to be sent is extremely unlikely. Hence, until now the added complexity has not been needed. If there is regulation on the number of requests in Ruby, then I would argue that it is equally unlikely there…I could be wrong.

Ok. I think a big part of the difference between our expectations is just the cores that we're modeling. AMD and gem5-gpu can model aggressive GPU cores with potential to expose, perhaps, 4-32x more memory-level parallel requests than a comparable number of multithreaded CPU cores. I feel that this difference warrants different handling of accesses in the memory controller.

Joel

From: Joel Hestness <jthestn...@gmail.com>
Date: Tuesday, 22 September 2015 17:48
To: Andreas Hansson <andreas.hans...@arm.com>
Cc: gem5 Developer List <gem5-dev@gem5.org>
Subject: Re: [gem5-dev] Review Request 3116: ruby: RubyMemoryControl delete requests

Hi Andreas,

Thanks for the "ship it!"

Do we really need to remove the use of QueuedSlavePort in DRAMCtrl? It will make the controller more complex, and I don’t want to do it “just in case”.

Sorry, I misread your email as offering to change the DRAMCtrl.
I'm not sure who should make that change, but I think it should get done. The memory access response path starts at the DRAMCtrl and ends at the RubyPort. If we add flow control to the RubyPort, packets will probably back up more quickly on the response path back to where there are open buffers. I expect the DRAMCtrl QueuedPort problem becomes more prevalent as Ruby adds flow control, unless we add a limitation on outstanding requests to memory from directory controllers.

How does the classic memory model deal with this?

Joel

From: Joel Hestness <jthestn...@gmail.com>
Date: Tuesday, 22 September 2015 17:30
To: Andreas Hansson <andreas.hans...@arm.com>
Cc: gem5 Developer List <gem5-dev@gem5.org>
Subject: Re: [gem5-dev] Review Request 3116: ruby: RubyMemoryControl delete requests

Hi guys,

Thanks for the discussion here. I had quickly tested other memory controllers, but hadn't connected the dots that this might be the same problem Brad/AMD are running into.

My preference would be that we remove the QueuedSlavePort from the DRAMCtrls. That would at least eliminate DRAMCtrls as a potential source of the QueuedSlavePort packet overflows, and would allow us to more closely focus on the RubyPort problem when we get to it.

Can we reach resolution on this patch though? Are we okay with actually fixing the memory leak in mainline?

Joel

On Tue, Sep 22, 2015 at 11:19 AM, Andreas Hansson <andreas.hans...@arm.com> wrote:

Hi Brad,

We can remove the use of QueuedSlavePort in the memory controller and simply not accept requests if the response queue is full. Is this needed? If so we’ll make sure someone gets this in place. The only reason we haven’t done it is because it hasn’t been needed.

The use of QueuedPorts in the Ruby adapters is a whole different story. I think most of these can be removed and actually use flow control.
I’m happy to code it up, but there is such a flux at the moment that I didn’t want to post yet another patch changing the Ruby port. I really do think we should avoid having implicit buffers for 1000’s of kilobytes to the largest extent possible. If we really need a constructor parameter to make it “infinite” for some quirky Ruby use-case, then let’s do that...

Andreas

On 22/09/2015 17:14, "gem5-dev on behalf of Beckmann, Brad" <gem5-dev-boun...@gem5.org on behalf of brad.beckm...@amd.com> wrote:

>From AMD's perspective, we have deprecated our usage of RubyMemoryControl
>and we are using the new Memory Controllers with the port interface.
>
>That being said, I completely agree with Joel that the packet queue
>finite invisible buffer limit of 100 needs to go! As you know, we tried
>very hard several months ago to essentially make this an infinite buffer,
>but Andreas would not allow us to check it in. We are going to post that
>patch again in a few weeks when we post our GPU model. Our GPU model
>will not work unless we increase that limit.
>
>Andreas, you keep arguing that if you exceed that limit, something is
>fundamentally broken. Please keep in mind that there are many uses of
>gem5 beyond what you use it for. Also this is a research simulator and
>we should not restrict ourselves to what we think is practical in real
>hardware. Finally, the fact that the finite limit is invisible to the
>producer is just bad software engineering.
>
>I beg you to please allow us to remove this finite invisible limit!
>
>Brad
>
>
>
>-----Original Message-----
>From: gem5-dev [mailto:gem5-dev-boun...@gem5.org] On Behalf Of Andreas Hansson
>Sent: Tuesday, September 22, 2015 6:35 AM
>To: Andreas Hansson; Default; Joel Hestness
>Subject: Re: [gem5-dev] Review Request 3116: ruby: RubyMemoryControl delete requests
>
>
>
>> On Sept. 21, 2015, 8:42 a.m., Andreas Hansson wrote:
>> > Can we just prune the whole RubyMemoryControl rather? Has it not been
>> > deprecated long enough?
>>
>> Joel Hestness wrote:
>>     Unless I'm overlooking something, for Ruby users, I don't see other
>>     memory controllers that are guaranteed to work. Besides
>>     RubyMemoryControl, all others use a QueuedSlavePort for their input
>>     queues. Given that Ruby hasn't added complete flow control, PacketQueue
>>     size restrictions can be exceeded (triggering the panic). This occurs
>>     infrequently/irregularly with aggressive GPUs in gem5-gpu, and appears
>>     difficult to fix in a systematic way.
>>
>>     Regardless of the fact we've deprecated RubyMemoryControl, this is
>>     a necessary fix.
>
>No memory controller is using QueuedSlavePort for any _input_ queues.
>The DRAMCtrl class uses it for the response _output_ queue, that's all.
>If that is really an issue we can move away from it and enforce an upper
>bound on responses by not accepting new requests. That said, if we hit
>the limit I would argue something else is fundamentally broken in the
>system and should be addressed.
>
>In any case, the discussion whether to remove RubyMemoryControl or not
>should be completely decoupled.
>
>
>- Andreas
>
>
>-----------------------------------------------------------
>This is an automatically generated e-mail. To reply, visit:
>http://reviews.gem5.org/r/3116/#review7226
>-----------------------------------------------------------
>
>
>On Sept. 16, 2015, 6:07 p.m., Joel Hestness wrote:
>>
>> -----------------------------------------------------------
>> This is an automatically generated e-mail. To reply, visit:
>> http://reviews.gem5.org/r/3116/
>> -----------------------------------------------------------
>>
>> (Updated Sept. 16, 2015, 6:07 p.m.)
>>
>>
>> Review request for Default.
>>
>>
>> Repository: gem5
>>
>>
>> Description
>> -------
>>
>> Changeset 11093:b3044de6ce9c
>> ---------------------------
>> ruby: RubyMemoryControl delete requests
>>
>> Changes to the RubyMemoryControl removed the dequeue function, which
>> deleted MemoryNode instances. This results in leaked MemoryNode
>> instances. Correctly delete these instances.
>>
>>
>> Diffs
>> -----
>>
>>   src/mem/ruby/structures/RubyMemoryControl.cc 62e1504b9c64
>>
>> Diff: http://reviews.gem5.org/r/3116/diff/
>>
>>
>> Testing
>> -------
>>
>> Compiled gem5.debug with --without-tcmalloc. Ran large tests with
>> Valgrind.
>>
>>
>> Thanks,
>>
>> Joel Hestness
>>
>>
>
>_______________________________________________
>gem5-dev mailing list
>gem5-dev@gem5.org
>http://m5sim.org/mailman/listinfo/gem5-dev

________________________________

-- IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.

--
  Joel Hestness
  PhD Candidate, Computer Architecture
  Dept. of Computer Science, University of Wisconsin - Madison
  http://pages.cs.wisc.edu/~hestness/
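
For readers following the thread, the option discussed above (drop the implicitly buffered response port and instead refuse new requests once the response queue is full) can be sketched roughly as below. This is a toy Python model, not gem5 code: the class BoundedMemCtrl, its methods, and the resp_limit parameter are all hypothetical names chosen for illustration, and real gem5 ports signal retries differently. The point is only that the limit becomes visible to the producer, which sees a refusal and a later retry instead of an overflow panic.

```python
from collections import deque


class BoundedMemCtrl:
    """Toy memory controller with an explicit, finite response budget.

    Hypothetical model for illustration only; not the gem5 DRAMCtrl API.
    """

    def __init__(self, resp_limit):
        self.resp_limit = resp_limit   # max requests + responses held at once
        self.pending = deque()         # accepted requests not yet serviced
        self.responses = deque()       # serviced requests waiting to be sent
        self.retry_needed = False      # a requester was refused and must retry

    def occupancy(self):
        # Every accepted request will need a response slot, so count both.
        return len(self.pending) + len(self.responses)

    def try_request(self, addr):
        """Accept a request only if a response slot is guaranteed for it."""
        if self.occupancy() >= self.resp_limit:
            self.retry_needed = True   # remember to tell the requester later
            return False
        self.pending.append(addr)
        return True

    def service_one(self):
        """Turn the oldest pending request into a queued response."""
        if self.pending:
            self.responses.append(self.pending.popleft())

    def send_response(self):
        """Send one response; report whether a refused requester may retry."""
        if not self.responses:
            return None, False
        addr = self.responses.popleft()
        retry, self.retry_needed = self.retry_needed, False
        return addr, retry


# Usage: with a budget of two, the third request is refused until a
# response has been sent and a slot is free again.
ctrl = BoundedMemCtrl(resp_limit=2)
accepted = [ctrl.try_request(a) for a in (0x0, 0x40, 0x80)]
ctrl.service_one()
sent, retry = ctrl.send_response()
accepted_after_retry = ctrl.try_request(0x80)
```

The design choice here mirrors the argument in the thread: backpressure moves the buffering decision to the requester, at the cost of some extra bookkeeping in the controller.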