> On May 29, 2015, 8:01 a.m., Joel Hestness wrote: > > I was buffering these comments, because this change has large implications > > and we still haven't discussed the differences between Ruby and classic > > memory interfaces. The patch submitter should be able to address questions > > like the ones I've raised so far (I don't feel any clarification has been > > provided yet). > > > > Once again, I did more digging: It appears that classic caches completely > > model MSHR queuing and write buffering, and they block requests from the > > core when these are full. This indicates that they accept multiple > > concurrent accesses to a single line (it would be nice to get confirmation > > on that from a classic user). Thus, it makes sense to allow the Ruby > > sequencer to accept multiple outstanding accesses to a single line > > concurrently to be consistent with classic. > > > > Overall, I have significant complaints about this patch, because it > > introduces buffering into the sequencer, which has the effect of > > implementing MSHR queuing that should probably be in the caches to be > > consistent with classic. This breaks the sequencer's thin shim abstraction > > between cores and cache controllers, appears to break no-RFO L1 caches, and > > side-steps cache resource constraints. Detailed comments are below.
To follow up on Joel's high-level question: the classic cache model does indeed combine requests to the same cache block using a finite set of MSHRs, and blocks when the MSHRs are full. Note that this is useful at all levels of the hierarchy, e.g., in a shared L2 you can end up combining requests from different L1s that happen to target the same cache block. The general rule in the classic hierarchy is that a single cache will have at most one outstanding request for a given cache block at any point in time. When a response arrives, the cache will satisfy as many of the buffered requests for the block as it can. Note that the timing is sketchy here on the classic side too; the latency of handling these (potentially multiple) requests is not modeled. It's not clear how large of an effect that is. Note that the set of requests to a given cache block are processed in order, both in the classic cache and (I believe) in this patch as well. (Binh wrote this code originally, but I spent a fair amount of time cleaning it up before it was posted the first time, though that was long enough ago I don't remember all the details off the top of my head.) If stronger ordering guarantees are required by the consistency model, it's the responsibility of the CPU to hold off on issuing requests to the cache appropriately (as Joel saw with the O3 CPU). - Steve ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: http://reviews.gem5.org/r/2787/#review6383 ----------------------------------------------------------- On May 26, 2015, 12:42 p.m., Tony Gutierrez wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > http://reviews.gem5.org/r/2787/ > ----------------------------------------------------------- > > (Updated May 26, 2015, 12:42 p.m.) > > > Review request for Default. > > > Repository: gem5 > > > Description > ------- > > Changeset 10817:d010e6a8e783 > --------------------------- > ruby: Fixed pipeline squashes caused by aliased requests > > This patch was created by Bihn Pham during his internship at AMD. > > This patch fixes a very significant performance bug when using the O3 CPU > model > and Ruby. The issue was Ruby returned false when it received a request to the > same address that already has an outstanding request or when the memory is > blocked. As a result, O3 unnecessary squashed the pipeline and re-executed > instructions. This fix merges readRequestTable and writeRequestTable in > Sequencer into a single request table that keeps track of all requests and > allows multiple outstanding requests to the same address. This prevents O3 > from squashing the pipeline. > > > Diffs > ----- > > src/mem/ruby/system/Sequencer.hh df2aa91dba5b0f0baa351039f0802baad9ed8f1d > src/mem/ruby/system/Sequencer.cc df2aa91dba5b0f0baa351039f0802baad9ed8f1d > > Diff: http://reviews.gem5.org/r/2787/diff/ > > > Testing > ------- > > > Thanks, > > Tony Gutierrez > > _______________________________________________ gem5-dev mailing list [email protected] http://m5sim.org/mailman/listinfo/gem5-dev
