-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2787/#review6262
-----------------------------------------------------------

From experience with the O3 CPU, this is a VERY important change for simulated CPU performance. I appreciate the effort to finally fix this.

It would be nice for the Ruby and gem5-classic memory hierarchies to provide the same access interface, but I think the consistency implications of this patch need to be discussed. I'm worried that this patch is likely to upset consistency models for cores that may have relied on Ruby to block aliased memory accesses. Specifically, if a core was blocking multiple outstanding accesses to a single line as a way to enforce consistency for data in that line (e.g. TSO), but those accesses can now be issued to Ruby concurrently, it seems it would now be the responsibility of the sequencer, and maybe even the coherence protocol, to ensure that those accesses remain ordered as required.

Given the behavior of the O3 CPU, perhaps the classic memory hierarchy allows multiple outstanding accesses to a single line. However, it handles transient coherence states with atomic coherence updates, which make it much easier to guarantee access ordering to a single line, so it is not clear to me that it exposes the same interface as this patch provides.

Are you sure that all CPU cores that work with Ruby, and all existing protocols, still enforce correct consistency?

- Joel Hestness


On May 11, 2015, 10:22 p.m., Tony Gutierrez wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/2787/
> -----------------------------------------------------------
>
> (Updated May 11, 2015, 10:22 p.m.)
>
>
> Review request for Default.
>
>
> Repository: gem5
>
>
> Description
> -------
>
> Changeset 10844:0848038fe1d8
> ---------------------------
> ruby: Fixed pipeline squashes caused by aliased requests
>
> This patch was created by Bihn Pham during his internship at AMD.
>
> This patch fixes a very significant performance bug when using the O3 CPU
> model and Ruby. The issue was that Ruby returned false when it received a
> request to the same address that already has an outstanding request or when
> the memory is blocked. As a result, O3 unnecessarily squashed the pipeline
> and re-executed instructions. This fix merges readRequestTable and
> writeRequestTable in Sequencer into a single request table that keeps track
> of all requests and allows multiple outstanding requests to the same
> address. This prevents O3 from squashing the pipeline.
>
>
> Diffs
> -----
>
>   src/mem/ruby/system/Sequencer.hh fbdaa08aaa426b9f4660c366f934ccb670d954ec
>   src/mem/ruby/system/Sequencer.cc fbdaa08aaa426b9f4660c366f934ccb670d954ec
>
> Diff: http://reviews.gem5.org/r/2787/diff/
>
>
> Testing
> -------
>
>
> Thanks,
>
> Tony Gutierrez
>
>

_______________________________________________
gem5-dev mailing list
http://m5sim.org/mailman/listinfo/gem5-dev
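
For readers following the thread, the merged request table from the patch description, and the per-line ordering concern raised in the review, can be illustrated with a rough sketch. This is not the code in Sequencer.hh or Sequencer.cc: the names MergedRequestTable, Request, ReqType, insert and complete are hypothetical, and the sketch assumes a power-of-two line size and that one response can satisfy every queued request to a line (it ignores permission upgrades and the coherence protocol entirely). It only shows one way a single table could accept aliased requests while keeping them in issue order per line.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <unordered_map>
#include <vector>

// Hypothetical types for illustration; not gem5's actual Sequencer code.
enum class ReqType { Load, Store };

struct Request {
    std::uint64_t id;    // issue order assigned by the core
    std::uint64_t addr;  // byte address of the access
    ReqType type;
};

// One table for loads and stores, keyed by cache-line address.
// Each line keeps a FIFO of its outstanding requests.
class MergedRequestTable {
  public:
    explicit MergedRequestTable(std::uint64_t lineBytes = 64)
        : lineMask(~(lineBytes - 1))
    {
        // Assumption: the line size is a non-zero power of two.
        assert(lineBytes != 0 && (lineBytes & (lineBytes - 1)) == 0);
    }

    // Always accepts the request, even when the line already has
    // outstanding requests. Returns true only for the first request to a
    // line, i.e. when a new request would have to be sent to the cache.
    bool insert(const Request &req) {
        auto &queue = table[req.addr & lineMask];
        queue.push_back(req);
        return queue.size() == 1;
    }

    // Called when the line becomes available: hands back every queued
    // request for that line in the order it was inserted.
    std::vector<Request> complete(std::uint64_t addr) {
        std::vector<Request> done;
        auto it = table.find(addr & lineMask);
        if (it == table.end())
            return done;
        done.assign(it->second.begin(), it->second.end());
        table.erase(it);
        return done;
    }

    std::size_t outstandingLines() const { return table.size(); }

  private:
    std::uint64_t lineMask;
    std::unordered_map<std::uint64_t, std::deque<Request>> table;
};
```

In this sketch the FIFO per line is what preserves ordering among aliased accesses. Whether that is sufficient for a given consistency model such as TSO depends on the core and the protocol, which is the open question in the review above.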
