-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/3773/#review9318
-----------------------------------------------------------

Ship it!


Ship It!

- Brad Beckmann


On Jan. 25, 2017, 6:49 a.m., Joel Hestness wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/3773/
> -----------------------------------------------------------
>
> (Updated Jan. 25, 2017, 6:49 a.m.)
>
>
> Review request for Default.
>
>
> Repository: gem5
>
>
> Description
> -------
>
> Changeset 11802:ca5c5b982ea5
> ---------------------------
> ruby: PerfectSwitch add assured access arbitration
>
> When operating near bandwidth saturation and using finite cache hierarchy
> buffering, the round-robin arbitration in the PerfectSwitch caused low-ID
> input buffers to gain access to the switch more frequently than other input
> buffers that might contain requests. This resulted from the priority cycling
> starting on input buffers with no pending requests and cycling around to the
> low-ID buffers with pending requests. Part of the problem was that
> input-to-output port allocation was done on-the-fly while cycling through
> input ports.
>
> To fix this, refactor the PerfectSwitch to remove on-the-fly arbitration, and
> better delineate port allocation from switch traversal. Then, implement
> cycling-priority assured access arbitration using output port request batches
> to ensure that all input ports are given the same priority when buffers are
> full.
>
> This fix reduces GPU core progress asymmetry from >3x down to <12%, which is
> in line with hardware.
>
>
> Diffs
> -----
>
>   src/mem/ruby/network/simple/PerfectSwitch.hh cd7f3a1dbf55
>   src/mem/ruby/network/simple/PerfectSwitch.cc cd7f3a1dbf55
>
> Diff: http://reviews.gem5.org/r/3773/diff/
>
>
> Testing
> -------
>
> Extensive testing and use in gem5-gpu. Used GPU to saturate cache hierarchy
> bandwidth, and tracked threadblock progress to witness asymmetry. Repeated
> this testing after the fix to see greatly reduced asymmetry. Also, in these
> small tests, simulator run time improves slightly due to the reduced amount
> of work performed by PerfectSwitch arbitration. I have also run thousands of
> simulations with this patch to verify that the changes work for a wide
> range of simulated system behaviors.
>
>
> Thanks,
>
> Joel Hestness
>
>

_______________________________________________
gem5-dev mailing list
http://m5sim.org/mailman/listinfo/gem5-dev
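
For readers who want to see the bias described in the changeset, here is a toy model of the two arbitration styles. It is only a sketch under stated assumptions and is not the PerfectSwitch code: the function names, the single output port granting one message per cycle, and the "always saturated" input set are all hypothetical simplifications chosen for illustration.

```python
from collections import Counter, deque


def rotating_start_grants(num_inputs, busy_inputs, cycles):
    """Toy on-the-fly round-robin: the scan start advances every cycle,
    even when it lands on an input with nothing pending.
    Assumption: inputs in busy_inputs always have a pending request."""
    grants = Counter()
    for cycle in range(cycles):
        start = cycle % num_inputs
        for offset in range(num_inputs):
            port = (start + offset) % num_inputs
            if port in busy_inputs:
                grants[port] += 1  # first busy input found wins this cycle
                break
    return grants


def batched_grants(num_inputs, busy_inputs, cycles):
    """Toy assured-access arbitration: snapshot every requesting input
    into a batch, then serve each batch member once before a new batch
    is formed. Same saturation assumption as above."""
    grants = Counter()
    batch = deque()
    for _ in range(cycles):
        if not batch:
            batch.extend(p for p in range(num_inputs) if p in busy_inputs)
        if batch:
            grants[batch.popleft()] += 1
    return grants


if __name__ == "__main__":
    busy = {0, 1, 2}  # hypothetical: only the three lowest-ID inputs are saturated
    print("rotating start:", dict(rotating_start_grants(8, busy, 800)))
    print("batched:       ", dict(batched_grants(8, busy, 800)))
```

With 8 inputs of which only inputs 0 to 2 are saturated, the rotating-start model grants input 0 six times out of every eight cycles, because every scan that starts on an idle input wraps around to input 0 first. The batched model splits grants evenly among the three requesters. The real patch also cycles priority between batches and handles multiple output ports, which this sketch leaves out.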
