-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/3773/#review9318
-----------------------------------------------------------

Ship it!



- Brad Beckmann


On Jan. 25, 2017, 6:49 a.m., Joel Hestness wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/3773/
> -----------------------------------------------------------
> 
> (Updated Jan. 25, 2017, 6:49 a.m.)
> 
> 
> Review request for Default.
> 
> 
> Repository: gem5
> 
> 
> Description
> -------
> 
> Changeset 11802:ca5c5b982ea5
> ---------------------------
> ruby: PerfectSwitch add assured access arbitration
> 
> When operating near bandwidth saturation and using finite cache hierarchy
> buffering, the round-robin arbitration in the PerfectSwitch caused low ID
> input buffers to gain access to the switch more frequently than other input
> buffers that might contain requests. This resulted from the priority cycling
> starting on input buffers with no pending requests and cycling around to the
> low ID buffers with pending requests. Part of the problem was that
> input-to-output port allocation was done on-the-fly while cycling through
> input ports.
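
The bias described above can be reproduced with a small model. This is a minimal sketch, not the PerfectSwitch code: it assumes a single output port, one grant per cycle, a scan start that advances by one every cycle, and a fixed set of inputs that always have a request pending. The function name and parameters are hypothetical.

```python
from collections import Counter


def rotating_start_grants(num_inputs, backlogged, cycles):
    """Count grants per input when the scan start advances every cycle.

    Simplified model (an assumption, not gem5 code): one output port,
    one grant per cycle, and every input in `backlogged` always has a
    request pending.
    """
    grants = Counter()
    start = 0
    for _ in range(cycles):
        # Scan inputs in circular order from `start`; the first input
        # with a pending request wins this cycle.
        for offset in range(num_inputs):
            candidate = (start + offset) % num_inputs
            if candidate in backlogged:
                grants[candidate] += 1
                break
        # The start moves on even when it pointed at an idle input.
        start = (start + 1) % num_inputs
    return grants


grants = rotating_start_grants(num_inputs=8, backlogged={0, 1}, cycles=800)
print(dict(grants))
```

With only inputs 0 and 1 backlogged out of 8, every scan that starts on an idle input (2 through 7) wraps around and lands on input 0 first, so input 0 wins 7 of every 8 cycles.
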
> 
> To fix this, refactor the PerfectSwitch to remove on-the-fly arbitration, and
> better delineate port allocation from switch traversal. Then, implement
> cycling-priority assured access arbitration using output port request batches
> to ensure that all input ports are given the same priority when buffers are
> full.
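
One way to picture batch-based assured access is sketched below. This is an illustrative model under stated assumptions, not the patch itself: a single output port, one grant per cycle, inputs in `backlogged` always have a request pending, and a new batch is formed only after every member of the current batch has been served. The function name and parameters are hypothetical.

```python
from collections import Counter, deque


def batched_grants(num_inputs, backlogged, cycles):
    """Count grants per input under batch-based arbitration.

    Simplified model (an assumption, not gem5 code): one output port,
    one grant per cycle, and every input in `backlogged` always has a
    request pending.
    """
    grants = Counter()
    batch = deque()  # inputs still owed a grant in the current batch
    first = 0        # cycling priority: scan start for the next batch
    for _ in range(cycles):
        if not batch:
            # Allocation step: snapshot every input that has a request
            # right now, ordered circularly from `first`.
            order = [(first + k) % num_inputs for k in range(num_inputs)]
            batch.extend(i for i in order if i in backlogged)
            first = (first + 1) % num_inputs
        if batch:
            # Traversal step: serve the next member of the batch.
            grants[batch.popleft()] += 1
    return grants


grants = batched_grants(num_inputs=8, backlogged={0, 1}, cycles=800)
print(dict(grants))
```

Because no input can join a new batch until the current one drains, each backlogged input receives exactly one grant per batch, regardless of its ID or where the scan starts.
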
> 
> This fix reduces GPU core progress asymmetry from >3x down to <12%, which is
> in line with hardware.
> 
> 
> Diffs
> -----
> 
>   src/mem/ruby/network/simple/PerfectSwitch.hh cd7f3a1dbf55 
>   src/mem/ruby/network/simple/PerfectSwitch.cc cd7f3a1dbf55 
> 
> Diff: http://reviews.gem5.org/r/3773/diff/
> 
> 
> Testing
> -------
> 
> Extensive testing and use in gem5-gpu. Used the GPU to saturate cache
> hierarchy bandwidth and tracked threadblock progress to observe the
> asymmetry. Repeating this testing after the fix showed greatly reduced
> asymmetry. In these small tests, simulator run time also improves slightly
> because PerfectSwitch arbitration performs less work. I have also run
> thousands of simulations with this patch to verify that the changes work for
> a wide range of simulated system behaviors.
> 
> 
> Thanks,
> 
> Joel Hestness
> 
>

_______________________________________________
gem5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/gem5-dev
