On Tue, 19 May 2015, Joel Hestness wrote:
My first instinct is that it's a very bad idea to do GPU coalescing in
the RubyPort. The RubyPort is a thin shim that already does too many
things (and poorly in a couple of cases). However, without seeing the
GPU code, I expect it would be hard for you to communicate the
constraints on where to do coalescing (e.g. checkpointing, address
translation).
Another aspect to this buffering problem is that you're changing the
L1/L0 and Sequencer clock domains back to Ruby's clock. By default,
Ruby's clock is 2 GHz, and most GPUs run at a lower frequency than
this. If the GPU's eject width is, say, 1 packet per lane per GPU
cycle, of course you're going to pile up packets within the RubyPort on
the response path. This sounds unbalanced to me, especially on a
response path.
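
For illustration only, the rate mismatch described above can be put
into numbers with a back-of-the-envelope sketch. This is not gem5 code:
the function name, the 700 MHz GPU clock, and the assumption that Ruby
hands back one response per Ruby cycle to a single lane are all
hypothetical, chosen just to show how a backlog would grow.

```python
def response_backlog(ruby_hz, gpu_hz, duration_us,
                     responses_per_ruby_cycle=1,
                     ejects_per_gpu_cycle=1):
    """Packets left queued on one lane after duration_us microseconds.

    Assumes (hypothetically) that both sides run at their peak rate for
    the whole window: Ruby delivers responses every Ruby cycle and the
    GPU lane ejects packets every GPU cycle.
    """
    # Cycles elapsed in the window, using integer arithmetic.
    ruby_cycles = ruby_hz * duration_us // 1_000_000
    gpu_cycles = gpu_hz * duration_us // 1_000_000

    arrived = ruby_cycles * responses_per_ruby_cycle
    drained = gpu_cycles * ejects_per_gpu_cycle

    # The queue cannot go negative; a faster consumer just idles.
    return max(0, arrived - drained)


if __name__ == "__main__":
    # 2 GHz Ruby clock (the default mentioned above) against an
    # assumed 700 MHz GPU clock, over a 1 microsecond window.
    print(response_backlog(2_000_000_000, 700_000_000, 1))
```

Under these assumptions the queue grows whenever the Ruby-side rate
exceeds the lane's eject rate, and stays empty when the two clocks
match.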
In my opinion, those changes to the clock domains are incorrect. As I
see it, AMD is trying to fix a problem with the Ruby tester by changing
unrelated code.
--
Nilay
_______________________________________________
gem5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/gem5-dev