Re: [gem5-dev] Review Request 2466: ruby: provide a second copy of the memory

Joel Hestness via gem5-dev Wed, 22 Oct 2014 20:57:41 -0700

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2466/#review5425
-----------------------------------------------------------

I'd like to weigh in on this. I, too, am confused about the need for this 
change and would be very conflicted about changing/removing Ruby's 
functionally-coherent store:

tl;dr: For everyone's sake, I describe (my understanding of) the uses of Ruby's 
functionally-coherent backing store. I then explain why I feel that 
changing/removing it fails to make progress toward addressing the real issues 
with Ruby functional accesses.

I haven't seen a thorough explanation of this anywhere despite these review 
requests proposing substantial changes, so here's a write-up of my 
understanding of the Ruby functionally-coherent store:

First, so we're on the same page: Ruby can maintain different versions of data 
in the caches, and the coherence/validity of that data depends on the data's 
state and the coherence protocol's interpretation of that state. The states and 
their implied data validity can be very different across different coherence 
protocols, as can the use of cache controller queues and interconnect message 
buffers for managing state and data. Ruby's functionally-coherent data store 
(what's on the chopping block here) decouples the state+transition design of a 
protocol from the data handling by providing a place to store the known current 
(coherent) version of the data that is readily accessible from any controller.

The functionally-coherent store is very useful for developing new coherence 
protocols: When developing, the protocol author often aims to get the apparent 
states and transitions correct, but the data handling part can be a bit 
messier. For example, something as simple as forgetting to copy data from a 
cache block to an MSHR in a protocol transition makes it tricky to debug where 
incorrect data may have arisen. For this reason, it is useful to have an always 
functionally-coherent copy of the data that allows the developer to just grab 
the right data at the end of a memory access. This functionality is currently 
available by setting the RubyPort/Sequencer access_phys_mem parameter to True. 
In src/mem/ruby/system/RubyPort.cc, the RubyPort::MemSlavePort::hitCallback() 
function is called at the end of every memory access and decides based on 
access_phys_mem whether to read data from the functionally-coherent store. This 
functionality effectively decouples the development of a protocol's data 
correctness from its state-handling correctness.
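
To make that decision concrete, here is a minimal Python sketch. It is a toy
model, not gem5 code: BackingStore, ToyPort, and hit_callback are hypothetical
names, and only access_phys_mem mirrors the parameter named above. It assumes
a protocol bug that delivers stale data, and shows that a read still returns
the coherent value when access_phys_mem is set.

```python
# Toy model only: these classes are hypothetical and are not gem5 code.

class BackingStore:
    """Always holds the current (coherent) value for each address."""

    def __init__(self):
        self.mem = {}

    def write(self, addr, value):
        self.mem[addr] = value

    def read(self, addr):
        # Unwritten addresses read as 0 in this model (an assumption).
        return self.mem.get(addr, 0)


class ToyPort:
    """Stand-in for a RubyPort; access_phys_mem mirrors the parameter name."""

    def __init__(self, backing_store, access_phys_mem):
        self.backing_store = backing_store
        self.access_phys_mem = access_phys_mem

    def hit_callback(self, addr, is_write, protocol_data, store_value=None):
        """Run at the end of every access in this model.

        protocol_data is whatever the (possibly buggy) protocol delivered.
        """
        if is_write:
            if self.access_phys_mem:
                self.backing_store.write(addr, store_value)
            return None
        if self.access_phys_mem:
            # Ignore the protocol's data and take the known-coherent copy.
            return self.backing_store.read(addr)
        return protocol_data


store = BackingStore()
port_with = ToyPort(store, access_phys_mem=True)
port_without = ToyPort(store, access_phys_mem=False)

port_with.hit_callback(0x40, True, None, store_value=7)
stale = 0  # pretend a transition forgot to copy data into the MSHR
print(port_with.hit_callback(0x40, False, stale))     # coherent copy
print(port_without.hit_callback(0x40, False, stale))  # protocol's stale data
```

In this model the state machine can be debugged first, and the data path can
be checked later by flipping the flag off and comparing results.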

While it is useful in protocol development, the functionally-coherent store can 
actually provide MUCH more (I think): truly functional memory access, which 
Ruby doesn't fully support currently. gem5 is built following a principle that 
it should support functional memory accesses as a way to allow programmers to 
functionally implement interesting new simulator behaviors before deciding 
whether they should be implemented more completely. Ruby's 
functionally-coherent store can provide this functionality by ensuring that, 
regardless of the state of data in the Ruby caches AND regardless of the 
coherence protocol, we can always access the coherent version of the data when 
access_phys_mem = True for all requesting RubyPorts (at this point, I'll note 
that for gem5-gpu, we maintain a tiny Ruby patch to get this to appear to work 
completely and I've heard through the grapevine that AMD has done something 
similar).

So, what I'm saying is that Ruby doesn't actually fully support functional 
memory accesses (but can with the backing store)... Ruby functional memory 
access support would be hard to implement, so no one has taken the plunge. Due 
to the variability in what various coherence protocols try to implement, it is 
often difficult to decide how to handle functional data accesses. For example, 
when data is being moved between caches (i.e. it resides in MSHRs, controller 
queues, or the interconnect), it is not always obvious where the current 
(coherent) version of the data resides. In such a situation, Ruby currently 
doesn't have a way to guarantee correctness of the functional access, so it 
gives up and exits simulation with a fatal() in 
RubyPort::MemSlavePort::recvFunctional() (this sucks). The code that tries to 
figure this out is in the Ruby system functionalRead or functionalWrite 
functions, which perform heavy-duty look-ups for the data across all the cache 
controllers, and these look-ups can fail to return/update 
functionally-coherent data under many different conditions.
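
The failure mode can be sketched in Python as follows. This is a toy model
under stated assumptions, not the actual Ruby look-up logic: the state names,
ToyController, FunctionalAccessError, and functional_read are all hypothetical.
It assumes a functional read succeeds only if no controller holds the block in
a transient state, and that an optional always-coherent backing store bypasses
the search entirely.

```python
# Toy model only: states, classes, and function names are hypothetical.

STABLE_READABLE = {"S", "M"}    # states whose cached data is assumed valid
TRANSIENT = {"IM", "SM", "MI"}  # block is mid-transition; data may be in flight


class FunctionalAccessError(RuntimeError):
    """Plays the role of the fatal() described above."""


class ToyController:
    def __init__(self):
        self.blocks = {}  # addr -> (state, data)

    def lookup(self, addr):
        return self.blocks.get(addr)


def functional_read(controllers, memory, addr, backing_store=None):
    """Return the coherent value for addr, or raise if it cannot be located."""
    if backing_store is not None:
        # With a functionally-coherent store there is nothing to search.
        return backing_store.get(addr, 0)

    valid = None
    for ctrl in controllers:
        entry = ctrl.lookup(addr)
        if entry is None:
            continue
        state, data = entry
        if state in TRANSIENT:
            # The current version may sit in an MSHR, queue, or the network.
            raise FunctionalAccessError(f"block {addr:#x} is in flight")
        if state in STABLE_READABLE:
            valid = data

    # No cached copy found: fall back to off-chip memory.
    return valid if valid is not None else memory.get(addr, 0)


l1, l2 = ToyController(), ToyController()
l1.blocks[0x80] = ("S", 5)
print(functional_read([l1, l2], {}, 0x80))  # shared copy is readable

l2.blocks[0x80] = ("IM", None)              # another cache starts a transition
try:
    functional_read([l1, l2], {}, 0x80)
except FunctionalAccessError as err:
    print("functional read failed:", err)

print(functional_read([l1, l2], {}, 0x80, backing_store={0x80: 5}))
```

The third call shows why the backing store matters here: it answers without
having to decide which in-flight copy is current.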

It's worth noting that since gem5's syscall emulation uses functional accesses 
in a few cases, anyone using Ruby with SE mode and those system calls is 
actually getting lucky by not running into the functional access fatal().

My (our) stake in Ruby functional accesses:

This is largely informed by my experience with functional memory accesses for 
rapid development of gem5-gpu. In gem5-gpu, we rely fairly heavily on 
functional accesses as a way to implement a slim and flexible CUDA runtime 
library. For example, when a CUDA benchmark starts executing, we functionally 
read the binary out of the simulated system's memory to hand over to GPGPU-Sim, 
which needs the GPU code to simulate the GPU cores. In the substantial majority 
of cases, these sorts of functional accesses don't trigger Ruby's functional 
access fatal(), because they are situations that Ruby can handle (e.g. reading 
from cache data in a shared state or from off-chip memory). However, we 
inevitably run into the functional access fatal() here-and-there, just because 
we're testing so many different things. Also, while we could eliminate a fair 
number of functional accesses, there are a few places in our CUDA runtime that 
would still require functional accesses. Finally, I am aware of at least 3 
different Ruby coherence protocols that rely on the 
functionally-coherent backing store: the primary protocol included with 
gem5-gpu ("VI_hammer"), and 2 others that have been developed by gem5-gpu users 
(note: gem5-gpu recently passed 100 total mailing list subscribers and 300+ 
downloads - Woohoo!).

My take:

Sure, I feel that it's very important for Ruby to completely support functional 
accesses going forward, which would suggest that we could eliminate the 
functionally-coherent store. However, I also feel that the ability to use a 
functionally-coherent backing store MUST stay. I suspect it will be very 
desirable to implement thin runtime interfaces for accelerators, and such 
runtimes are likely to use functional accesses. With the potential desire to 
test new coherence protocols to join these heterogeneous cores, it is likely 
that Ruby will be involved and need to handle the functional accesses. This 
suggests we should invest some effort in making Ruby functional accesses more 
robust.

Even if Ruby completely supports functional accesses, a coherence protocol 
developer should NOT be required to get data handling correct when trying to 
implement or hack on a protocol. It's hard enough to get the protocol states 
and transitions correct while making sure you're not inadvertently introducing 
race conditions. So, I feel that the functionally-coherent backing store should 
remain at least as an optional feature to ease the developer's process.

I don't feel that either this patch or the proposal to remove the backing 
store is particularly mindful of these two issues.

- Joel Hestness

On Oct. 21, 2014, 10:01 p.m., Nilay Vaish wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/2466/
> -----------------------------------------------------------
> 
> (Updated Oct. 21, 2014, 10:01 p.m.)
> 
> 
> Review request for Default.
> 
> 
> Repository: gem5
> 
> 
> Description
> -------
> 
> Changeset 10497:587c77ab8adc
> ---------------------------
> ruby: provide a second copy of the memory
> This memory is required in some cases.  It can be enabled by invoking
> the option --access-phys-mem
> 
> 
> Diffs
> -----
> 
>   configs/ruby/Ruby.py ffe6ab7141ab 
>   src/mem/ruby/system/DMASequencer.hh ffe6ab7141ab 
>   src/mem/ruby/system/DMASequencer.cc ffe6ab7141ab 
>   src/mem/ruby/system/RubyPort.hh ffe6ab7141ab 
>   src/mem/ruby/system/RubyPort.cc ffe6ab7141ab 
>   src/mem/ruby/system/RubySystem.py ffe6ab7141ab 
>   src/mem/ruby/system/Sequencer.py ffe6ab7141ab 
>   src/mem/ruby/system/System.hh ffe6ab7141ab 
>   src/mem/ruby/system/System.cc ffe6ab7141ab 
> 
> Diff: http://reviews.gem5.org/r/2466/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Nilay Vaish
> 
>

_______________________________________________
gem5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/gem5-dev
