Malek,

TimingSimpleCPUs are in-order CPU models that execute only one instruction at a time, so coalescing at the CPU won't make sense: there will be nothing to coalesce. Unless you want to do your coalescing further down the memory hierarchy, where accesses from different CPUs might meet at a shared cache (which doesn't sound like the situation you are targeting), you'll probably have to use the O3 CPU. However, I'm not totally sure what the state of using Ruby with O3 is right now. Can anyone speak to that?
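To make the "nothing to coalesce" point concrete, here is a toy sketch in plain Python. It is not gem5 or Ruby code: the class `CoalescingBuffer`, the helper `run`, the 64-byte line size, and the `max_outstanding` parameter are all made up for illustration. It assumes a CPU that may only have `max_outstanding` stores in flight has to drain the buffer before issuing the next one.

```python
from collections import OrderedDict

LINE_BYTES = 64  # assumed cache-line size, for illustration only


def line_addr(addr, line_bytes=LINE_BYTES):
    """Align a byte address down to the base of its cache line."""
    return addr - (addr % line_bytes)


class CoalescingBuffer:
    """Hypothetical store buffer holding at most one entry per cache line."""

    def __init__(self, max_outstanding):
        self.max_outstanding = max_outstanding  # stores the CPU may have in flight
        self.entries = OrderedDict()            # line base -> {offset: value}
        self.in_flight = 0                      # stores buffered but not yet drained
        self.merged = 0                         # stores folded into an existing entry
        self.writes_to_cache = 0                # line writes passed on to the cache

    def store(self, addr, value):
        # Assumption: once the CPU hits its limit, it drains before the next store.
        if self.in_flight >= self.max_outstanding:
            self.flush()
        base = line_addr(addr)
        if base in self.entries:
            self.merged += 1
        else:
            self.entries[base] = {}
        self.entries[base][addr - base] = value
        self.in_flight += 1

    def flush(self):
        """Drain every buffered line, e.g. at an acquire/release point."""
        self.writes_to_cache += len(self.entries)
        self.entries.clear()
        self.in_flight = 0


def run(max_outstanding, addrs):
    """Feed a store stream through the buffer; return (merged, writes_to_cache)."""
    buf = CoalescingBuffer(max_outstanding)
    for a in addrs:
        buf.store(a, 0xFF)
    buf.flush()
    return buf.merged, buf.writes_to_cache


stores = [0x1000, 0x1008, 0x1010, 0x2000]
print(run(1, stores))  # one store at a time
print(run(8, stores))  # several stores in flight
```

Under these assumptions, the one-at-a-time case never finds a second store in the buffer to merge with, while the wider window folds the three stores to the line at 0x1000 into a single write.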
Lisa

On Sun, Mar 27, 2011 at 6:19 PM, Malek Musleh <malek.mus...@gmail.com> wrote:

> Hello,
>
> I am interested in implementing a storebuffer (coalescing buffer) for
> Ruby's Memory Model in M5/GEM5 for use in my current research.
>
> I wish to be able to coalesce speculative stores + non-speculative
> stores to the same cache line and then flush them to the cache during
> certain acquire/release constructs.
>
> I see that there was an existing directory called storebuffer, but it was
> removed not too long ago. Reading the associated thread on the mailing
> list, it seems that it was removed because it is not in use (given that
> O3 is not yet functional with Ruby), nor was it ever actually used
> in the original GEMS implementation.
>
> Here is the link to that thread:
> http://www.mail-archive.com/m5-dev@m5sim.org/msg10575.html
>
> In further reading of that thread, I see that there is/was general
> consensus that the Ruby Store Buffer will be merged with M5 O3's LSQ.
>
> For my research, the O3 CPU Model is not a requirement, although
> storebuffers tend to be used only in O3 execution.
>
> For what I need to do, my specific question is as follows:
>
> A) Would it be better/easier to implement a new Buffer (similar to the
> MessageBuffer class) from the Ruby Side
> or
> B) actually reuse M5's existing O3's LSQ buffer in the Timing CPU Model.
>
> I think that A) might be the easier method to go with, for the following reasons:
>
> 1) It seems that the Sequencer class already has functionality to
> support coalescing stores to the same cache line (from reading the
> previous storebuffer thread)
>
> 2) This would make the coalescing buffer CPU Model independent
>
> 3) It avoids having to change the Timing CPU Code, which may make it more
> likely to mess up how the CPU Model handles other memory-related
> things (ISA-dependent memory references, split data requests,
> prefetching, etc.).
>
> 4) It allows me to make it a Ruby-only change on the Ruby Code side of
> things as opposed to the M5 side of things.
>
> However, I hesitate over this approach because:
>
> 1) The way the Sequencer operates, it is the interface between the CPU
> Core and the Ruby Memory Model (converting M5 requests to Ruby
> Requests and whatnot), so 'logically' I guess it might make more
> sense to implement the store buffer before Ruby sees the store
> requests, and just have the sequencer do its thing with the
> coalescing?
>
> 2) The conclusion of the previous storebuffer thread was that work is
> currently?/will be done implementing the store buffer on the M5 side
> of things.
>
> If I go with Approach A), I know I would have to change
> which message buffer L1 uses to communicate with L2, such that instead of
> sending stores through the L2 Request Buffer, I would send them as
> follows:
>
> L1 -> Coalescing Buffer -> L2 Request Network Buffer -> L2
> instead of
> L1 -> L2 Request Network Buffer -> L2
>
> But I am not sure how exactly I would go about this if I want to add
> this coalescing buffer to sit between the CPU Core and L1 as well?
>
> Could those familiar with Ruby comment on my thoughts/offer suggestions?
>
> Thanks
>
> Malek
> _______________________________________________
> m5-dev mailing list
> m5-dev@m5sim.org
> http://m5sim.org/mailman/listinfo/m5-dev