Hello,

I am interested in implementing a store buffer (coalescing buffer) for Ruby's memory model in M5/GEM5 for use in my current research.
I want to coalesce speculative and non-speculative stores to the same cache line and then flush them to the cache at certain acquire/release constructs. I see that there used to be a directory called storebuffer, but it was removed not too long ago. From the associated thread on the mailing list, it seems it was removed because it is not in use (given that O3 is not yet functional with Ruby) and was never actually used in the original GEMS implementation either. Here is the link to that thread:

http://www.mail-archive.com/m5-dev@m5sim.org/msg10575.html

Reading further in that thread, I see that there is/was a general consensus that the Ruby store buffer will be merged with M5 O3's LSQ. For my research, the O3 CPU model is not a requirement, although store buffers are typically used only with out-of-order execution.

My specific question is as follows. Would it be better/easier to:

A) implement a new buffer (similar to the MessageBuffer class) on the Ruby side, or
B) reuse the existing LSQ from M5's O3 model in the Timing CPU model?

I think A) might be the easier way to go, for the following reasons:

1) It seems that the Sequencer class already has functionality to support coalescing stores to the same cache line (from reading the previous storebuffer thread).
2) It would make the coalescing buffer independent of the CPU model.
3) It avoids changing the Timing CPU code, which could easily break how the CPU model handles other memory-related things (ISA-dependent memory references, split data requests, prefetching, etc.).
4) It lets me keep this a Ruby-only change rather than touching the M5 side.
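To make concrete what I mean by coalescing, here is a rough Python sketch of the behavior I am after. It is only a toy model, not gem5 or Ruby code: the class name CoalescingStoreBuffer, the store() and flush() methods, and the 64-byte line size are all hypothetical names and values I picked for illustration, and the speculative/non-speculative distinction is left out.

```python
from collections import OrderedDict

LINE_SIZE = 64  # bytes per cache line (assumed value, for illustration only)


class CoalescingStoreBuffer:
    """Toy model: merges stores to the same cache line until flushed."""

    def __init__(self, line_size=LINE_SIZE):
        self.line_size = line_size
        # line address -> (line data, per-byte valid mask), in arrival order
        self.entries = OrderedDict()

    def store(self, addr, data):
        """Buffer a store byte by byte, so a store that crosses a line
        boundary naturally lands in two entries."""
        for i, byte in enumerate(data):
            byte_addr = addr + i
            line_addr = byte_addr - (byte_addr % self.line_size)
            offset = byte_addr - line_addr
            if line_addr not in self.entries:
                self.entries[line_addr] = (
                    bytearray(self.line_size),
                    [False] * self.line_size,
                )
            line_data, mask = self.entries[line_addr]
            line_data[offset] = byte  # a later store overwrites an earlier one
            mask[offset] = True

    def flush(self):
        """Drain every buffered line (e.g. at an acquire/release) and
        return one (line_addr, data, mask) write per line."""
        writes = [
            (line_addr, bytes(line_data), list(mask))
            for line_addr, (line_data, mask) in self.entries.items()
        ]
        self.entries.clear()
        return writes


# Three stores, two of them to the same 64-byte line.
buf = CoalescingStoreBuffer()
buf.store(0x1000, b"\x11\x22")
buf.store(0x1004, b"\x33")
buf.store(0x1040, b"\x44")
writes = buf.flush()  # two line writes: one for 0x1000, one for 0x1040
```

The per-byte mask is there so that the flush only needs to write the bytes that were actually stored. Whether something like this belongs in the Sequencer, in a new buffer next to the MessageBuffers, or on the CPU side is exactly what I am unsure about.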
However, I hesitate over this approach for two reasons:

1) The Sequencer is the interface between the CPU core and the Ruby memory model (converting M5 requests to Ruby requests and so on), so logically it might make more sense to implement the store buffer before Ruby sees the store requests, and just have the Sequencer do its thing with the coalescing?
2) The conclusion of the previous storebuffer thread was that work is being done, or will be done, to implement the store buffer on the M5 side.

If I go with approach A), I know I would have to change which message buffer L1 uses to communicate with L2, so that instead of sending stores directly through the L2 request buffer, I would send them as follows:

    L1 -> Coalescing Buffer -> L2 Request Network Buffer -> L2

instead of

    L1 -> L2 Request Network Buffer -> L2

But I am not sure how exactly I would go about this if I also want the coalescing buffer to sit between the CPU core and L1.

Could those familiar with Ruby comment on my thoughts or offer suggestions?

Thanks,
Malek