Re: [gem5-dev] Review Request: Forward invalidations from Ruby to O3 CPU

Nilay Vaish Tue, 08 Nov 2011 18:12:09 -0800

On Wed, 2 Nov 2011, Nilay Vaish wrote:

On Fri, 28 Oct 2011, Beckmann, Brad wrote:
Let's move this conversation to just the email thread.
I suspect we may be talking past each other, so let's talk about the complete implementations, not just Ruby. There are multiple ways one can implement the store portion of x86-TSO. I'm not sure what the O3 model does, but here are a few possibilities:
- Do not issue any part of the store to the memory system when the instruction is executed. Instead, simply buffer it in the LSQ until the instruction retires, then buffer it in the store buffer after retirement. Only when the store reaches the head of the store buffer, issue it to Ruby. The next store is not issued to Ruby until the previous head store completes, maintaining correct store ordering.
- Do not issue any part of the store to the memory system when the instruction is executed. Instead, simply buffer it in the LSQ until the instruction retires. Once it retires and enters the store buffer, we issue the address request to Ruby (no L1 data update). Ruby forwards probes/replacements to the store buffer, and if the store buffer sees a probe/replacement to an address whose address request has already completed, the store buffer reissues the request. Once the store reaches the head of the store buffer, double check with Ruby that write permissions still exist in the L1.
- Issue the store address (no L1 data update) to Ruby when the instruction is executed. When it retires, it enters the store buffer. Ruby forwards probes/replacements to the LSQ + store buffer, and if either sees a probe/replacement to an address whose address request has already completed, the request reissues (several policies exist on when to reissue the request). Once the store reaches the head of the store buffer, double check with Ruby that write permissions still exist in the L1.
Do those scenarios make sense to you? I believe we can implement any one of them without modifying Ruby's core functionality. If you are envisioning or if O3 implements something completely different, please let me know.
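
To make the first possibility concrete, here is a minimal sketch of a store buffer that only ever issues its head entry. It is a toy model, not gem5 or Ruby code: the class HeadOnlyStoreBuffer and the methods retire, try_issue and complete are hypothetical names chosen for illustration, and the memory system is reduced to an explicit completion call.

```python
from collections import deque


class HeadOnlyStoreBuffer:
    """Toy model (hypothetical, not gem5 code): only the head store is issued."""

    def __init__(self):
        self.pending = deque()  # retired stores, kept in program order
        self.in_flight = None   # the single store currently issued to memory
        self.visible = []       # order in which stores became globally visible

    def retire(self, seq, addr):
        # A store enters the buffer only after its instruction retires.
        self.pending.append((seq, addr))

    def try_issue(self):
        # Issue the head only if no earlier store is still outstanding.
        if self.in_flight is None and self.pending:
            self.in_flight = self.pending.popleft()
            return self.in_flight
        return None

    def complete(self):
        # Stand-in for the memory system reporting that the store is done.
        assert self.in_flight is not None
        self.visible.append(self.in_flight[0])
        self.in_flight = None


sb = HeadOnlyStoreBuffer()
for seq, addr in [(1, 0x40), (2, 0x80), (3, 0x40)]:
    sb.retire(seq, addr)

issued = sb.try_issue()
blocked = sb.try_issue()  # None: store 1 has not completed yet
while issued is not None:
    sb.complete()
    issued = sb.try_issue()
```

Because at most one store is outstanding at any time, the visibility order in this model cannot differ from the retirement order, at the cost of serializing all store latencies.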
1. What's the current memory model that the O3 CPU implements? Do we want multiple memory models to co-exist? We might want to have both SC and TSO, though Alpha had a weaker model.
2. I think we should try to stick to what the O3 CPU implements currently, meaning we should not change the stage at which the store is issued to the cache. I am more concerned about how multiple ports get handled.

Looking at the trace generated by the toy application I use for testing the O3 CPU and Ruby combination, I have been able to confirm my suspicion that stores can become visible to the rest of the system in an order different from the program order.
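
A check of this kind can be sketched as follows. The trace format here, a list of (seq, done_tick) pairs for the stores of a single CPU, is an assumption made for illustration and is not the format gem5 emits; find_reordered_stores is a hypothetical helper, and the sample numbers are made up.

```python
def find_reordered_stores(trace):
    """Return (earlier_seq, later_seq) pairs where the later store completed first.

    `trace` is a list of (seq, done_tick) tuples for the stores of one CPU,
    where seq is the program-order sequence number. This format is assumed
    for illustration only.
    """
    violations = []
    by_program_order = sorted(trace)
    for i, (seq_a, done_a) in enumerate(by_program_order):
        for seq_b, done_b in by_program_order[i + 1:]:
            # A younger store that completed earlier became visible out of order.
            if done_b < done_a:
                violations.append((seq_a, seq_b))
    return violations


# Made-up sample: store 11 completes before the older store 10.
trace = [(10, 500), (11, 420), (12, 610)]
reordered = find_reordered_stores(trace)
```

An empty result for every CPU would be consistent with stores becoming visible in program order; any pair in the result is a candidate TSO violation worth inspecting by hand.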

It might be that the classic memory system does not allow stores to go out of order. Or that the initial implementation of the O3 CPU was for a weaker memory model like that of the Alpha architecture (Prof. Hill suggested that this might be the case).

Overall I am still not clear on how to make O3 and Ruby work together correctly for SC or TSO in the case where multiple stores can be issued to the memory system in parallel.
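
For reference, the second possibility from Brad's list (address requests issued in parallel, data written only from the head) can be sketched as below. This is a toy model under stated assumptions, not gem5 or Ruby code: ProbedStoreBuffer, permission_granted, probe and drain_head are hypothetical names, and write permission is tracked as a simple flag per entry.

```python
class ProbedStoreBuffer:
    """Toy model (hypothetical): parallel permission requests, in-order drain."""

    def __init__(self):
        self.entries = []  # retired stores in program order
        self.visible = []  # order in which store data became visible

    def retire(self, seq, addr):
        # The address request is assumed to be issued here; permission arrives later.
        self.entries.append({"seq": seq, "addr": addr, "has_perm": False})

    def permission_granted(self, seq):
        # Stand-in for the address request completing, in any order.
        for entry in self.entries:
            if entry["seq"] == seq:
                entry["has_perm"] = True

    def probe(self, addr):
        # A forwarded probe/replacement revokes permission already obtained;
        # the returned sequence numbers are the requests to reissue.
        lost = []
        for entry in self.entries:
            if entry["addr"] == addr and entry["has_perm"]:
                entry["has_perm"] = False
                lost.append(entry["seq"])
        return lost

    def drain_head(self):
        # Data is written only from the head, and only if permission still holds.
        if self.entries and self.entries[0]["has_perm"]:
            self.visible.append(self.entries.pop(0)["seq"])
            return True
        return False


sb = ProbedStoreBuffer()
sb.retire(1, 0x40)
sb.retire(2, 0x80)
sb.permission_granted(2)            # the younger store's request completes first
stalled = not sb.drain_head()       # head (store 1) has no permission yet
sb.permission_granted(1)
to_reissue = sb.probe(0x80)         # store 2 loses its permission
sb.drain_head()                     # store 1 becomes visible
second_blocked = not sb.drain_head()
sb.permission_granted(2)            # reissued request completes
sb.drain_head()
```

In this model the requests overlap, but data still becomes visible strictly in program order because only the head may write, and it re-checks permission before doing so.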


--
Nilay

_______________________________________________
gem5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/gem5-dev
