On Tue, 25 Oct 2011, Steve Reinhardt wrote:
Good questions. Clearly if we ever let the R part of an RMW instruction out
to the cache, either we have to commit the instruction or add some mechanism
to unlock the block. One solution would be to mark all RMW instructions as
serializing, which would prevent them from executing speculatively. That
(or something like it) might be necessary to get the consistency model right
anyway, since I believe locked accesses act as fences (?? is that right,
Brad?).
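
To make the hazard concrete, here is a minimal sketch in Python. It is a toy
model, not gem5 or Ruby code: ToyCacheBlock, execute_rmw, and the owner
strings are hypothetical names introduced only for illustration. It assumes
the read half of an RMW locks the block and only the paired write unlocks it.

```python
class ToyCacheBlock:
    """Hypothetical cache block that a locked read pins until the paired write."""

    def __init__(self):
        self.lock_owner = None  # who currently holds the block lock, if anyone

    def locked_read(self, owner):
        """Read half of an RMW: take the block lock, or fail if it is held."""
        if self.lock_owner is not None:
            return False
        self.lock_owner = owner
        return True

    def unlocking_write(self, owner):
        """Write half of an RMW: release the lock taken by the same owner."""
        if self.lock_owner != owner:
            raise RuntimeError("unlocking write without matching locked read")
        self.lock_owner = None


def execute_rmw(block, owner, squashed, speculative):
    """Run one RMW; return True if its locked read reached the cache."""
    if squashed and not speculative:
        # A non-speculative RMW waits until it can no longer be squashed,
        # so a squashed instance never sends its read to the cache.
        return False
    if not block.locked_read(owner):
        return False  # block already locked by someone else
    if squashed:
        # Speculative and squashed after the read: the write is never sent,
        # and nothing in this model unlocks the block.
        return True
    block.unlocking_write(owner)
    return True


block = ToyCacheBlock()
execute_rmw(block, "cpu0", squashed=True, speculative=True)
print(block.lock_owner)  # still held by cpu0 after the squash
```

In this model a squashed speculative RMW leaves the block locked, so a later
RMW from another CPU can never acquire it. Holding the RMW back until it is
non-speculative avoids the problem without any extra unlock message.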
Gabe, did you have an alternate solution in mind?
Steve
On Tue, Oct 25, 2011 at 2:15 PM, Nilay Vaish <[email protected]> wrote:
Does this mean that an x86 O3 CPU will never squash an RMW instruction? I
am posting an instruction + protocol trace obtained from O3 and Ruby. In
the first portion, you can see that the O3 CPU issues a locked RMW with the
read part having sn = 3051 and the write part having sn = 3052. In the
second portion, you can see that 3051 and 3052 are squashed, and in the
third portion of the trace, they are committed. There are several things
that I am not able to understand. First, why is the RMW squashed, since the
x86 architecture has to commit the instruction? Secondly, if the RMW was
being executed speculatively, then what mechanism exists for informing the
cache controller about the instruction getting squashed? Thirdly, why was
the instruction committed later on, when it was originally squashed?
When I mark ldstl and stul as non-speculative, the O3 CPU and Ruby work on
an example program in which two threads increment a counter. Since a
locked RMW acts as a fence (Steve suggested this above and AMD's
manual agrees), it seems that the read portion should force all of the
loads and stores that appear before it in program order to commit first.
This means that ldstl should be marked as a memory barrier, and similarly
stul should also be marked as a memory barrier. But looking at
src/arch/x86/isa/microops/ldstop.isa, it does not seem that these
flags are currently supported. If others (especially Steve and Gabe)
concur with my understanding, I can modify the file to add the memory
barrier flag.
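
The ordering rule I have in mind can be sketched as follows. This is a toy
Python model, not the O3 implementation: ToyMemOp, is_barrier, and may_issue
are hypothetical names, and sequence number 3050 is an invented older store
placed ahead of the 3051/3052 pair from the trace.

```python
from dataclasses import dataclass


@dataclass
class ToyMemOp:
    sn: int                   # sequence number, i.e. program order
    is_barrier: bool = False  # hypothetical stand-in for a memory barrier flag
    completed: bool = False   # True once the access has committed


def may_issue(op, window):
    """Decide whether op may issue, given all in-flight memory ops."""
    older = [o for o in window if o.sn < op.sn]
    if op.is_barrier:
        # A barrier waits for every older load and store to complete.
        return all(o.completed for o in older)
    # A non-barrier op may not pass an older barrier that is still pending.
    return all(o.completed for o in older if o.is_barrier)


store = ToyMemOp(sn=3050)                    # invented older store
ldstl = ToyMemOp(sn=3051, is_barrier=True)   # read half of the RMW
stul = ToyMemOp(sn=3052, is_barrier=True)    # write half of the RMW
window = [store, ldstl, stul]
print(may_issue(ldstl, window))  # blocked while the older store is pending
```

Under these assumptions ldstl cannot issue until the older store completes,
and stul cannot issue until ldstl completes, which is the fence behaviour
described above.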
--
Nilay
_______________________________________________
gem5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/gem5-dev