I am thinking of marking all the locked instructions with IsMemBarrier. Where do you think this flag should appear: in locked_opcodes.isa, or in semaphores.py? I tried adding IsMemBarrier to the instructions in locked_opcodes.isa, but that does not work. I also tried changing the instruction format to BasicOperate, but that does not work either.

--
Nilay

On Wed, 26 Oct 2011, Gabe Black wrote:

I think you guys are on the right track. There's a non-speculative flag,
a serialize before, and a serialize after. I'm not sure which one is
exactly right, but some combination should be. We should be careful not
to overdo it since that might artificially hurt performance, but I
don't *think* the lock prefix is used all that much these days so it
shouldn't have *too* bad an impact if it isn't perfectly correct.

Gabe

On 10/26/11 09:56, Nilay Vaish wrote:
On Tue, 25 Oct 2011, Steve Reinhardt wrote:

Good questions.  Clearly if we ever let the R part of an RMW instruction
out to the cache, either we have to commit the instruction or add some
mechanism to unlock the block.  One solution would be to mark all RMW
instructions as serializing, which would prevent them from executing
speculatively.  That (or something like it) might be necessary to get
the consistency model right anyway, since I believe locked accesses act
as fences (?? is that right, Brad?).
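
The hazard described above can be illustrated with a toy model. This is
only a sketch under stated assumptions: ToyCacheBlock, load_locked,
store_unlock and run_rmw are hypothetical names made up for
illustration, not gem5 or Ruby code. It shows that if the read half of
an RMW locks a block and the instruction is then squashed without the
cache being told, the block stays locked.

```python
class ToyCacheBlock:
    """Hypothetical stand-in for a cache block that a locked read can pin."""

    def __init__(self):
        self.value = 0
        self.locked = False

    def load_locked(self):
        # Read half of the RMW: take the lock and return the current value.
        if self.locked:
            raise RuntimeError("block is already locked")
        self.locked = True
        return self.value

    def store_unlock(self, value):
        # Write half of the RMW: write the new value and release the lock.
        if not self.locked:
            raise RuntimeError("block is not locked")
        self.value = value
        self.locked = False


def run_rmw(block, squash_after_read):
    """Increment the block; optionally drop the write half, as a squash would."""
    old = block.load_locked()
    if squash_after_read:
        # The CPU discards the instruction; nothing tells the cache to unlock.
        return
    block.store_unlock(old + 1)


speculative = ToyCacheBlock()
run_rmw(speculative, squash_after_read=True)

committed = ToyCacheBlock()
run_rmw(committed, squash_after_read=False)

print("squashed RMW leaves block locked:", speculative.locked)
print("committed RMW leaves block locked:", committed.locked)
```

In this model the only two ways out are the ones named above: never
issue load_locked for an instruction that can still be squashed, or add
an explicit unlock path for the squash case.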

Gabe, did you have an alternate solution in mind?

Steve

On Tue, Oct 25, 2011 at 2:15 PM, Nilay Vaish <[email protected]> wrote:

Does this mean that an x86 O3 CPU will never squash an RMW instruction?
I am posting an instruction + protocol trace obtained from O3 and Ruby.
In the first portion, you can see that the O3 CPU issues a locked RMW
with the read part having sn = 3051 and the write part having sn = 3052.
In the second portion, you can see that 3051 and 3052 are squashed, and
in the third portion of the trace, these are committed. There are
several things that I am not able to understand. Why is the RMW
squashed, since the x86 architecture has to commit the instruction?
Secondly, if the RMW was being executed speculatively, then what
mechanism exists for informing the cache controller about the
instruction getting squashed? Thirdly, why was the instruction committed
later on, when it was originally squashed?


When I mark ldstl and stul as non-speculative, the O3 CPU and Ruby
work on an example program in which two threads are incrementing a
counter. Since a locked RMW is a fence instruction (Steve suggested this
above and AMD's manual agrees), it seems that the read portion should
commit any of the loads and stores that appear before it in program
order. This means that ldstl should be marked as a memory barrier, and
similarly stul should also be marked as a memory barrier. But looking
at src/arch/x86/isa/microops/ldstop.isa, it does not seem that these
flags can currently be supported. If others (especially Steve and Gabe)
concur with my understanding, I can modify the file to add the memory
barrier flag.
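
For reference, the kind of two-thread counter test mentioned above can
be sketched as follows. This is not the program that was run on gem5; it
is a rough Python analogue in which a threading.Lock stands in for the
atomicity that the x86 lock prefix provides, and run_counter is a
hypothetical helper name. If every increment is an atomic
read-modify-write, the final count must equal threads * iterations.

```python
import threading


def run_counter(num_threads=2, iterations=10000):
    """Have several threads increment one shared counter atomically."""
    state = {"count": 0}
    lock = threading.Lock()  # stands in for the atomicity of a locked RMW

    def worker():
        for _ in range(iterations):
            with lock:
                # Read, modify and write happen as one indivisible step.
                state["count"] += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["count"]


final_count = run_counter()
print("final count:", final_count)
```

A lost update (a final count below the expected total) in the simulated
equivalent would point at the RMW not being handled atomically.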

--
Nilay
_______________________________________________
gem5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/gem5-dev
