On Thu, 27 Oct 2011, Beckmann, Brad wrote:

Hi Nilay,

I apologize it has taken me a few days to respond. I need to read my gem5-dev email more often.

First off, I just want to be clear that we are only discussing lock-prefixed RMW instructions, correct? Non-locked RMWs are not an issue.

Right, Ruby does not lock the address in the case of a non-locked RMW.


In my opinion, the absolute best source for understanding the x86 memory model is Sewell et al. (http://doi.acm.org/10.1145/1785414.1785443). In the paper, they explain when processors can logically execute lock-prefixed instructions in a very clear and intuitive way. As Steve said, lock-prefixed instructions act as fences, but they also immediately retire to the memory system to maintain global ordering. Thus a lock-prefixed instruction cannot logically complete until all prior loads and stores from that processor have been retired to the memory system. In other words, the load and store buffers must be empty. Furthermore, the result of the lock-prefixed instruction must become globally visible as soon as the instruction retires. In other words, the store buffer cannot hold on to the store value after the core retires the instruction.

I am assuming that if an instruction is marked as a memory barrier, the O3 CPU will drain the load and store buffers before and after the instruction.
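
To make the behaviour described above concrete, here is a minimal toy model of a core with a FIFO store buffer. It is only a sketch: ToyCore and all of its methods are hypothetical names, it is not gem5 or Ruby code, the load buffer and true multi-core atomicity are not modelled, and it simply shows a locked RMW draining the store buffer first and then writing memory directly.

```python
from collections import deque


class ToyCore:
    """One core with a FIFO store buffer in front of a shared memory dict.

    Hypothetical toy model, not gem5 code.
    """

    def __init__(self, memory):
        self.memory = memory          # shared between cores
        self.store_buffer = deque()   # (addr, value) pairs, oldest first

    def store(self, addr, value):
        # Ordinary stores retire into the store buffer, not into memory.
        self.store_buffer.append((addr, value))

    def load(self, addr):
        # Ordinary loads forward from the youngest matching buffered store.
        for buffered_addr, value in reversed(self.store_buffer):
            if buffered_addr == addr:
                return value
        return self.memory.get(addr, 0)

    def drain(self):
        # Write buffered stores to memory in program order.
        while self.store_buffer:
            addr, value = self.store_buffer.popleft()
            self.memory[addr] = value

    def locked_rmw(self, addr, update):
        # 1. All prior stores must be globally visible first.
        self.drain()
        # 2. Read and write memory directly, so the new value never
        #    sits in the store buffer.
        old = self.memory.get(addr, 0)
        self.memory[addr] = update(old)
        return old


memory = {}
core0, core1 = ToyCore(memory), ToyCore(memory)

core0.store("x", 1)
seen_before = core1.load("x")   # core0's store is still buffered
old = core0.locked_rmw("lock", lambda v: v + 1)
seen_after = core1.load("x")    # the locked RMW drained core0's buffer
```

In this sketch core1 cannot observe core0's store to "x" until the locked RMW forces the drain, which is the ordering property the locked instruction has to provide.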


I think the main question here is how does the O3 ld/st queue respond to the serialize before, serialize after, and fence flags? Essentially, we need to use the combination of flags that flushes the ld and st buffers before logically executing the load portion of the locked RMW, as well as bypasses the store buffer when executing the store portion of the locked RMW. There are certainly optimizations that can be implemented to maintain that logical behavior, while allowing the hardware to do more parallel execution. However, I would suggest not trying to implement those before getting the core functionality to work using existing mechanisms.

I am in agreement with you.
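
As a rough illustration of the question above, the following sketch shows one way the three kinds of flags could gate issue in a simplified instruction window. Everything here is an assumption made for illustration: ToyInst, can_issue and the field names are made up and are not gem5's actual flag identifiers or O3 logic, and "completed" stands in for both commit and drained buffers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyInst:
    """Hypothetical instruction record, not a gem5 StaticInst."""
    name: str
    serialize_before: bool = False  # wait until all older insts are done
    serialize_after: bool = False   # younger insts wait until this is done
    non_speculative: bool = False   # execute only once it is the oldest inst


def can_issue(window, index, completed):
    """Return True if window[index] may issue.

    window: instructions in program order.
    completed: set of indices that have fully finished.
    """
    inst = window[index]
    all_older_done = all(i in completed for i in range(index))
    if (inst.serialize_before or inst.non_speculative) and not all_older_done:
        return False
    # An older, unfinished serialize-after instruction blocks younger ones.
    for i in range(index):
        if window[i].serialize_after and i not in completed:
            return False
    return True


window = [
    ToyInst("st"),
    ToyInst("lock_rmw", serialize_before=True, serialize_after=True,
            non_speculative=True),
    ToyInst("ld"),
]
```

With this window, the locked RMW cannot issue until the older store has completed, and the younger load cannot issue until the locked RMW has completed. Whether this combination is the right one for the real O3 CPU is exactly the open question in this thread.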


On a related note, have you thought about how you're going to propagate Ruby probes back to the O3 load buffer? Assuming a snooping load queue, that is one core mechanism that we need to implement to support X86+O3+Ruby. It might be useful for us to discuss different possible interface implementations before you spend too much time writing code.

Brad



I have a patch for this available on review board. This is the link --
http://reviews.gem5.org/r/894/

--
Nilay



-----Original Message-----
From: [email protected] [mailto:gem5-dev-
[email protected]] On Behalf Of Steve Reinhardt
Sent: Thursday, October 27, 2011 10:09 AM
To: gem5 Developer List
Subject: Re: [gem5-dev] Locked RMW in Ruby

Hi Nilay,

I think a memory barrier may not be sufficient... we need to make sure it's
non-speculative as well as ordered (unless we do something more
complicated to deal with a speculative locked read that isn't followed by a
write because it got squashed).
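
A toy sketch of the squash problem described above, assuming a memory system that locks an address on the locked read and unlocks it on the matching write. ToyLockTable and run are hypothetical names for illustration, not Ruby's interface.

```python
class ToyLockTable:
    """Tracks addresses currently held by a locked read (toy model)."""

    def __init__(self):
        self.locked = set()

    def locked_read(self, addr):
        self.locked.add(addr)

    def locked_write(self, addr):
        self.locked.discard(addr)


def run(table, addr, squashed, speculative):
    """Issue a locked read/write pair; return True if the pair completed."""
    if squashed and not speculative:
        # A non-speculative instruction is never sent on a wrong path.
        return False
    table.locked_read(addr)
    if squashed:
        # The wrong-path read was sent, but its write never follows.
        return False
    table.locked_write(addr)
    return True


spec_table = ToyLockTable()
run(spec_table, "a", squashed=True, speculative=True)
stale_lock = "a" in spec_table.locked      # lock left behind by the squash

nonspec_table = ToyLockTable()
run(nonspec_table, "a", squashed=True, speculative=False)
clean = not nonspec_table.locked           # nothing was ever locked
```

The speculative case leaves the address locked with no write coming to release it, which is why ordering alone does not cover this case.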

Gabe is a better reference (the only reference?) for the details of the x86
decoder.

Steve

On Thu, Oct 27, 2011 at 8:32 AM, Nilay Vaish <[email protected]> wrote:

I am thinking of marking all the locked instructions with IsMemBarrier.
Where do you think this flag should appear: in locked_opcodes.isa, or
in semaphores.py? I tried adding IsMemBarrier to the instructions in
locked_opcodes.isa, but that does not work. I also tried changing the
instruction format to BasicOperate, but that does not work either.

--
Nilay


On Wed, 26 Oct 2011, Gabe Black wrote:

 I think you guys are on the right track. There's a non-speculative flag,
a serialize before, and a serialize after. I'm not sure which one is
exactly right, but some combination should be. We should be careful
not to overdo it since that might artificially hurt performance, but
I don't *think* the lock prefix is used all that much these days so
it shouldn't have *too* bad an impact if it isn't perfectly correct.

Gabe

_______________________________________________
gem5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/gem5-dev
