Hi Meredydd. I'd say this isn't a bug per se, but it is wrong.
Basically the support for locking memory operations is incomplete.
The way this is supposed to work is that a load with the LOCKED flag
set will lock a chunk of memory, and then a subsequent store with the
LOCKED flag set will unlock it. Every store with LOCKED set must be
preceded by a load with the same flag set; you can think of the load as
acquiring a mutex and the store as releasing it.
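To make the analogy concrete, here is a rough C++ sketch (purely
illustrative, not actual gem5 code, and the function names are made up)
of how a LOCKED load/store pair brackets a read-modify-write the way a
mutex acquire/release would; the ldstl/stul names are the x86 microops
Meredydd mentions below:

#include <cstdint>
#include <mutex>

std::mutex mem_lock;   // stands in for the locked chunk of memory

uint64_t locked_load(volatile uint64_t *addr) {
    mem_lock.lock();               // LOCKED load: acquire
    return *addr;
}

void locked_store(volatile uint64_t *addr, uint64_t val) {
    *addr = val;
    mem_lock.unlock();             // LOCKED store: release
}

// A LOCK; CMPXCHG-style read-modify-write then decomposes into the pair:
uint64_t locked_cmpxchg(volatile uint64_t *addr,
                        uint64_t expected, uint64_t desired) {
    uint64_t cur = locked_load(addr);                       // like ldstl
    locked_store(addr, cur == expected ? desired : cur);    // like stul
    return cur;
}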
In atomic mode, because gem5 is single threaded and atomic memory
accesses complete immediately, the only thing you need to do to keep
locked memory accesses from being interrupted is to make sure the CPU
keeps control until the locked section is complete.
To do that we just keep track of whether or not we've executed a
locked load and don't stop executing instructions until we see a
locked store. This is what you're seeing in the atomic mode CPU.
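For illustration only, that bookkeeping amounts to something like the
following sketch; the names here are hypothetical and do not match the
actual code in cpu/simple/atomic.cc:

struct AtomicCpuSketch {
    bool locked = false;   // a locked load seen, its matching store still pending

    // Called for each memory microop the model executes.
    void noteMemAccess(bool isLoad, bool lockedFlag) {
        if (lockedFlag)
            locked = isLoad;   // locked load opens the window, locked store closes it
    }

    // In atomic mode the CPU simply refuses to hand control back to the
    // event queue while a locked sequence is open, so nothing else can run
    // between the locked load and the locked store.
    void tick() {
        do {
            executeOneInstruction();
        } while (locked);
    }

    // Stub standing in for fetch/decode/execute; a real model would call
    // noteMemAccess() from here for every memory microop.
    void executeOneInstruction() {}
};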
In timing mode, which all the other CPUs (including the timing simple
CPU) use, something more complex is needed because memory accesses
take "time" and other things can happen while the CPU waits for a
response. In that case, the locking would have to actually happen in
the memory system and the various components (caches, memory, or
something else) would have to keep track of what areas of memory (if
any) are currently locked. This is the part that isn't yet implemented.
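Purely as a sketch of what that missing piece might look like (none of
this exists in gem5, and the names are made up), the memory system would
track locked regions and stall conflicting requests until the matching
locked store releases them:

#include <cstdint>
#include <map>

class LockedRangeTracker {
    std::map<uint64_t, uint64_t> lockedRanges;   // start address -> size

  public:
    // A LOCKED load registers the region it touches.
    void lock(uint64_t addr, uint64_t size) { lockedRanges[addr] = size; }

    // The matching LOCKED store releases it.
    void unlock(uint64_t addr) { lockedRanges.erase(addr); }

    // Any other request overlapping a locked region would have to be
    // stalled or retried until that region is released.
    bool conflicts(uint64_t addr, uint64_t size) const {
        for (const auto &r : lockedRanges) {
            if (addr < r.first + r.second && r.first < addr + size)
                return true;   // address ranges overlap
        }
        return false;
    }
};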
So in summary: yes, it is known not to work properly, but I wouldn't
call it a bug; I'd say it's just not finished yet.
Gabe
Quoting Meredydd Luff <[email protected]>:
It appears that the CAS (LOCK; CMPXCHGx) instruction doesn't do what
it says on the tin, at least using the O3 model and X86_SE. When I run
the following code (inside a container that runs this code once on
each of four processors):
volatile unsigned long *x;
[...]
for(a=0; a<1000; a++) {
    while(lastx = *x, oldx = cas(x, lastx, lastx+1), oldx != lastx);
}
...I get final x values of 1200 or so (rather than 4000, as would
happen if the compare-and-swap were atomic). This is using the
standard se.py, and a fresh checkout of the gem5 repository - my
command line is:
build/X86_SE/m5.opt configs/example/se.py -d --caches -n 4 -c /path/to/my/binary
Is this a known bug? Looking at the x86 microcode, it appears that the
relevant microops are ldstl and stul. Their only difference from what
appears to be their unlocked equivalents (ldst and st) is the addition
of the Request::LOCKED flag. A quick grep indicates that the LOCKED
flag is only accessed by the Request::isLocked() accessor function,
and that isLocked() is not referenced anywhere except twice in
cpu/simple/atomic.cc.
Unless I'm missing something, it appears that atomic memory accesses
are simply not implemented. Is this true?
Meredydd
PS - This is the CAS I'm using:
static inline unsigned long cas(volatile unsigned long* ptr, unsigned
long old, unsigned long _new)
{
unsigned long prev;
asm volatile("lock;"
"cmpxchgq %1, %2;"
: "=a"(prev)
: "q"(_new), "m"(*ptr), "0"(old)
: "memory");
return prev;
}
PPS - I searched around this issue, and the only relevant thing I
found was a mailing list post from last year indicating that ldstl
and stul were working for someone (no indication that they were using
O3, though): http://www.mail-archive.com/[email protected]/msg07297.html
This would indicate that at least one CPU model does support atomicity
- but even looking in atomic.cc, I can't immediately see why that
would work!
There is some code for handling a flag called
Request::MEM_SWAP_COND/isCondSwap(), but it appears to be generated
only by the SPARC ISA, and examined only by the simple timing and
atomic models.
_______________________________________________
gem5-users mailing list
[email protected]
http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users