Hi,

Samuel Thibault wrote:
>> After I read this introduction, I checked some atomic operations 
>> implementation such as atomic_add_return in Linux for Alpha processors. 
>> Before and after these operations change the variable, they put memory 
>> barriers. So it's something like this:
>> smp_mb();
>> operations;
>> smp_mb();
>>
>> But the problem is: on SMP, a variable's value can always be changed by 
>> another processor after the first memory barrier is called. Thus, the CPU 
>> does an operation on the stale value. There seems to be no way that we can 
>> guarantee that the variable has the latest value at the moment the CPU is 
>> doing the operation on it.
> 
> Yes, that's why the operations have a loop that keeps retrying until the
> result is as expected.

I hadn't noticed the loop in the implementation of the atomic operations. It
seems most architectures provide a special load instruction that sets a
flag/register, and a conditional store instruction that checks it. I found some
information about these instructions, but there is still something I'm not sure
of. It seems to me that it should work as follows:

If processor A loads a variable from memory into a register with the special
load instruction, it should monitor *all* cache lines holding that address
(including the ones on other processors). If any other processor changes the
variable in its own cache, processor A has to clear its flag.
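
To make the flag behaviour concrete, here is a toy, single-threaded model of
what I mean. It is only a sketch of my understanding, not how any real
processor is built: the names toy_mem, toy_load_linked and
toy_store_conditional are made up, and the "monitoring" is reduced to one flag
per simulated CPU that any successful store clears.

```c
#include <stdbool.h>

/* Toy model only, NOT real hardware. All names here are hypothetical. */
#define NCPUS 2

struct toy_mem {
    int  value;             /* the shared variable */
    bool reserved[NCPUS];   /* per-CPU flag set by the special load */
};

/* Special load: return the current value and set this CPU's flag. */
static int toy_load_linked(struct toy_mem *m, int cpu)
{
    m->reserved[cpu] = true;
    return m->value;
}

/* Conditional store: fails if this CPU's flag has been cleared.
 * A successful store clears the flags of all CPUs, which models the
 * "other processors changed the variable" case for everyone else. */
static bool toy_store_conditional(struct toy_mem *m, int cpu, int v)
{
    if (!m->reserved[cpu])
        return false;
    m->value = v;
    for (int i = 0; i < NCPUS; i++)
        m->reserved[i] = false;
    return true;
}
```

In this model, if CPU 0 and CPU 1 both do the special load and CPU 1 stores
first, CPU 0's conditional store fails and it has to load again.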

After it changes the variable and tries to store the value to memory, the
conditional store instruction might fail. If the store fails, processor A has to
*invalidate* its own cache line so that it can load the latest value in the next
iteration.
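
The retry loop itself, as I understand it, can be sketched in portable C11
rather than in Alpha assembly. This is not the kernel's atomic_add_return();
sketch_add_return is a made-up name, and I am assuming that on machines with
the special load/conditional store pair the compiler would typically turn the
compare-exchange into those two instructions.

```c
#include <stdatomic.h>

/* Hypothetical helper, not the kernel implementation. */
static int sketch_add_return(atomic_int *v, int delta)
{
    int old = atomic_load(v);

    /* If the store fails, compare_exchange writes the value currently in
     * *v back into 'old', so the next iteration works on a fresh value.
     * The weak form may also fail spuriously, hence the loop. */
    while (!atomic_compare_exchange_weak(v, &old, old + delta))
        ;

    return old + delta;
}
```

For example, with an atomic_int holding 40, sketch_add_return(&v, 2) should
return 42 and leave 42 in v.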

Is the process above correct?

Zheng Da

