Hi,

Samuel Thibault wrote:
>> After I read this introduction, I checked some atomic operations
>> implementation such as atomic_add_return in Linux for Alpha processors.
>> Before and after these operations change the variable, they put memory
>> barriers. So it's something like this:
>> smp_mb();
>> operations;
>> smp_mb();
>>
>> But the problem is: on SMP, a variable's value can always be changed by
>> another processor after the first memory barrier is called. Thus, the CPU
>> does an operation on the stale value. There seems to be no way that we can
>> guarantee that the variable has the latest value at the moment the CPU is
>> doing the operation on it.
>
> Yes, that's why the operations have a loop that keeps retrying until the
> result is as expected.

I didn't notice there was a loop in the implementation of atomic operations.
It seems most architectures provide a special load instruction that sets a
flag/register, and a conditional store instruction. I found some information
about these instructions, but there is still something I'm not sure of. It
seems to me that it should work as follows:

If processor A loads a variable from memory into a register with the special
load instruction, it should monitor *all* cache lines for that address
(including the ones on other processors). If any other processor changes the
variable in its own cache, processor A has to clear its flag. After processor A
changes the value and tries to store it back to memory, the conditional store
instruction might fail. If the store fails, processor A has to *invalidate* its
own cache line so that it can load the latest value in the next iteration.

Is the process above correct?

Zheng Da
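
P.S. To make the question concrete, here is a minimal sketch of the retry loop
as I picture it. It uses C11 atomics rather than the real Alpha instructions,
so it is only an illustration and not the kernel's code. The function name
my_atomic_add_return is made up for this example. As far as I understand,
on architectures with a special load / conditional store pair the compiler can
turn the weak compare-and-exchange into exactly that pair, which is why it is
allowed to fail and has to sit in a loop.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical helper for illustration only; not the Linux implementation. */
int my_atomic_add_return(atomic_int *v, int i)
{
    assert(v != NULL);

    /* Plain load of the current value; it may already be stale
     * by the time we try to store the result. */
    int old = atomic_load_explicit(v, memory_order_relaxed);

    /* Try to store old + i. The store only succeeds if *v still holds
     * 'old'. On failure, 'old' is refreshed with the value currently
     * in *v and we retry with that newer value. */
    while (!atomic_compare_exchange_weak_explicit(v, &old, old + i,
                                                  memory_order_seq_cst,
                                                  memory_order_relaxed)) {
        /* nothing to do here: 'old' was updated by the failed attempt */
    }

    /* 'old' is the value that was actually replaced. */
    return old + i;
}
```

With a counter that starts at 5, calling my_atomic_add_return(&counter, 3)
should return 8 and leave 8 in the counter, no matter how many times the loop
had to retry.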