On 2025-10-22 03:58, Pali Rohár wrote:
On Thursday 16 October 2025 20:35:00 LIU Hao wrote:
The last one is unnecessary. Initializing a lock doesn't require an atomic operation; only passing it to other threads does. And even when it's necessary to use an atomic operation, `volatile` is not sufficient for ARM64; it has to be done with `__atomic_store_n`, which compiles to an STLR instruction.

I really was not sure about this one. I was thinking about it... I do not quite understand why unlock and init behave differently. Both set the spin lock to the unlocked state. The init function does not use any synchronization or barrier, but the unlock function uses a barrier with release semantics.
After a lock is initialized, it may be passed to other threads

* via some shared data structure, with proper locking, or
* directly or indirectly, as the user-defined argument to `pthread_create()`.

Either operation guarantees that the lock is properly synchronized between these threads, and without such an operation the lock is not accessible to other threads. When the creator thread itself accesses the lock, no synchronization is required.
That is, the lock can be initialized as an ordinary datum.
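As a rough illustration of that argument, here is a minimal sketch that initializes a lock with plain stores and only then hands it to other threads through `pthread_create()`. The names `demo_lock`, `demo_shared`, `demo_init`, `demo_acquire`, `demo_release`, `demo_worker` and `demo_run` are hypothetical and are not the winpthreads implementation; the GCC/Clang `__atomic` builtins are assumed to be available.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical lock type for illustration only: 0 = unlocked, 1 = locked. */
typedef struct { int locked; } demo_lock;

typedef struct {
    demo_lock lock;
    long counter;
} demo_shared;

/* Plain stores are enough here: no other thread can reach *s yet. */
static void demo_init(demo_shared *s)
{
    s->lock.locked = 0;
    s->counter = 0;
}

/* Acquire: spin until the exchange observes the unlocked state. */
static void demo_acquire(demo_lock *l)
{
    while (__atomic_exchange_n(&l->locked, 1, __ATOMIC_ACQUIRE) != 0)
        ; /* spin */
}

/* Release: publish the writes made inside the critical section. */
static void demo_release(demo_lock *l)
{
    __atomic_store_n(&l->locked, 0, __ATOMIC_RELEASE);
}

static void *demo_worker(void *arg)
{
    demo_shared *s = arg;
    for (int i = 0; i < 10000; i++) {
        demo_acquire(&s->lock);
        s->counter++;
        demo_release(&s->lock);
    }
    return NULL;
}

/* Starts 4 workers on one shared object and returns the final counter. */
long demo_run(void)
{
    demo_shared s;
    pthread_t t[4];

    demo_init(&s); /* ordinary initialization, before any thread exists */
    for (int i = 0; i < 4; i++) {
        /* pthread_create() is the synchronization point that publishes s. */
        int rc = pthread_create(&t[i], NULL, demo_worker, &s);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < 4; i++)
        pthread_join(t[i], NULL);
    return s.counter;
}
```

Calling `demo_run()` from a `main()` should return 40000 if the locking is correct; the point of the sketch is that `demo_init()` needs no atomic operation because the workers are created only afterwards.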
Now that I think more about it: is it really required for init and unlock to use a barrier when the counterpart function (the lock one) always uses a full memory barrier? (InterlockedExchangePointer uses a full memory barrier, right?)
The lock operation should probably be an acquire barrier instead of a full barrier, but on x86 an atomic read-modify-write operation is always a full barrier.
On ARM64 the memory order applies to the write part of the atomic operation: https://gcc.godbolt.org/z/GWev5vWqh
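For illustration, the acquire/release pairing discussed here could be sketched with the GCC/Clang `__atomic` builtins as follows. The type `sketch_spin` and the three functions are hypothetical names made up for this example, not the code under review, and the sketch assumes a plain `int` state word instead of the pointer used with `InterlockedExchangePointer`.

```c
#include <stdbool.h>

/* Hypothetical spin lock for illustration: 0 = unlocked, 1 = locked. */
typedef struct { int state; } sketch_spin;

/* One attempt: an atomic read-modify-write with acquire ordering.
 * Returns true if this call took the lock. */
static bool sketch_trylock(sketch_spin *s)
{
    return __atomic_exchange_n(&s->state, 1, __ATOMIC_ACQUIRE) == 0;
}

/* Spin until the lock is taken. */
static void sketch_lock(sketch_spin *s)
{
    while (!sketch_trylock(s))
        ; /* spin */
}

/* A store with release ordering; no read-modify-write is needed. */
static void sketch_unlock(sketch_spin *s)
{
    __atomic_store_n(&s->state, 0, __ATOMIC_RELEASE);
}
```

The memory order is stated explicitly at each call site, so the compiler can pick whatever the target needs instead of relying on x86 behavior.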
AFAIK volatile just ensures that the compiler does not reorder the emitted instructions as part of some compiler optimization. But volatile does not ensure any synchronization or barrier at the HW level. x86 has strong ordering, where I think only a store followed by a load can be reordered without an explicit barrier. So my understanding is that volatile on x86 has the side effect of a barrier (which does not apply to ARM).
This looks mostly correct. `volatile` means the operation has an effect that is unknown to the compiler, as if it was accessing global memory; and that's why it could be abused for synchronization. I think we had better not abuse `volatile` for this purpose.
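As a hedged sketch of the alternative to `volatile`, the following passes one value between two threads using an explicit release store and acquire load through the GCC/Clang `__atomic` builtins. The names `mp_payload`, `mp_ready`, `mp_producer` and `mp_run` are invented for this example.

```c
#include <pthread.h>
#include <stddef.h>

static int mp_payload; /* ordinary data, written before the flag */
static int mp_ready;   /* flag, accessed only through __atomic builtins */

static void *mp_producer(void *arg)
{
    (void)arg;
    mp_payload = 42;                                  /* plain store */
    __atomic_store_n(&mp_ready, 1, __ATOMIC_RELEASE); /* publish it */
    return NULL;
}

/* Returns the payload seen by the consumer, or -1 if thread creation failed. */
int mp_run(void)
{
    pthread_t t;
    int seen;

    mp_payload = 0;
    __atomic_store_n(&mp_ready, 0, __ATOMIC_RELAXED);
    if (pthread_create(&t, NULL, mp_producer, NULL) != 0)
        return -1;

    /* The acquire load pairs with the release store in the producer. */
    while (__atomic_load_n(&mp_ready, __ATOMIC_ACQUIRE) == 0)
        ; /* spin */
    seen = mp_payload; /* ordered after the flag was observed as set */

    pthread_join(t, NULL);
    return seen;
}
```

With a `volatile` flag instead, the compiler would keep the accesses, but nothing would tell a weakly ordered CPU to keep the payload store ahead of the flag store.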
But isn't there some possibility that the compiler could reorder something?
Initialization (of fields of a struct, for example) can happen in any order, or even be combined into SIMD operations. It doesn't matter.
What makes me even more suspicious: why do the lock and unlock functions mark the memory that tk points to as volatile, while the trylock and init functions do not? I would expect at least the trylock and lock functions to declare the variable the same way, since it is passed to InterlockedExchangePointer() as is.
Those need not be `volatile`.

-- 
Best regards,
LIU Hao
_______________________________________________
Mingw-w64-public mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/mingw-w64-public
