Hi,

On 2025-11-20 19:03:47 -0500, Andres Freund wrote:
> > MSVC's _InterlockedCompareExchange() intrinsic on ARM64 performs the
> > atomic operation but does NOT emit the necessary Data Memory Barrier
> > (DMB) instructions [4][5].
>
> I couldn't reproduce this result when playing around on godbolt. By specifying
> /arch:armv9.4 msvc can be convinced to emit the code for the intrinsics inline
> (at least for most of them). And that makes it visible that
> _InterlockedCompareExchange() results in a "casal" instruction. Looking that
> up shows:
>
> https://developer.arm.com/documentation/dui0801/l/A64-Data-Transfer-Instructions/CASA--CASAL--CAS--CASL--CASAL--CAS--CASL--A64-
> which includes these two statements:
> "CASA and CASAL load from memory with acquire semantics."
> "CASL and CASAL store to memory with release semantics."

Further evidence for that is that
https://learn.microsoft.com/en-us/windows/win32/api/winnt/nf-winnt-interlockedcompareexchange
states:
"This function generates a full memory barrier (or fence) to ensure that memory operations are completed in order."

(note that we are using the function, not the intrinsic for TAS())

Greetings,

Andres
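
For illustration only, here is a minimal sketch of a TAS-style lock whose compare-and-swap carries both acquire and release semantics, which is the combination the quoted Arm text attributes to casal. It is written with portable C11 atomics rather than the MSVC function or intrinsic. The names demo_lock, demo_lock_init, demo_tas and demo_unlock are made up for this example and are not PostgreSQL's s_lock.h code. Unlike PostgreSQL's TAS(), demo_tas returns true on success.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock type for illustration: 0 = free, 1 = held. */
typedef struct
{
    atomic_int value;
} demo_lock;

static void
demo_lock_init(demo_lock *lock)
{
    atomic_init(&lock->value, 0);
}

/*
 * Try to take the lock once; returns true if it was acquired.
 *
 * On success the CAS is ordered as acquire + release, so accesses in the
 * critical section cannot be reordered before the lock is taken. On
 * failure nothing was acquired, so relaxed ordering is sufficient.
 */
static bool
demo_tas(demo_lock *lock)
{
    int expected = 0;

    return atomic_compare_exchange_strong_explicit(&lock->value,
                                                   &expected, 1,
                                                   memory_order_acq_rel,
                                                   memory_order_relaxed);
}

/* Release the lock; earlier writes become visible to the next acquirer. */
static void
demo_unlock(demo_lock *lock)
{
    atomic_store_explicit(&lock->value, 0, memory_order_release);
}
```

Whether a given compiler turns the acq_rel CAS into casal depends on the target flags. The sketch only shows the ordering being asked for, not the instruction that ends up being emitted.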
