================
@@ -55,11 +55,27 @@ __chkfeat(uint64_t __features) {
/* 7.5 Swap */
static __inline__ uint32_t __attribute__((__always_inline__, __nodebug__))
__swp(uint32_t __x, volatile uint32_t *__p) {
----------------
efriedma-quic wrote:
"relaxed" basically means we don't emit any barriers, so the compiler and CPU
can move memory operations to unrelated addresses across it.
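For illustration, a hedged sketch of what that lack of ordering permits (the variable names here are made up; only `__atomic_exchange_n` and `__ATOMIC_RELAXED` are real builtins):

```c
#include <stdint.h>

extern uint32_t shared_data;         /* hypothetical plain variable  */
extern volatile uint32_t swap_word;  /* hypothetical exchange target */

/* With __ATOMIC_RELAXED no barrier is emitted, so the compiler and CPU are
   free to move the store to shared_data before or after the exchange; a
   relaxed swap therefore can't be used on its own to publish shared_data. */
void relaxed_reordering_sketch(void) {
  shared_data = 42;
  (void)__atomic_exchange_n(&swap_word, 1u, __ATOMIC_RELAXED);
}
```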
In the "modern" case, it might make sense to also use the atomicrmw sequence;
it should lower to the same thing, and the compiler understands atomics better
than ldrex and strex (and the rules for when ldrex and strex are well-defined
are generally weird).
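As a rough sketch of that suggestion (not the patch's actual code), the atomics-based variant could look something like this; `__atomic_exchange_n` lowers to `atomicrmw xchg`, so the backend produces the exclusive-load/store loop itself:

```c
#include <stdint.h>

/* Sketch only, assuming a relaxed exchange is what __swp wants: a relaxed
   atomic exchange instead of hand-written ldrex/strex inline assembly.
   Clang emits atomicrmw xchg for this, which the backend selects to the
   ldrex/strex sequence on targets that have it. */
static __inline__ uint32_t __attribute__((__always_inline__, __nodebug__))
__swp(uint32_t __x, volatile uint32_t *__p) {
  return __atomic_exchange_n(__p, __x, __ATOMIC_RELAXED);
}
```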
The backend can't select `atomicrmw volatile xchg` to SWP on targets that don't
have lock-free atomics: if a target doesn't support cmpxchg for a given width,
we have to go through libatomic for all atomics of that width so the locking
works consistently.
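Whether the 32-bit exchange stays lock-free (and thus avoids the libatomic path) is visible at compile time; a hedged sketch using the standard predefined macro (the helper macro name is hypothetical):

```c
/* __GCC_ATOMIC_INT_LOCK_FREE == 2 means 32-bit atomics are always lock-free,
   so atomicrmw xchg won't become a libatomic call (__atomic_exchange_4) that
   may take a lock.  __SWP_HAS_LOCK_FREE_XCHG is a made-up name. */
#if defined(__GCC_ATOMIC_INT_LOCK_FREE) && __GCC_ATOMIC_INT_LOCK_FREE == 2
#define __SWP_HAS_LOCK_FREE_XCHG 1
#else
#define __SWP_HAS_LOCK_FREE_XCHG 0
#endif
```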
There are some targets that have lock-free atomics even though they don't have
ldrex/strex. In particular, on armv6 in Thumb mode, we can just switch to ARM
mode. And on all Linux targets, the kernel exposes stubs that implement atomic
ops. On those targets, it's probably better to use the __sync_ libcall, not
SWP, for the sake of forward-compatibility with systems that don't support SWP.
Performance should be reasonable; it's a libcall, but it doesn't involve any
locks or anything like that.
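As a rough sketch of that alternative (the exact builtin is my assumption, not something the comment prescribes), the legacy `__sync_` exchange builtin expands to a `__sync_lock_test_and_set_4` libcall on such targets, which is implemented via the kernel helpers rather than SWP:

```c
#include <stdint.h>

/* Sketch only: __sync_lock_test_and_set is the legacy __sync_ exchange
   builtin (a full exchange on ARM targets).  Where there is no native
   exchange instruction it becomes a __sync_lock_test_and_set_4 libcall,
   which the runtime/kernel helpers implement without SWP and without locks. */
static __inline__ uint32_t __attribute__((__always_inline__, __nodebug__))
__swp_libcall_sketch(uint32_t __x, volatile uint32_t *__p) {
  return __sync_lock_test_and_set(__p, __x);
}
```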
https://github.com/llvm/llvm-project/pull/151354