On 2025/11/11 at 4:09 PM, Xi Ruoyao wrote:
On Tue, 2025-11-11 at 14:31 +0800, Lulu Cheng wrote:
On 2025/11/8 at 12:29 PM, Xi Ruoyao wrote:
As [1] says, we cannot mix lock-free and locking atomics on one
object. For example, assume atom = (0, 0) initially. If we have a
locking "atomic" xor running on T0 and a lock-free store running on T1
concurrently:
             T0              |                 T1
-----------------------------+-------------------------------------
 acquire_lock                |
 t0 = atom[0]                |
 /* some CPU cycles */       | atom = (1, 1) /* lock-free atomic */
 t1 = atom[1]                |
 t0 ^= 1                     |
 t1 ^= 1                     |
 atom[0] = t0                |
 atom[1] = t1                |
 release_lock                |
we get atom = (1, 0), but the atomicity of xor and store should
guarantee that atom is either (0, 0) or (1, 1).
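To make the interleaving concrete, here is a minimal single-threaded
replay of the table in C. It is only a sketch: the struct and function
names are made up for illustration, the 16B object is modeled as two
64-bit halves, and the "lock" is left as comments because it does not
constrain the lock-free store on T1 anyway.

```c
#include <assert.h>

/* A 16-byte object modeled as two 64-bit halves (illustrative type). */
struct demo_pair { unsigned long long v[2]; };

/* Replays the T0/T1 interleaving from the table step by step in one
   thread, so the result is deterministic.  */
struct demo_pair demo_replay_mixed_interleaving(void)
{
    struct demo_pair atom = { { 0, 0 } };
    unsigned long long t0, t1;

    /* T0: acquire_lock */
    t0 = atom.v[0];              /* T0 loads the old first half */

    /* T1: lock-free 16B store, landing between T0's two loads */
    atom.v[0] = 1;
    atom.v[1] = 1;

    t1 = atom.v[1];              /* T0 loads the new second half */

    /* T0 has now observed a mix of old and new halves. */
    assert(t0 == 0 && t1 == 1);

    t0 ^= 1;
    t1 ^= 1;
    atom.v[0] = t0;
    atom.v[1] = t1;
    /* T0: release_lock */

    return atom;
}

/* Returns 1 if the value could result from running the xor and the
   store one after the other, in either order.  */
int demo_is_serializable_outcome(struct demo_pair p)
{
    return (p.v[0] == 0 && p.v[1] == 0) || (p.v[0] == 1 && p.v[1] == 1);
}
```

Calling demo_replay_mixed_interleaving() returns a value whose two
halves differ, which demo_is_serializable_outcome() rejects.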
So, if we want to use a lock-free 16B atomic operation, we need both LSX
and SCQ even if that specific operation only needs one of them. To make
things worse, one may link a TU compiled with -mlsx -mscq together with
another compiled without them. Then, if we want to use the lock-free 16B
atomic operations in the former, we must make libatomic also use the
lock-free 16B atomic operations for the latter, so we need to add ifuncs
for libatomic, similar to the discussion about i386 vs. i486 in [1].
Implementing and building the ifuncs currently requires:
- Glibc, because the ifunc resolver interface is libc-specific
- Linux, because the HWCAP bit for LSX is kernel-specific
- A recent enough assembler at build time to recognize sc.q
So the approach here is: only allow 16B lock-free atomic operations in
the compiler if the criteria above are satisfied, and ensure libatomic
uses those lock-free operations on capable hardware (via ifunc unless
both LSX and SCQ are already enabled by the builder) if the compiler
allows 16B lock-free atomics.
[1]: https://gcc.gnu.org/wiki/Atomic/GCCMM/LIbrary
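The selection rule described above (take the lock-free path only when
both LSX and SCQ are available, otherwise fall back to locking) can be
sketched as a pure function. This is not the code from the patch: the
DEMO_CAP_* bits and all names are hypothetical and are not the real
kernel HWCAP values, and how the real resolver obtains the capability
information is not shown here.

```c
#include <assert.h>

/* Hypothetical capability bits, for illustration only.  These are NOT
   the kernel's HWCAP bit assignments.  */
#define DEMO_CAP_LSX (1UL << 0)
#define DEMO_CAP_SCQ (1UL << 1)

static_assert(DEMO_CAP_LSX != DEMO_CAP_SCQ,
              "capability bits must be distinct");

enum demo_impl {
    DEMO_IMPL_LOCKING,    /* generic lock-based 16B atomics */
    DEMO_IMPL_LOCK_FREE   /* 16B atomics built on LSX and SCQ */
};

/* Picks one implementation for ALL 16B atomic operations.  Requiring
   both features, even for an operation that needs only one of them,
   keeps lock-free and locking accesses from being mixed on one object.
   A real ifunc resolver would return a function address rather than
   an enum value.  */
enum demo_impl demo_select_16b_impl(unsigned long caps)
{
    const unsigned long need = DEMO_CAP_LSX | DEMO_CAP_SCQ;

    return (caps & need) == need ? DEMO_IMPL_LOCK_FREE
                                 : DEMO_IMPL_LOCKING;
}
```

With only one of the two bits set, demo_select_16b_impl() still picks
the locking implementation.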
/* snip */
+
+typedef struct __ifunc_arg_t {
+ unsigned long _size;
+ unsigned long _hwcap;
+} __ifunc_arg_t;
+
The resolver function doesn't take any parameters, so can we remove
this?
Yes, I just forgot to remove it.
Ok with this removed?
Ok!
But I'm a little curious: in which application did this problem actually
surface?