Hi,

This is a patchlevel release of the Userspace RCU library.

The most relevant change in this release is the removal of
a redundant memory barrier on x86 for store and RMW operations
with the CMM_SEQ_CST_FENCE memory ordering. This addresses
a performance regression for users of the pre-0.15 uatomic API
that build against a liburcu configured to use compiler builtins
for atomics (--enable-compiler-atomic-builtins).

As a reminder, the CMM_SEQ_CST_FENCE MO is a superset of SEQ_CST:
it provides sequential consistency _and_ acts as a full memory
barrier, similar to the semantics associated with cmpxchg() and
atomic_add_return() within the LKMM.
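
To make the distinction concrete, here is a minimal sketch written with
plain C11 atomics rather than the liburcu API. The helper names
(sketch_store_seq_cst(), sketch_store_seq_cst_fence() and
sketch_add_return_fence()) are hypothetical and only illustrate the intended
semantics under that assumption: the operation itself is sequentially
consistent, and a full barrier additionally orders it against surrounding
accesses, including relaxed ones.

```c
#include <stdatomic.h>

/* Hypothetical helper: a plain sequentially consistent store. */
static inline void sketch_store_seq_cst(atomic_int *addr, int v)
{
	atomic_store_explicit(addr, v, memory_order_seq_cst);
}

/*
 * Hypothetical helper approximating a store with CMM_SEQ_CST_FENCE
 * semantics: the same store, followed by a full barrier so that later
 * accesses (even relaxed ones) are not reordered before it.
 */
static inline void sketch_store_seq_cst_fence(atomic_int *addr, int v)
{
	atomic_store_explicit(addr, v, memory_order_seq_cst);
	atomic_thread_fence(memory_order_seq_cst);
}

/*
 * Hypothetical helper: add `v` and return the new value, followed by a
 * full barrier, loosely mirroring atomic_add_return() in the LKMM.
 */
static inline int sketch_add_return_fence(atomic_int *addr, int v)
{
	int ret = atomic_fetch_add_explicit(addr, v, memory_order_seq_cst) + v;

	atomic_thread_fence(memory_order_seq_cst);
	return ret;
}
```

The explicit fence in the last two helpers is what the plain SEQ_CST
variant lacks in this sketch.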

Here is the rationale for this change:

/*
 * On x86, an atomic store with sequential consistency is always implemented
 * with an exchange operation, which has an implicit lock prefix when a memory
 * operand is used.
 *
 * Indeed, on x86, only loads can be re-ordered with prior stores. Therefore,
 * for keeping sequential consistency, either load operations or store
 * operations need to have a memory barrier. All major toolchains have selected
 * the store operations to have this barrier to avoid penalty on load
 * operations.
 *
 * Therefore, assuming that the used toolchain follows this convention, it is
 * safe to rely on this implicit memory barrier to implement the
 * `CMM_SEQ_CST_FENCE` memory order and thus no further barrier needs to be
 * emitted.
 */
#define cmm_seq_cst_fence_after_atomic_store(...)       \
        do { } while (0)

/*
 * Keep the default implementation (emit a memory barrier) after load
 * operations for the `CMM_SEQ_CST_FENCE` memory order.  The rationale is
 * explained above for `cmm_seq_cst_fence_after_atomic_store()`.
 */
/* #define cmm_seq_cst_fence_after_atomic_load(...) */


/*
 * On x86, atomic read-modify-write operations always have a lock prefix either
 * implicitly or explicitly for sequential consistency.
 *
 * Therefore, no further memory barrier, for the `CMM_SEQ_CST_FENCE` memory
 * order, needs to be emitted for these operations.
 */
#define cmm_seq_cst_fence_after_atomic_rmw(...) \
        do { } while (0)
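
For readers who want to see how such per-architecture hooks can fit
together, here is a self-contained sketch using C11 atomics. It is not the
liburcu implementation: the sketch_* names and the enum are hypothetical,
and the x86 branch assumes the toolchain follows the convention described
above (stores and RMW operations carry the barrier, loads do not).

```c
#include <stdatomic.h>

/* Hypothetical memory-order tags; not the liburcu enum. */
enum sketch_mo {
	SKETCH_SEQ_CST,
	SKETCH_SEQ_CST_FENCE,
};

#if defined(__x86_64__) || defined(__i386__)
/*
 * Assumption: on x86 the compiler emits an exchange or lock-prefixed
 * instruction for these operations, which already acts as a full barrier.
 */
#define sketch_fence_after_store()	do { } while (0)
#define sketch_fence_after_rmw()	do { } while (0)
#endif

/* Defaults: emit a full barrier unless the architecture overrode the hook. */
#ifndef sketch_fence_after_store
#define sketch_fence_after_store()	atomic_thread_fence(memory_order_seq_cst)
#endif
#ifndef sketch_fence_after_load
#define sketch_fence_after_load()	atomic_thread_fence(memory_order_seq_cst)
#endif
#ifndef sketch_fence_after_rmw
#define sketch_fence_after_rmw()	atomic_thread_fence(memory_order_seq_cst)
#endif

static inline void sketch_store(atomic_long *addr, long v, enum sketch_mo mo)
{
	atomic_store_explicit(addr, v, memory_order_seq_cst);
	if (mo == SKETCH_SEQ_CST_FENCE)
		sketch_fence_after_store();	/* no-op on x86 in this sketch */
}

static inline long sketch_load(atomic_long *addr, enum sketch_mo mo)
{
	long v = atomic_load_explicit(addr, memory_order_seq_cst);

	if (mo == SKETCH_SEQ_CST_FENCE)
		sketch_fence_after_load();	/* always a full barrier here */
	return v;
}

static inline long sketch_add_return(atomic_long *addr, long v,
				     enum sketch_mo mo)
{
	long ret = atomic_fetch_add_explicit(addr, v, memory_order_seq_cst) + v;

	if (mo == SKETCH_SEQ_CST_FENCE)
		sketch_fence_after_rmw();	/* no-op on x86 in this sketch */
	return ret;
}
```

The `#ifndef` defaults keep every other architecture on the conservative
path, so only x86 skips the extra barrier, and only for stores and RMW
operations. Whether a given toolchain really emits the exchange instruction
can be checked by inspecting the generated assembly.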

Changelog:

2025-11-04 Userspace RCU 0.15.4
        * uatomic: Fix redundant memory barriers for atomic builtin operations
        * Cleanup: Remove useless declarations from urcu-qsbr
        * src/urcu-bp.c: assert => urcu_posix_assert
        * ppc.h: improve ppc64 caa_get_cycles on Darwin

Thanks,

Mathieu

Project website: https://liburcu.org
Git repository: https://git.liburcu.org/userspace-rcu.git

--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com
