Given the below patch; we've now got an unconditional full global barrier in, does this make the MCS spinlock RCsc ?
The 'problem' is that this barrier can happen before we actually acquire the lock. That is, if we hit arch_mcs_spin_lock_contended() _that_ will be the acquire barrier and we end up with a SYNC in between unlock and lock -- ie. not an smp_mb__after_unlock_lock() equivalent. --- Subject: locking/mcs: Fix ordering for mcs_spin_lock() From: Peter Zijlstra <[email protected]> Date: Mon Feb 1 15:11:28 CET 2016 Similar to commit b4b29f94856a ("locking/osq: Fix ordering of node initialisation in osq_lock") the use of xchg_acquire() is fundamentally broken with MCS like constructs. Furthermore, it turns out we rely on the global transitivity of this operation because the unlock path observes the pointer with a READ_ONCE(), not an smp_load_acquire(). This is non-critical because the MCS code isn't actually used and mostly serves as documentation, a stepping stone to the more complex things we've build on top of the idea. Cc: Will Deacon <[email protected]> Cc: "Paul E. McKenney" <[email protected]> Reported-by: Andrea Parri <[email protected]> Fixes: 3552a07a9c4a ("locking/mcs: Use acquire/release semantics") Signed-off-by: Peter Zijlstra (Intel) <[email protected]> --- kernel/locking/mcs_spinlock.h | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) --- a/kernel/locking/mcs_spinlock.h +++ b/kernel/locking/mcs_spinlock.h @@ -67,7 +67,13 @@ void mcs_spin_lock(struct mcs_spinlock * node->locked = 0; node->next = NULL; - prev = xchg_acquire(lock, node); + /* + * We rely on the full barrier with global transitivity implied by the + * below xchg() to order the initialization stores above against any + * observation of @node. And to provide the ACQUIRE ordering associated + * with a LOCK primitive. + */ + prev = xchg(lock, node); if (likely(prev == NULL)) { /* * Lock acquired, don't need to set node->locked to 1. Threads

