On Thu, 07 Sep 2017, Prateek Sood wrote:
/* + * __rwsem_down_write_failed_common(sem) + * rwsem_optimistic_spin(sem) + * osq_unlock(sem->osq) + * ... + * atomic_long_add_return(&sem->count) + * + * - VS - + * + * __up_write() + * if (atomic_long_sub_return_release(&sem->count) < 0) + * rwsem_wake(sem) + * osq_is_locked(&sem->osq) + * + * And __up_write() must observe !osq_is_locked() when it observes the + * atomic_long_add_return() in order to not miss a wakeup. + * + * This boils down to: + * + * [S.rel] X = 1 [RmW] r0 = (Y += 0) + * MB RMB + * [RmW] Y += 1 [L] r1 = X + * + * exists (r0=1 /\ r1=0) + */ + smp_rmb();
Instead, how about just removing the release from atomic_long_sub_return_release() such that the osq load is not hoisted over the atomic compound (along with Peter's comment): diff --git a/include/asm-generic/rwsem.h b/include/asm-generic/rwsem.h index 6c6a2141f271..487ce31078ff 100644 --- a/include/asm-generic/rwsem.h +++ b/include/asm-generic/rwsem.h @@ -101,7 +101,7 @@ static inline void __up_read(struct rw_semaphore *sem) */ static inline void __up_write(struct rw_semaphore *sem) { - if (unlikely(atomic_long_sub_return_release(RWSEM_ACTIVE_WRITE_BIAS, + if (unlikely(atomic_long_sub_return(RWSEM_ACTIVE_WRITE_BIAS, &sem->count) < 0)) rwsem_wake(sem); }