The initialization me->locked=1 in lock() must happen before
    next->locked=0 in unlock(), otherwise a thread may hang forever,
    waiting me->locked become 0. On weak memory systems (such as ARMv8),
    the current implementation allows me->locked=1 to be reordered with
    announcing the node (pred->next=me) and, consequently, to be
    reordered with next->locked=0 in unlock().

    This fix adds a release barrier to pred->next=me, forcing
    me->locked=1 to happen before this operation.

Signed-off-by: Diogo Behrens <diogo.behr...@huawei.com>
---
 lib/librte_eal/include/generic/rte_mcslock.h | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/include/generic/rte_mcslock.h 
b/lib/librte_eal/include/generic/rte_mcslock.h
index 2bef28351..ce553f547 100644
--- a/lib/librte_eal/include/generic/rte_mcslock.h
+++ b/lib/librte_eal/include/generic/rte_mcslock.h
@@ -68,7 +68,14 @@ rte_mcslock_lock(rte_mcslock_t **msl, rte_mcslock_t *me)
                 */
                return;
        }
-       __atomic_store_n(&prev->next, me, __ATOMIC_RELAXED);
+       /* The store to me->next above should also complete before the node is
+        * visible to predecessor thread releasing the lock. Hence, the store
+        * prev->next also requires release semantics. Note that, for example,
+        * on ARM, the release semantics in the exchange operation is not
+        * strong as a release fence and is not sufficient to enforce the
+        * desired order here.
+        */
+       __atomic_store_n(&prev->next, me, __ATOMIC_RELEASE);
 
        /* The while-load of me->locked should not move above the previous
         * store to prev->next. Otherwise it will cause a deadlock. Need a
-- 
2.28.0


Reply via email to