The prefault step in alloc_seg() reads a value from the hugepage and
writes it back unchanged to force the kernel to commit the backing
page. The read and the write are two separate, non-atomic accesses,
so they race with concurrent access to the same physical page from a
secondary process attaching to the hugetlbfs-backed mapping during
rte_eal_init(): a value stored by the other process between the read
and the write can be overwritten with the stale one.

Replace the non-atomic load+store with a single atomic fetch-or of
zero. This touches the page with an atomic read-modify-write without
changing its contents, eliminating the race while preserving the
original intent of forcing a write fault.

Fixes: 0f1631be24bd ("mem: fix page fault trigger")
Cc: [email protected]
Reported-by: Michal Sieron <[email protected]>
Signed-off-by: Stephen Hemminger <[email protected]>
---
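Note for reviewers (not part of the commit): the idea can be sketched
outside of DPDK with plain C11 atomics. This is only an illustration
under assumptions: prefault_touch() is a hypothetical helper name, and
the cast assumes _Atomic int has the same size and representation as
int, which holds on the common GCC/Clang targets but is not guaranteed
by the C standard. The patch itself uses the DPDK wrapper shown in the
diff, not this code.

```c
#include <stdatomic.h>

/*
 * Hypothetical helper, not a DPDK API: touch the first int at addr
 * with one atomic read-modify-write that leaves the value unchanged.
 * Returns the value that was observed at addr.
 */
static int
prefault_touch(void *addr)
{
	/* OR with zero stores the same bits back, but as a single RMW,
	 * so a concurrent atomic update cannot be lost in between.
	 */
	return atomic_fetch_or_explicit((_Atomic int *)addr, 0,
			memory_order_relaxed);
}
```

Relaxed ordering should be enough for this purpose because the
operation is only meant to trigger the write fault; it is not used to
publish data to other threads.
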
.mailmap | 1 +
lib/eal/linux/eal_memalloc.c | 7 ++++---
2 files changed, 5 insertions(+), 3 deletions(-)
diff --git a/.mailmap b/.mailmap
index 4d26d9c286..07c49eb32f 100644
--- a/.mailmap
+++ b/.mailmap
@@ -1086,6 +1086,7 @@ Michal Mazurek <[email protected]>
Michal Michalik <[email protected]>
Michal Nowak <[email protected]>
Michal Schmidt <[email protected]>
+Michal Sieron <[email protected]>
Michal Swiatkowski <[email protected]>
Michal Wilczynski <[email protected]>
Michał Mirosław <[email protected]> <[email protected]>
diff --git a/lib/eal/linux/eal_memalloc.c b/lib/eal/linux/eal_memalloc.c
index a39bc31c7b..e73a0c11a6 100644
--- a/lib/eal/linux/eal_memalloc.c
+++ b/lib/eal/linux/eal_memalloc.c
@@ -25,6 +25,7 @@
#include <linux/falloc.h>
#include <linux/mman.h> /* for hugetlb-related mmap flags */
+#include <rte_atomic.h>
#include <rte_common.h>
#include <rte_log.h>
#include <rte_eal.h>
@@ -597,10 +598,10 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
/* we need to trigger a write to the page to enforce page fault and
* ensure that page is accessible to us, but we can't overwrite value
- * that is already there, so read the old value, and write itback.
- * kernel populates the page with zeroes initially.
+ * that is already there.
+ * Use an atomic OR with zero to touch the page without changing its contents.
*/
- *(volatile int *)addr = *(volatile int *)addr;
+ (void)rte_atomic_fetch_or_explicit((int *)addr, 0, rte_memory_order_relaxed);
iova = rte_mem_virt2iova(addr);
if (iova == RTE_BAD_PHYS_ADDR) {
--
2.53.0