On 25-Mar-21 8:21 AM, xiangxia.m....@gmail.com wrote:
From: Tonghao Zhang <xiangxia.m....@gmail.com>
Hugepages of different sizes (e.g. 2MB and 1GB) may be mounted on
the same directory (e.g. /dev/hugepages). In that case the DPDK
primary process blocks forever in flock(). To address this issue,
add the LOCK_NB flag to flock().
$ cat /proc/mounts
...
none /dev/hugepages hugetlbfs rw,seclabel,relatime,pagesize=1024M 0 0
none /dev/hugepages hugetlbfs rw,seclabel,relatime,pagesize=2M 0 0
Also add more detail to the error log.
Signed-off-by: Tonghao Zhang <xiangxia.m....@gmail.com>
---
lib/librte_eal/linux/eal_hugepage_info.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/lib/librte_eal/linux/eal_hugepage_info.c b/lib/librte_eal/linux/eal_hugepage_info.c
index d97792cadeb6..1ff76e539053 100644
--- a/lib/librte_eal/linux/eal_hugepage_info.c
+++ b/lib/librte_eal/linux/eal_hugepage_info.c
@@ -451,9 +451,12 @@ hugepage_info_init(void)
hpi->lock_descriptor = open(hpi->hugedir, O_RDONLY);
/* if blocking lock failed */
- if (flock(hpi->lock_descriptor, LOCK_EX) == -1) {
+ if (flock(hpi->lock_descriptor, LOCK_EX | LOCK_NB) == -1) {
RTE_LOG(CRIT, EAL,
- "Failed to lock hugepage directory!\n");
+ "Failed to lock hugepage directory! "
+ "The hugepage dir (%s) was locked by "
+ "other processes or self twice.\n",
+ hpi->hugedir);
break;
}
/* clear out the hugepages dir from unused pages */
Use cases such as "having two hugetlbfs page sizes on the same hugetlbfs
mountpoint" are user error, but I agree that deadlocking is probably not
the way we want to handle it.
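For reference, the deadlock follows from flock() semantics on Linux: the
lock belongs to the open file description, so a second open() of the same
directory in the same process conflicts with the first lock. A minimal
sketch of that behaviour, outside of EAL (lock_dir_twice() is a
hypothetical helper, not DPDK code):

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/*
 * Hypothetical helper: open 'dir' twice and take LOCK_EX on both
 * descriptors without blocking. Returns 0 if both locks succeed,
 * otherwise the errno of the first call that failed.
 */
static int
lock_dir_twice(const char *dir)
{
	int fd1, fd2, err = 0;

	fd1 = open(dir, O_RDONLY);
	if (fd1 < 0)
		return errno;

	fd2 = open(dir, O_RDONLY);
	if (fd2 < 0) {
		err = errno;
		close(fd1);
		return err;
	}

	if (flock(fd1, LOCK_EX | LOCK_NB) == -1) {
		err = errno;
	} else if (flock(fd2, LOCK_EX | LOCK_NB) == -1) {
		/* fd1 still holds the lock; a blocking call would hang here. */
		err = errno;
	}

	/* Closing the descriptors releases any lock they hold. */
	close(fd2);
	close(fd1);
	return err;
}
```

On a local filesystem the second flock() is expected to fail with
EWOULDBLOCK, which is the case the patch turns from a hang into an error.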
An alternative way would be to check if we already have a mountpoint
with the same path, and this would produce a better error message (as a
user, "hugepage dir is locked by self twice" doesn't tell me anything
useful), at a cost of slightly more complicated code.
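A rough sketch of what such a check could look like, assuming the mount
points found so far are kept in an array. The struct and function names
below are hypothetical stand-ins, not the actual EAL definitions:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a per-page-size entry. */
struct hp_entry {
	const char *hugedir;            /* mount point, e.g. "/dev/hugepages" */
	unsigned long long hugepage_sz; /* page size in bytes */
};

/*
 * Return the index of an already recorded entry that uses the same
 * mount point as 'dir', or -1 if none of the first 'n' entries does.
 */
static int
find_same_hugedir(const struct hp_entry *entries, size_t n, const char *dir)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (strcmp(entries[i].hugedir, dir) == 0)
			return (int)i;
	}
	return -1;
}
```

A caller could run this before taking the lock and, on a match, log both
page sizes and the shared path instead of attempting flock() at all.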
I'm not sure which way I want to go here. Normally, the hugetlbfs
directory shouldn't stay locked for long, so I'm wary of adding LOCK_NB
here and feel slightly uneasy about this patch. Do you have any opinions?
Also, do other OSes' EALs need a similar fix?
--
Thanks,
Anatoly