Hi,

On 16.02.2017 16:26, Tan, Jianfeng wrote:
> Hi,
>
>> -----Original Message-----
>> From: Ilya Maximets [mailto:i.maxim...@samsung.com]
>> Sent: Thursday, February 16, 2017 9:01 PM
>> To: dev@dpdk.org; David Marchand; Gonzalez Monroy, Sergio
>> Cc: Heetae Ahn; Yuanhan Liu; Tan, Jianfeng; Neil Horman; Pei, Yulong;
>> Ilya Maximets; sta...@dpdk.org
>> Subject: [PATCH] mem: balanced allocation of hugepages
>>
>> Currently, EAL allocates hugepages one by one, paying no
>> attention to the NUMA node each allocation comes from.
>>
>> Such behaviour leads to allocation failure if the number of
>> hugepages available to the application is limited by cgroups
>> or hugetlbfs and memory is requested from more than just the
>> first socket.
>>
>> Example:
>> # 90 x 1GB hugepages available in the system
>>
>> cgcreate -g hugetlb:/test
>> # Limit to 32GB of hugepages
>> cgset -r hugetlb.1GB.limit_in_bytes=34359738368 test
>> # Request 4GB from each of 2 sockets
>> cgexec -g hugetlb:test testpmd --socket-mem=4096,4096 ...
>>
>> EAL: SIGBUS: Cannot mmap more hugepages of size 1024 MB
>> EAL: 32 not 90 hugepages of size 1024 MB allocated
>> EAL: Not enough memory available on socket 1!
>> Requested: 4096MB, available: 0MB
>> PANIC in rte_eal_init():
>> Cannot init memory
>>
>> This happens because all allocated pages end up
>> on socket 0.
>
> For such a use case, why not just use "numactl --interleave=0,1 <DPDK app> xxx"?
Unfortunately, the interleave policy doesn't work for me. I suspect
the kernel configuration blocks it, or I don't understand something
in the kernel internals. I'm using the 3.10 RT kernel from RHEL 7.
I tried to set up MPOL_INTERLEAVE in code (roughly as in the first
sketch below my signature) and it doesn't work for me. Your example
with numactl doesn't work either:

# Limited to 8GB of hugepages
cgexec -g hugetlb:test testpmd --socket-mem=4096,4096
EAL: Setting up physically contiguous memory...
EAL: SIGBUS: Cannot mmap more hugepages of size 1024 MB
EAL: 8 not 90 hugepages of size 1024 MB allocated
EAL: Hugepage /dev/hugepages/rtemap_0 is on socket 0
EAL: Hugepage /dev/hugepages/rtemap_1 is on socket 0
EAL: Hugepage /dev/hugepages/rtemap_2 is on socket 0
EAL: Hugepage /dev/hugepages/rtemap_3 is on socket 0
EAL: Hugepage /dev/hugepages/rtemap_4 is on socket 0
EAL: Hugepage /dev/hugepages/rtemap_5 is on socket 0
EAL: Hugepage /dev/hugepages/rtemap_6 is on socket 0
EAL: Hugepage /dev/hugepages/rtemap_7 is on socket 0
EAL: Not enough memory available on socket 1!
Requested: 4096MB, available: 0MB
PANIC in rte_eal_init():
Cannot init memory

Also, using numactl affects all allocations in the application,
which may cause additional unexpected issues.

> Do you see a use case like --socket-mem 2048,1024 where only three
> 1GB hugepages are allowed?

This case will work with my patch, but the opposite one
('--socket-mem=1024,2048') will fail. To handle it properly, we would
need to first allocate all the required memory from each NUMA node
and then allocate all remaining available pages in round-robin
fashion (a rough outline is in the second sketch below). But such a
solution looks a little ugly.

What do you think?

Best regards, Ilya Maximets.
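
P.S. For reference, here is roughly what I tried for MPOL_INTERLEAVE.
This is a minimal sketch rather than the exact code: it assumes a
two-node system and is meant to run before the hugepage mmap()/touch
loop (link with -lnuma for the set_mempolicy() wrapper):

#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <numaif.h>	/* set_mempolicy(), MPOL_* */

static int enable_interleave(void)
{
	/* Interleave pages faulted in by this thread across nodes 0 and 1. */
	unsigned long nodemask = (1UL << 0) | (1UL << 1);

	if (set_mempolicy(MPOL_INTERLEAVE, &nodemask,
			  sizeof(nodemask) * 8) < 0) {
		fprintf(stderr, "set_mempolicy: %s\n", strerror(errno));
		return -1;
	}
	return 0;
}

Even with this in place, all mapped pages still end up on socket 0
on my kernel.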
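
And here is a rough outline of the two-pass idea above. It is only a
sketch: need[] and map_one_hugepage() are hypothetical placeholders
for the per-socket requirements and for the existing mmap()+touch
logic, and I'm using per-node set_mempolicy() calls as one possible
way to steer each mapping:

#include <numaif.h>

/* Placeholder for the existing "mmap() one page from hugetlbfs and
 * touch it" logic. */
static void map_one_hugepage(void);

static void map_pages_balanced(const unsigned need[], int num_sockets,
			       unsigned total_pages)
{
	unsigned mapped = 0;
	int s, i;

	/* Pass 1: satisfy each socket's --socket-mem requirement first. */
	for (s = 0; s < num_sockets; s++) {
		unsigned long mask = 1UL << s;

		/* Prefer node 's'; the kernel falls back to other nodes
		 * only if it is exhausted. */
		set_mempolicy(MPOL_PREFERRED, &mask, sizeof(mask) * 8);
		for (i = 0; i < (int)need[s]; i++, mapped++)
			map_one_hugepage();
	}

	/* Pass 2: spread the remaining pages round-robin across nodes. */
	for (s = 0; mapped < total_pages; s = (s + 1) % num_sockets) {
		unsigned long mask = 1UL << s;

		set_mempolicy(MPOL_PREFERRED, &mask, sizeof(mask) * 8);
		map_one_hugepage();
		mapped++;
	}

	/* Restore the default policy for the rest of initialization. */
	set_mempolicy(MPOL_DEFAULT, NULL, 0);
}

MPOL_PREFERRED is chosen here instead of MPOL_BIND so that a depleted
node degrades to allocation from another node rather than a SIGBUS,
but that also means the "required" pass is not strictly guaranteed.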