I can see that your system has a NUMA architecture. My experience with DPDK and NUMA is three years old, but I remember there were some issues.

I never managed to allocate huge pages on a NUMA architecture. I think something needs to be done at kernel boot time to avoid fragmentation of the physical memory in the virtual address space, but I never spent enough time to find out exactly what. Also, on a NUMA architecture, you can only access the physical memory that is attached to the CPU socket where your DPDK cores are running.
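If memory serves, the boot-time part is reserving the huge pages on the kernel command line, for example default_hugepagesz=1G hugepagesz=1G hugepages=16 (the page count is only an illustration), so that the EAL gets large contiguous pages to build its heaps from. For the socket-locality point, rte_malloc_socket(), the call visible in frame #4 of the backtrace below, lets you ask for memory on a specific node; plain rte_malloc() passes SOCKET_ID_ANY (-1) down, which also shows up in the trace as socket_arg=-1. A minimal sketch against the DPDK 2.0 API from the trace:

    #include <stdio.h>
    #include <rte_eal.h>
    #include <rte_debug.h>
    #include <rte_lcore.h>
    #include <rte_malloc.h>

    int main(int argc, char **argv)
    {
        if (rte_eal_init(argc, argv) < 0)
            rte_panic("Cannot init EAL\n");

        /* Allocate from the heap of the socket this core runs on,
         * keeping the memory local to the CPU, instead of letting
         * the allocator pick SOCKET_ID_ANY. */
        unsigned socket = rte_socket_id();
        void *p = rte_malloc_socket(NULL, (size_t)1 << 30, 0, (int)socket);

        printf("1GB on socket %u: %s\n", socket, p != NULL ? "PASS" : "fail");
        rte_free(p);  /* rte_free(NULL) is a no-op */
        return 0;
    }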
"Bowl of rice will raise a benefactor, a bucket of rice will raise an enemy." (Chinese proverb)

FREYNET Marc
Alcatel-Lucent France
Centre de Villarceaux
Route de Villejust
91620 NOZAY France
Tel: +33 (0)1 6040 1960
Intranet: 2103 1960
marc.freynet at nokia.com

-----Original Message-----
From: users [mailto:[email protected]] On Behalf Of EXT Sergio Gonzalez Monroy
Sent: Monday, 25 January 2016 14:50
To: Alain Gautherot; users at dpdk.org
Subject: Re: [dpdk-users] Using DPDK for contiguous physical memory allocation

On 23/01/2016 00:20, Alain Gautherot wrote:
> Hello,
>
> I came across DPDK in a thread @
> http://stackoverflow.com/questions/4401912/linux-contiguous-physical-memory-from-userspace
> (bottom reply from mrsmith) and wanted to see if I can use rte_malloc() to
> allocate large blocks of contiguous physical memory (16GB or even 32GB at
> some point).
>
> The platform I'm working on has an FPGA that shares host memory with the
> x86_64 cores via a QPI link.
> The FPGA crunches data directly from host memory and uses physical addresses
> (mostly a QPI limitation, but it is also dictated by performance
> considerations and the ability to make the best possible use of multiple
> memory controllers).
> The shared data is 16GB, or up to 32GB, and could be provided as multiple
> descriptors to the FPGA, but that still means each descriptor is on the
> order of several GBytes.
> I understand that allocation may fail, but that is ok for now, since I'm
> still in the proof-of-concept stage, trying to rule things out.
>
> My sample application attempts to allocate memory in chunks of 100MB, like so:
>
> int main(int argc, char **argv)
> {
>     int ret;
>
>     ret = rte_eal_init(argc, argv);
>     if (ret < 0) {
>         rte_panic("Cannot init EAL\n");
>     }
>
>     int i;
>     for (i = 1; i <= 100; ++i) {
>         size_t allocsize = i * 100*1000*1000;
>
>         printf(" Allocating %3.1fGB: ", ((float)i)/10.0f);
>         fflush(stdout);
>         void *ptr = rte_malloc(NULL, allocsize, 0U);
>         if (ptr != NULL) {
>             printf("PASS\n");
>             rte_free(ptr);
>         } else {
>             printf("fail\n");
>         }
>     }
>
>     printf("Done\n");
>     return 0;
> }
>
> I get a consistent crash @ the 2.2GB mark:
> (gdb) r -c f -n 4
> ...
> EAL: PCI device 0000:06:00.1 on NUMA socket 0
> EAL:   probe driver: 8086:1521 rte_igb_pmd
> EAL:   Not managed by a supported kernel driver, skipped
>  Allocating 0.1GB: fail
>  Allocating 0.2GB: fail
> ...
>  Allocating 2.0GB: fail
>  Allocating 2.1GB: fail
>  Allocating 2.2GB:
> Program received signal SIGSEGV, Segmentation fault.
> 0x00000000004c6770 in malloc_elem_init (elem=0x800070eaa880,
>     heap=0x7ffff7fe561c, mz=0x7ffff7fb2c1c, size=2200000064)
>     at /home/alaing/INTEL/dpdk-2.0.0/lib/librte_malloc/malloc_elem.c:61
> 61          elem->heap = heap;
> Missing separate debuginfos, use: debuginfo-install
> glibc-2.12-1.149.el6_6.5.x86_64
> (gdb) bt
> ...
> #0  0x00000000004c6770 in malloc_elem_init (elem=0x800070eaa880,
>     heap=0x7ffff7fe561c, mz=0x7ffff7fb2c1c, size=2200000064)
>     at /home/alaing/INTEL/dpdk-2.0.0/lib/librte_malloc/malloc_elem.c:61
> #1  0x00000000004c694e in split_elem (elem=0x7ffff3e00000,
>     split_pt=0x800070eaa880)
>     at /home/alaing/INTEL/dpdk-2.0.0/lib/librte_malloc/malloc_elem.c:121
> #2  0x00000000004c6bda in malloc_elem_alloc (elem=0x7ffff3e00000,
>     size=18446744071614584320, align=64)
>     at /home/alaing/INTEL/dpdk-2.0.0/lib/librte_malloc/malloc_elem.c:223
> #3  0x00000000004c736e in malloc_heap_alloc (heap=0x7ffff7fe561c, type=0x0,
>     size=18446744071614584320, align=64)
>     at /home/alaing/INTEL/dpdk-2.0.0/lib/librte_malloc/malloc_heap.c:167
> #4  0x00000000004c0aa1 in rte_malloc_socket (type=0x0,
>     size=18446744071614584320, align=0, socket_arg=-1)
>     at /home/alaing/INTEL/dpdk-2.0.0/lib/librte_malloc/rte_malloc.c:89
> #5  0x00000000004c0b5b in rte_malloc (type=0x0, size=18446744071614584320,
>     align=0)
>     at /home/alaing/INTEL/dpdk-2.0.0/lib/librte_malloc/rte_malloc.c:115
> #6  0x000000000041ca6e in main (argc=5, argv=0x7fffffffdd48)
>     at /home/alaing/INTEL/dpdk-2.0.0/examples/hugephymem/main.c:66
>
> Has anybody seen such an issue?
> Could I be misusing RTE somehow?
>
What options are you running your DPDK app with?
Can you also provide the full initialization log and hugepage info?

Sergio

> Thanks for your time,
> Alain
>
>
> --
> Alain Gautherot
> Edico Genome
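One detail worth noting in the trace above: the size reaching the allocator in frames #2 through #5, 18446744071614584320, is exactly what a 32-bit signed overflow in the test program would produce, so the crash is likely triggered by the caller rather than by DPDK itself. allocsize is computed as i * 100*1000*1000 in int arithmetic; at i = 22 (the 2.2GB mark) the product 2,200,000,000 exceeds INT_MAX, wraps to -2,094,967,296, and sign-extends to that huge value when converted to size_t. Below is a minimal sketch of the loop with the multiplication forced into size_t; it assumes the same DPDK 2.0 environment as the post, and the EAL setup is unchanged from the original code:

    #include <stdio.h>
    #include <rte_malloc.h>

    /* Same probing loop as the original post, but with the size
     * computed in size_t so the product cannot wrap at i = 22. */
    static void probe_alloc_sizes(void)
    {
        size_t i;
        for (i = 1; i <= 100; ++i) {
            /* size_t * unsigned promotes to size_t: no 32-bit wrap */
            size_t allocsize = i * 100u * 1000 * 1000;

            printf(" Allocating %3.1fGB: ", (double)i / 10.0);
            fflush(stdout);
            void *ptr = rte_malloc(NULL, allocsize, 0);
            if (ptr != NULL) {
                printf("PASS\n");
                rte_free(ptr);
            } else {
                printf("fail\n");  /* rte_malloc() returns NULL on failure */
            }
        }
    }

With the wrap removed, an oversized request should simply make rte_malloc() return NULL instead of the heap code computing an element pointer outside the mapped region (note elem=0x800070eaa880 versus the 0x7ffff... heap addresses) and faulting on elem->heap = heap. Whether the multi-GB requests then succeed still depends on how much physically contiguous hugepage memory the EAL managed to reserve on each socket.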
