> -----Original Message-----
> From: Ian Trick [mailto:[email protected]]
> Sent: Friday, November 17, 2017 5:50 PM
> To: Dumitrescu, Cristian <[email protected]>; [email protected]
> Subject: Re: qos_sched in DPDK 17.11.0 fails to initialize mbuf pool
>
> On 2017-11-17 04:19 AM, Dumitrescu, Cristian wrote:
> >
> >
> >> -----Original Message-----
> >> From: Ian Trick [mailto:[email protected]]
> >> Sent: Friday, November 17, 2017 1:24 AM
> >> To: [email protected]
> >> Cc: Dumitrescu, Cristian <[email protected]>
> >> Subject: qos_sched in DPDK 17.11.0 fails to initialize mbuf pool
> >>
> >> Hi. I'm having an issue starting the qos_sched example program.
> >>
> >> # ./examples/qos_sched/build/qos_sched --no-huge -l 1,2,3 --vdev
> >> net_af_packet0,iface=eth1 -- --pfc "0,0,2,3" --cfg
> >> examples/qos_sched/profile_ov.cfg
> >>
> >> EAL: Detected 16 lcore(s)
> >> EAL: Probing VFIO support...
> >> EAL: Started without hugepages support, physical addresses not available
> >> EAL: PCI device 0000:08:00.0 on NUMA socket -1
> >> EAL: Invalid NUMA socket, default to 0
> >> EAL: probe driver: 8086:10d3 net_e1000_em
> >> PMD: Initializing pmd_af_packet for net_af_packet0
> >> PMD: net_af_packet0: AF_PACKET MMAP parameters:
> >> PMD: net_af_packet0: block size 4096
> >> PMD: net_af_packet0: block count 256
> >> PMD: net_af_packet0: frame size 2048
> >> PMD: net_af_packet0: frame count 512
> >> PMD: net_af_packet0: creating AF_PACKET-backed ethdev on numa socket 0
> >> EAL: Error - exiting with code: 1
> >> Cause: Cannot init mbuf pool for socket 0
> >>
> >
> > Personally I never used this application with --no-huge or with AF_PACKET,
> > so I suggest you start from the configuration known to work (as detailed in
> > the Sample App Guide) and then change/add one variable at a time to see
> > which change triggers the mempool issue.
> >
> > This app needs large amounts of memory for the mempool, as traffic
> > management is buffering lots of packets in lots of queues. Our typical tests
> > are done with 4K pipes/output port (64K queues/output port) so we
> > provision mempool to have 2M buffers for each output port. The size of the
> > mempool is hardcoded in the application.
>
> Can I configure this to run with fewer queues or something so that it
> requires less memory? I thought running with profile_ov.cfg might have
> lower memory requirements since it includes:
>
> number of pipes per subport = 32
> compared to 4096 in the other configuration file. So I figured there
> would be fewer queues and buffers? But I only have 4GB available on the
> device I have if I want to test something that isn't AF_PACKET.
>
Digging in the source code, I found that you can tweak the mempool size through this macro:

//file "main.h"
#define NB_MBUF (2*1024*1024)

> >
> >>
> >> This is version 17.11.0 from the repo. My RTE_TARGET is
> >> x86_64-native-linuxapp-clang. eth1 is a veth. I've tried running with
> >> `-m` and using a low value but the issue still happens.
> >>
> >> From what I can tell, rte_pktmbuf_pool_create() is failing and rte_errno
> >> is set to EINVAL.
> >>
> >> In librte_mempool/rte_mempool.c, the function
> >> rte_mempool_populate_virt() is succeeding this test and returning -EINVAL:
> >>
> >>     if (RTE_ALIGN_CEIL(len, pg_sz) != len)
> >>         return -EINVAL;
> >>
> >> In that context, len is mz->len, the length of a memzone passed by the
> >> caller, rte_mempool_populate_default(). Which got it here:
> >>
> >>     mz = rte_memzone_reserve_aligned(mz_name, size,
> >>         mp->socket_id, mz_flags, align);
> >>     /* not enough memory, retry with the biggest zone we have */
> >>     if (mz == NULL)
> >>         mz = rte_memzone_reserve_aligned(mz_name, 0,
> >>             mp->socket_id, mz_flags, align);
> >>
> >> This fails the first call, and succeeds the second when it passes 0 as
> >> the size. memzone_reserve_aligned_thread_unsafe(), in
> >> librte_eal/common/eal_common_memzone.c, gets the length this way:
> >>
> >>     requested_len = find_heap_max_free_elem(&socket_id, align);
> >>
> >> So the align value is 4096. But the value returned by
> >> find_heap_max_free_elem() isn't aligned to that -- I think? Since it
> >> fails the check later on.
> >>
> >> I'm not sure if this is a thing with my environment where I don't have
> >> enough memory? (Although I would have expected a different error for
> >> that.) Or I don't have the right program arguments? Or one of these
> >> functions isn't doing what it's supposed to?
