On 2017-11-17 10:34 AM, Dumitrescu, Cristian wrote:
>
>
>> -----Original Message-----
>> From: Ian Trick [mailto:[email protected]]
>> Sent: Friday, November 17, 2017 5:50 PM
>> To: Dumitrescu, Cristian <[email protected]>; [email protected]
>> Subject: Re: qos_sched in DPDK 17.11.0 fails to initialize mbuf pool
>>
>> On 2017-11-17 04:19 AM, Dumitrescu, Cristian wrote:
>>>
>>>
>>>> -----Original Message-----
>>>> From: Ian Trick [mailto:[email protected]]
>>>> Sent: Friday, November 17, 2017 1:24 AM
>>>> To: [email protected]
>>>> Cc: Dumitrescu, Cristian <[email protected]>
>>>> Subject: qos_sched in DPDK 17.11.0 fails to initialize mbuf pool
>>>>
>>>> Hi. I'm having an issue starting the qos_sched example program.
>>>>
>>>> # ./examples/qos_sched/build/qos_sched --no-huge -l 1,2,3 --vdev
>>>> net_af_packet0,iface=eth1 -- --pfc "0,0,2,3" --cfg
>>>> examples/qos_sched/profile_ov.cfg
>>>>
>>>> EAL: Detected 16 lcore(s)
>>>> EAL: Probing VFIO support...
>>>> EAL: Started without hugepages support, physical addresses not available
>>>> EAL: PCI device 0000:08:00.0 on NUMA socket -1
>>>> EAL: Invalid NUMA socket, default to 0
>>>> EAL: probe driver: 8086:10d3 net_e1000_em
>>>> PMD: Initializing pmd_af_packet for net_af_packet0
>>>> PMD: net_af_packet0: AF_PACKET MMAP parameters:
>>>> PMD: net_af_packet0: block size 4096
>>>> PMD: net_af_packet0: block count 256
>>>> PMD: net_af_packet0: frame size 2048
>>>> PMD: net_af_packet0: frame count 512
>>>> PMD: net_af_packet0: creating AF_PACKET-backed ethdev on numa socket 0
>>>> EAL: Error - exiting with code: 1
>>>> Cause: Cannot init mbuf pool for socket 0
>>>>
>>>
>>> Personally I never used this application with --no-huge or with AF_PACKET,
>>> so I suggest you start from the configuration known to work (as detailed in
>>> the Sample App Guide) and then change/add one variable at a time to see
>>> which change triggers the mempool issue.
>>>
>>> This app needs large amounts of memory for the mempool, as traffic
>>> management is buffering lots of packets in lots of queues. Our typical tests
>>> are done with 4K pipes/output port (64K queues/output port) so we
>>> provision mempool to have 2M buffers for each output port. The size of the
>>> mempool is hardcoded in the application.
>>
>> Can I configure this to run with fewer queues or something so that it
>> requires less memory? I thought running with profile_ov.cfg might have
>> lower memory requirements since it includes:
>>> number of pipes per subport = 32
>> compared to 4096 in the other configuration file. So I figured there
>> would be fewer queues and buffers? But I only have 4GB available on the
>> device I have if I want to test something that isn't AF_PACKET.
>>
>
> Digging in the source code, I found that you can tweak the mempool size
> through this macro:
> //file "main.h"
> #define NB_MBUF (2*1024*1024)
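For a rough sense of scale, the numbers quoted above can be put into a small C sketch. This is only back-of-the-envelope arithmetic, not code from qos_sched: QUEUES_PER_PIPE is inferred from the 4K pipes / 64K queues ratio mentioned above, ASSUMED_BYTES_PER_MBUF is a hypothetical per-buffer footprint (the real figure depends on the data room size, headroom and mempool per-object overhead), and the proportional scaling in scaled_nb_mbuf() is a heuristic of this sketch, not something the application does.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 16 queues per pipe, inferred from "4K pipes -> 64K queues". */
#define QUEUES_PER_PIPE 16u

/* Hypothetical per-buffer footprint in bytes (illustrative only). */
#define ASSUMED_BYTES_PER_MBUF 2304u

/* Number of queues behind one output port for a given pipe count. */
static uint64_t queues_per_port(uint32_t pipes_per_port)
{
    return (uint64_t)pipes_per_port * QUEUES_PER_PIPE;
}

/* Approximate memory needed by a pool of nb_mbuf buffers. */
static uint64_t mempool_bytes(uint64_t nb_mbuf)
{
    return nb_mbuf * ASSUMED_BYTES_PER_MBUF;
}

/*
 * Heuristic: scale the default buffer count in proportion to the pipe
 * count. Whether the result is enough in practice is untested here.
 */
static uint64_t scaled_nb_mbuf(uint64_t default_nb_mbuf,
                               uint32_t default_pipes, uint32_t pipes)
{
    assert(default_pipes != 0);
    return default_nb_mbuf * pipes / default_pipes;
}
```

Under these assumptions the default NB_MBUF of 2*1024*1024 comes to roughly 4.5 GiB per pool, which would not fit on a 4GB device, while a 32-pipe profile scaled the same way would need only 16384 buffers. Treat both figures as estimates to be checked against the actual mbuf size in main.h.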
Oh right, I remember fiddling with that when trying to get it working
with --no-huge. Tweaking that worked in this case on a real interface in
DPDK mode. Adding --no-huge makes it complain and not start up, so that
might be what was happening in my original case. I think we're running
with that option because we were having trouble using it under LXC. But
I'll look into solving that. Thanks!

>
>>>
>>>>
>>>> This is version 17.11.0 from the repo. My RTE_TARGET is
>>>> x86_64-native-linuxapp-clang. eth1 is a veth. I've tried running with
>>>> `-m` and using a low value but the issue still happens.
>>>>
>>>> From what I can tell, rte_pktmbuf_pool_create() is failing and rte_errno
>>>> is set to EINVAL.
>>>>
>>>> In librte_mempool/rte_mempool.c, the function
>>>> rte_mempool_populate_virt() is failing this check and returning -EINVAL:
>>>>
>>>> if (RTE_ALIGN_CEIL(len, pg_sz) != len)
>>>>         return -EINVAL;
>>>>
>>>> In that context, len is mz->len, the length of a memzone passed by the
>>>> caller, rte_mempool_populate_default(). Which got it here:
>>>>
>>>> mz = rte_memzone_reserve_aligned(mz_name, size,
>>>>         mp->socket_id, mz_flags, align);
>>>> /* not enough memory, retry with the biggest zone we have */
>>>> if (mz == NULL)
>>>>         mz = rte_memzone_reserve_aligned(mz_name, 0,
>>>>                 mp->socket_id, mz_flags, align);
>>>>
>>>> This fails the first call, and succeeds the second when it passes 0 as
>>>> the size. memzone_reserve_aligned_thread_unsafe(), in
>>>> librte_eal/common/eal_common_memzone.c, gets the length this way:
>>>>
>>>> requested_len = find_heap_max_free_elem(&socket_id, align);
>>>>
>>>> So the align value is 4096. But the value returned by
>>>> find_heap_max_free_elem() isn't aligned to that -- I think? Since it
>>>> fails the check later on.
>>>>
>>>> I'm not sure if this is a thing with my environment where I don't have
>>>> enough memory? (Although I would have expected a different error for
>>>> that.) Or I don't have the right program arguments? Or one of these
>>>> functions isn't doing what it's supposed to?
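To make the suspected failure concrete, here is a minimal standalone C sketch of the alignment check quoted above. It does not use DPDK: align_ceil() and align_floor() are hypothetical stand-ins for power-of-two rounding, check_len() only mirrors the shape of the quoted test, and the lengths used are made-up values, not ones observed in the failing run.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Round len up to the next multiple of align (align must be a power of two). */
static size_t align_ceil(size_t len, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (len + align - 1) & ~(align - 1);
}

/* Round len down to a multiple of align (align must be a power of two). */
static size_t align_floor(size_t len, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return len & ~(align - 1);
}

/*
 * Same shape as the quoted check: a length that is not already a
 * multiple of the page size is rejected with -EINVAL.
 */
static int check_len(size_t len, size_t pg_sz)
{
    if (align_ceil(len, pg_sz) != len)
        return -EINVAL;
    return 0;
}
```

With a 4096-byte page size, check_len(8192, 4096) passes, while a length such as 8256 is rejected because rounding it up gives 12288 rather than the length itself. Rounding the same length down with align_floor() gives 8192, which passes. That is shown only to illustrate the arithmetic; it is not a claim about how the memzone code should be changed.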
