How many files do you have in your "./dpdk_select" folder? Just the 5 or 6 that Aaron had mentioned in the email below? What happens if you intentionally set the dpdk_select folder name incorrectly?
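
If it helps to answer that quickly, here is a minimal Python sketch for listing what is in the folder and flagging anything that is not an ELF file. It is only an illustration and not part of UHD or DPDK: the function name `check_driver_dir` is made up, and the only fact it relies on is that every ELF shared object starts with the four bytes `\x7fELF`. Since DPDK reportedly tries to dlopen() everything in the dpdk_driver directory, any entry in the second list is a candidate for an "invalid ELF header" error.

```python
import os

ELF_MAGIC = b"\x7fELF"  # first four bytes of every ELF file


def check_driver_dir(path):
    """Split the entries of `path` into (elf_files, other_entries).

    Hypothetical helper for illustration only. Symlinks are followed,
    so a symlink to a real .so counts as an ELF file. Directories and
    dangling symlinks end up in `other_entries`.
    """
    elf_files, other_entries = [], []
    for name in sorted(os.listdir(path)):
        full = os.path.join(path, name)
        if not os.path.isfile(full):
            other_entries.append(name)
            continue
        with open(full, "rb") as f:
            magic = f.read(4)
        if magic == ELF_MAGIC:
            elf_files.append(name)
        else:
            other_entries.append(name)
    return elf_files, other_entries
```

Calling `check_driver_dir("/usr/local/lib/dpdk_select")` (or whatever your dpdk_driver setting points to) should ideally return only the handful of PMD libraries in the first list and an empty second list.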

On Fri, Jul 30, 2021 at 6:42 PM Minutolo, Lorenzo <[email protected]> wrote:
> Thanks everyone for this thread, it's very helpful.
> Underruns occur even with top-spec hardware on the host side, and my
> application is very susceptible to streaming errors, hence DPDK.
>
> I'm still trying to get DPDK working, and I'm stuck with:
>
> sudo uhd_usrp_probe --args="use_dpdk=1,type=n3xx,addr=192.168.10.2"
> [INFO] [UHD] linux; GNU C++ version 7.4.0; Boost_106600; UHD_4.0.0.0-154-gb061af4f
> EAL: Detected 16 lcore(s)
> EAL: Detected 1 NUMA nodes
> EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
> EAL: No free hugepages reported in hugepages-1048576kB
> EAL: Probing VFIO support...
> EAL: VFIO support initialized
> *[ERROR] [DPDK] No available DPDK devices (ports) found!*
> [ERROR] [UHD] Device discovery error: RuntimeError: No available DPDK devices (ports) found!
> Error: LookupError: KeyError: No devices found for ----->
> Device Address:
>     use_dpdk: 1
>     type: n3xx
>     addr: 192.168.10.2
>
>
> I do have a folder with only the DPDK libs loaded and I managed to bind
> the devices to vfio-pci:
>
> Network devices using DPDK-compatible driver
> ============================================
> 0000:02:00.0 'Ethernet Controller X710 for 10GbE SFP+ 1572' drv=vfio-pci unused=i40e
> 0000:02:00.1 'Ethernet Controller X710 for 10GbE SFP+ 1572' drv=vfio-pci unused=i40e
>
> Network devices using kernel driver
> ===================================
> 0000:00:14.3 'Wireless-AC 9560 [Jefferson Peak] a370' if=wlo1 drv=iwlwifi unused=vfio-pci
> 0000:00:1f.6 'Ethernet Connection (7) I219-V 15bc' if=eno2 drv=e1000e unused=vfio-pci *Active*
> 0000:02:00.2 'Ethernet Controller X710 for 10GbE SFP+ 1572' if=enp2s0f2 drv=i40e unused=vfio-pci
> 0000:02:00.3 'Ethernet Controller X710 for 10GbE SFP+ 1572' if=enp2s0f3 drv=i40e unused=vfio-pci
>
> My conf file looks like:
>
> [use_dpdk=1]
> dpdk_mtu=9000
> dpdk_driver=/usr/local/lib/dpdk_select
> dpdk_corelist=10,11,12,13
> dpdk_num_mbufs=4096
> dpdk_mbuf_cache_size=512
>
> [dpdk_mac=***first mac addr***]
> dpdk_lcore = 10
> dpdk_ipv4 = 192.168.10.2/24
> dpdk_num_desc=4096
>
> [dpdk_mac=***second mac addr***]
> dpdk_lcore = 11
> dpdk_ipv4 = 192.168.20.2/24
> dpdk_num_desc=4096
>
>
> Does anyone have a hint on what could be going wrong?
>
> Thanks,
> Lorenzo
>
> ------------------------------
> *From:* USRP-users <[email protected]> on behalf of
> Patrick Kane via USRP-users <[email protected]>
> *Sent:* Wednesday, February 3, 2021 2:28 PM
> *To:* Rob Kossler <[email protected]>
> *Cc:* usrp-users <[email protected]>
> *Subject:* Re: [USRP-users] DPDK troubles (invalid ELF header loading
> dpdk library)
>
> Hi Rob,
>
> Thanks for documenting your steps. I can confirm most if not all of your
> problems on CentOS 7, USRP N321, Intel XL710. @Ettus can we get some
> attention for this issue? DPDK is marketed as a huge improvement for max
> bandwidth applications, and I have failed to see any real testing or use
> cases of it working more than once in a row. It is certainly a barrier for
> my applications, forcing me to reduce the sample rate and simplify the use
> cases.
>
> -Pat
>
>
> On Wed, Feb 3, 2021 at 4:53 PM Rob Kossler via USRP-users <[email protected]> wrote:
>
> I am now to the point where things are kind of working and I'm basically
> giving up trying to make them better. A few remarks for anyone who tries
> DPDK in the future (with N310, Ubuntu 20.04, Intel XL710 NIC, and UHD 4.0).
>
> 1) I can only get my application to run once and then I have to do some
> stuff (see NOTE 1 below) to run again.
> 2) I get occasional (but much too often) lock-ups of other applications
> running in Ubuntu. This was previously my experience using DPDK under UHD 3.15
> (DPDK 17.11) but I had hoped things were better now. They are not. See
> below for more details (NOTE 2 below) on this. Note that these lockups do
> not occur even occasionally when not running with DPDK.
> 3) The instructions in the UHD manual are not nearly good enough to get
> things running.
> 4) I first got things working as "root" (as recommended), but this caused
> some ancillary issues with my apps. Fortunately, I was able to get it to
> run as a lowly user (see NOTE 3 below).
> 5) I could not get things working even once until I followed Aaron's
> advice of putting just a few symlinks in a folder and pointing to that
> folder from .config/uhd.conf (dpdk_driver=<folder>). See NOTE 4 below.
>
> Read on for the details if interested.
> Rob
>
> NOTE 1: After I run and exit my app, I notice that the link LEDs on the
> SFP ports of the N310 are not both on as they should be and I am unable to
> run a second time. The following sequence fixes this (perhaps there is a
> better sequence but I haven't found it yet) such that I am able to re-run
> successfully.
> - sudo dpdk-devbind -b i40e 03:00.0 03:00.1 # bind normal driver
> - sudo dpdk-devbind -b vfio-pci 03:00.0 03:00.1 # re-bind vfio-pci driver
> - physically, unplug & plug QSFP+ transceiver on XL710 (sometimes have to
> do this 2 or 3 times before it "fixes" the link LEDs on N310 SFPs)
>
> NOTE 2: The fact that DPDK takes over the CPU cores (at least 1 if not 2
> of them) seems to cause issues with other apps. In the past I have even
> had issues with keyboard/mouse input that became intolerably slow. I didn't
> have keyboard/mouse issues this time, but I did have issues with a
> companion application that I run alongside my c++/UHD application. This
> companion application (actually Matlab based control/display GUI) would
> lock up such that I couldn't even close it down. But, once I stop my
> c++/UHD application, everything starts behaving normally. Note that I
> NEVER have this issue when running the same applications without DPDK. I
> tried the grub update "isolcpus=N,M" but not sure if this helped or not.
> I also tried changing my DPDK corelist from 0,1 to 6,7 because in the past I
> had convinced myself (perhaps wrongly) that things behaved better if not
> using CPU 0. I have no hard evidence to support this. In the end, things
> mostly work, but these lockups are reason enough to avoid DPDK.
>
> NOTE 3: I did the following to run as lowly user rather than root.
> 1) updated /etc/security/limits.conf to use the following. I really have
> no idea if these are reasonable values or not. The DPDK docs indicated that
> these are the relevant settings to adjust but gave no advice on what they
> should be set to.
>     <username> - memlock 2000000
>     <username> - nofile 2000
>     <username> - locks 2000
> 2) after binding the vfio-pci driver using dpdk-devbind, I ran the
> following. The first two are commands I determined after running the DPDK
> usertools/dpdk-setup.sh utility and then looking at the source to see the
> exact chmod settings used by this utility (BTW, this utility was helpful).
> The third was recommended in the DPDK documentation.
>     sudo chmod a+x /dev/vfio
>     sudo chmod 0666 /dev/vfio/*
>     sudo chmod a+w /dev/hugepages/
>
> NOTE 4: The following are the few symlinks I put in a folder I created
> "/usr/local/lib/dpdk-pmds/". After pointing the dpdk_driver=<folder>
> setting in uhd.conf to this, I was able to run successfully.
> librte_mempool_ring.so, librte_pmd_i40e.so, librte_pmd_ixgbe.so, and
> librte_pmd_ring.so.
>
> On Wed, Feb 3, 2021 at 10:44 AM Rob Kossler <[email protected]> wrote:
>
> Hi Aaron,
> Two things:
> 1) I am getting an error message at the conclusion of a successful run
> (see below). Not sure if this is something I should be looking at or if it
> is harmless.
> 2) I figured out a sequence of steps that can "fix" my broken state
> following a successful run.
> If I do the following, the links are fixed:
>     a) dpdk-devbind -b i40e 03:00.0 03:00.1 // bind to the normal driver
>     b) dpdk-devbind -b vfio-pci 03:00.0 03:00.1 // bind back to the vfio-pci driver
>     c) physically unplug & plug the XL710 QSFP+ transceiver (mine is
>        optical, but unplugging just the MTP does not do the trick - I need to
>        unplug the full transceiver)
>
> Once I complete the sequence above, the link LEDs are back to normal and I
> can complete another run of benchmark_rate. This is obviously a bad
> solution so if you have any ideas, please let me know.
> Rob
>
> [00:00:05.113788990] Testing receive rate 125.000000 Msps on 4 channels
> [00:00:05.120454627] Testing transmit rate 125.000000 Msps on 4 channels
> [00:00:15.373972384] Benchmark complete.
>
> Benchmark rate summary:
>   Num received samples: 5099558824
>   Num dropped samples: 0
>   Num overruns detected: 0
>   Num transmitted samples: 4999335588
>   Num sequence errors (Tx): 0
>   Num sequence errors (Rx): 0
>   Num underruns detected: 0
>   Num late commands: 0
>   Num timeouts (Tx): 0
>   Num timeouts (Rx): 0
>
>
> Done!
>
> i40e_phy_conf_link(): Failed to get PHY capabilities: -7
>
>
> On Wed, Feb 3, 2021 at 10:16 AM Rob Kossler <[email protected]> wrote:
>
> Hi Aaron,
> Unfortunately, I already tried increasing the link timeout, up to
> 10 seconds. No luck. But, I am presently troubleshooting
> the issue and trying to switch back and forth between DPDK and normal
> networking. I am finding that normal networking is not working after 1 run
> of DPDK. And, I'm noticing that link LEDs are messed up and normal pings
> are not working. I am playing around with disconnecting / reconnecting
> links in order to get the link LEDs back to normal. My guess is that
> things are not cleaning up as they should.
> Rob
>
> On Wed, Feb 3, 2021 at 9:51 AM Aaron Rossetto via USRP-users <[email protected]> wrote:
>
> I notice in the second and subsequent runs, you get this message from UHD:
>
> [ERROR] [DPDK] All DPDK links did not report as up!
>
> One of the other issues I've noticed with DPDK (and unfortunately
> don't have an answer for) is that link detection seems to have issues.
> I'm not sure if this is an XL710-specific problem or whether it's more
> widespread, but I added some code to try to mitigate things somewhat
> in commit eada49e4d. This commit checks the link status at
> 250-millisecond intervals for up to the link status timeout (default 1
> second) in case the links take a while to register as up. One thing
> you could try is overriding the default link status timeout and
> increasing the value, which you can do by adding a dpdk_link_timeout=X
> line to the [use_dpdk=1] section of your uhd.conf file, where X is the
> new timeout in number of milliseconds.
>
> Best regards,
> Aaron
>
> On Tue, Feb 2, 2021 at 1:47 PM Rob Kossler <[email protected]> wrote:
> >
> > Hi Aaron,
> > This did indeed help. Now I am able to run ONCE successfully. After
> > that I get an error. Same behavior on both systems. Not yet sure how to
> > clear the error. I played with dpdk_link_timeout and even tried resetting
> > the N310 using "overlay rm n310 && overlay add n310 && systemctl restart
> > usrp-hwd". But no luck.
> > Rob
> >
> > // First run succeeds
> > root@irisheyes5-hp-z240-sff:~# uhd_image_loader --args="addr=192.168.1.88,type=n3xx,fpga=XG"
> > [INFO] [UHD] linux; GNU C++ version 9.3.0; Boost_107100; UHD_4.0.0.0-50-ge520e3ff
> > [INFO] [MPMD] Initializing 1 device(s) in parallel with args: mgmt_addr=192.168.1.88,type=n3xx,product=n310,serial=3144673,claimed=False,skip_init=1
> > [WARNING] [MPM.RPCServer] A timeout event occured!
> > [INFO] [MPMD] Claimed device without full initialization.
> > [INFO] [MPMD IMAGE LOADER] Starting update. This may take a while.
> > [INFO] [MPM.PeriphManager] Updating component `fpga'
> > [INFO] [MPM.PeriphManager] Updating component `dts'
> > [INFO] [MPM.RPCServer] Resetting peripheral manager.
> > [INFO] [MPM.PeriphManager] Device serial number: 3144673
> > [INFO] [MPM.PeriphManager] Initialized 2 daughterboard(s).
> > [INFO] [MPM.PeriphManager] init() called with device args `clock_source=internal,time_source=internal'.
> > [INFO] [MPMD IMAGE LOADER] Update component function succeeded.
> > root@irisheyes5-hp-z240-sff:~# benchmark_rate --tx_rate=62.5e6 --rx_rate=62.5e6 --channels="0,1,2,3" --args="use_dpdk=1,mgmt_addr=192.168.1.88,addr=192.168.60.2"
> >
> > [INFO] [UHD] linux; GNU C++ version 9.3.0; Boost_107100; UHD_4.0.0.0-50-ge520e3ff
> > EAL: Detected 8 lcore(s)
> > EAL: Detected 1 NUMA nodes
> > EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
> > EAL: No free hugepages reported in hugepages-1048576kB
> > EAL: Probing VFIO support...
> > EAL: VFIO support initialized
> > EAL: PCI device 0000:03:00.0 on NUMA socket -1
> > EAL: Invalid NUMA socket, default to 0
> > EAL: probe driver: 8086:1584 net_i40e
> > EAL: using IOMMU type 1 (Type 1)
> > EAL: PCI device 0000:03:00.1 on NUMA socket -1
> > EAL: Invalid NUMA socket, default to 0
> > EAL: probe driver: 8086:1584 net_i40e
> > EAL: PCI device 0000:03:00.2 on NUMA socket -1
> > EAL: Invalid NUMA socket, default to 0
> > EAL: probe driver: 8086:1584 net_i40e
> > EAL: PCI device 0000:03:00.3 on NUMA socket -1
> > EAL: Invalid NUMA socket, default to 0
> > EAL: probe driver: 8086:1584 net_i40e
> > [00:00:00.000152] Creating the usrp device with: use_dpdk=1,mgmt_addr=192.168.1.88,addr=192.168.60.2...
> > [INFO] [MPMD] Initializing 1 device(s) in parallel with args: mgmt_addr=192.168.1.88,type=n3xx,product=n310,serial=3144673,claimed=False,use_dpdk=1,addr=192.168.60.2
> > [INFO] [MPM.PeriphManager] init() called with device args `mgmt_addr=192.168.1.88,product=n310,use_dpdk=1,clock_source=internal,time_source=internal'.
> > Using Device: Single USRP:
> >   Device: N300-Series Device
> >   Mboard 0: n310
> >   RX Channel: 0
> >     RX DSP: 0
> >     RX Dboard: A
> >     RX Subdev: Magnesium
> >   RX Channel: 1
> >     RX DSP: 1
> >     RX Dboard: A
> >     RX Subdev: Magnesium
> >   RX Channel: 2
> >     RX DSP: 2
> >     RX Dboard: B
> >     RX Subdev: Magnesium
> >   RX Channel: 3
> >     RX DSP: 3
> >     RX Dboard: B
> >     RX Subdev: Magnesium
> >   TX Channel: 0
> >     TX DSP: 0
> >     TX Dboard: A
> >     TX Subdev: Magnesium
> >   TX Channel: 1
> >     TX DSP: 1
> >     TX Dboard: A
> >     TX Subdev: Magnesium
> >   TX Channel: 2
> >     TX DSP: 2
> >     TX Dboard: B
> >     TX Subdev: Magnesium
> >   TX Channel: 3
> >     TX DSP: 3
> >     TX Dboard: B
> >     TX Subdev: Magnesium
> >
> > [00:00:03.21715319] Setting device timestamp to 0...
> > [INFO] [MULTI_USRP] 1) catch time transition at pps edge
> > [INFO] [MULTI_USRP] 2) set times next pps (synchronously)
> > [WARNING] [0/Radio#0] Attempting to set tick rate to 0. Skipping.
> > [WARNING] [0/Radio#1] Attempting to set tick rate to 0. Skipping.
> > [WARNING] [0/Radio#1] Attempting to set tick rate to 0. Skipping.
> > [WARNING] [0/Radio#0] Attempting to set tick rate to 0. Skipping.
> > Setting TX spp to 1989
> > [00:00:04.907401082] Testing receive rate 62.500000 Msps on 4 channels
> > [00:00:04.914615576] Testing transmit rate 62.500000 Msps on 4 channels
> > [00:00:15.167869894] Benchmark complete.
> >
> >
> > Benchmark rate summary:
> >   Num received samples: 2549794336
> >   Num dropped samples: 0
> >   Num overruns detected: 0
> >   Num transmitted samples: 2499910452
> >   Num sequence errors (Tx): 0
> >   Num sequence errors (Rx): 0
> >   Num underruns detected: 0
> >   Num late commands: 0
> >   Num timeouts (Tx): 0
> >   Num timeouts (Rx): 0
> >
> >
> > Done!
> >
> > // Second run fails
> > root@irisheyes5-hp-z240-sff:~# benchmark_rate --tx_rate=62.5e6 --rx_rate=62.5e6 --channels="0,1,2,3" --args="use_dpdk=1,mgmt_addr=192.168.1.88,addr=192.168.60.2"
> >
> > [INFO] [UHD] linux; GNU C++ version 9.3.0; Boost_107100; UHD_4.0.0.0-50-ge520e3ff
> > EAL: Detected 8 lcore(s)
> > EAL: Detected 1 NUMA nodes
> > EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
> > EAL: No free hugepages reported in hugepages-1048576kB
> > EAL: Probing VFIO support...
> > EAL: VFIO support initialized
> > EAL: PCI device 0000:03:00.0 on NUMA socket -1
> > EAL: Invalid NUMA socket, default to 0
> > EAL: probe driver: 8086:1584 net_i40e
> > EAL: using IOMMU type 1 (Type 1)
> > EAL: PCI device 0000:03:00.1 on NUMA socket -1
> > EAL: Invalid NUMA socket, default to 0
> > EAL: probe driver: 8086:1584 net_i40e
> > EAL: PCI device 0000:03:00.2 on NUMA socket -1
> > EAL: Invalid NUMA socket, default to 0
> > EAL: probe driver: 8086:1584 net_i40e
> > EAL: PCI device 0000:03:00.3 on NUMA socket -1
> > EAL: Invalid NUMA socket, default to 0
> > EAL: probe driver: 8086:1584 net_i40e
> > [ERROR] [DPDK] All DPDK links did not report as up!
> > EAL: FATAL: already called initialization.
> > EAL: already called initialization.
> > [ERROR] [UHD] Device discovery error: RuntimeError: DPDK: All DPDK links did not report as up!
> > [ERROR] [DPDK] Error with EAL initialization
> > [ERROR] [X300] X300 Network discovery error RuntimeError: Error with EAL initialization
> > [00:00:00.000122] Creating the usrp device with: use_dpdk=1,mgmt_addr=192.168.1.88,addr=192.168.60.2...
> > EAL: FATAL: already called initialization.
> > EAL: already called initialization.
> > [ERROR] [DPDK] Error with EAL initialization
> > [ERROR] [UHD] Device discovery error: RuntimeError: Error with EAL initialization
> > EAL: FATAL: already called initialization.
> > EAL: already called initialization.
> > [ERROR] [DPDK] Error with EAL initialization
> > [ERROR] [X300] X300 Network discovery error RuntimeError: Error with EAL initialization
> > Error: LookupError: KeyError: No devices found for ----->
> > Device Address:
> >     use_dpdk: 1
> >     mgmt_addr: 192.168.1.88
> >     addr: 192.168.60.2
> >
> > // Third run fails
> > root@irisheyes5-hp-z240-sff:~# benchmark_rate --tx_rate=62.5e6 --rx_rate=62.5e6 --channels="0,1,2,3" --args="use_dpdk=1,mgmt_addr=192.168.1.88,addr=192.168.60.2"
> >
> > [INFO] [UHD] linux; GNU C++ version 9.3.0; Boost_107100; UHD_4.0.0.0-50-ge520e3ff
> > EAL: Detected 8 lcore(s)
> > EAL: Detected 1 NUMA nodes
> > EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
> > EAL: No free hugepages reported in hugepages-1048576kB
> > EAL: Probing VFIO support...
> > EAL: VFIO support initialized
> > EAL: PCI device 0000:03:00.0 on NUMA socket -1
> > EAL: Invalid NUMA socket, default to 0
> > EAL: probe driver: 8086:1584 net_i40e
> > EAL: using IOMMU type 1 (Type 1)
> > EAL: PCI device 0000:03:00.1 on NUMA socket -1
> > EAL: Invalid NUMA socket, default to 0
> > EAL: probe driver: 8086:1584 net_i40e
> > EAL: PCI device 0000:03:00.2 on NUMA socket -1
> > EAL: Invalid NUMA socket, default to 0
> > EAL: probe driver: 8086:1584 net_i40e
> > EAL: PCI device 0000:03:00.3 on NUMA socket -1
> > EAL: Invalid NUMA socket, default to 0
> > EAL: probe driver: 8086:1584 net_i40e
> > [ERROR] [DPDK] All DPDK links did not report as up!
> > EAL: FATAL: already called initialization.
> > EAL: already called initialization.
> > [ERROR] [UHD] Device discovery error: RuntimeError: DPDK: All DPDK links did not report as up!
> > [ERROR] [DPDK] Error with EAL initialization
> > [ERROR] [X300] X300 Network discovery error RuntimeError: Error with EAL initialization
> > [00:00:00.000148] Creating the usrp device with: use_dpdk=1,mgmt_addr=192.168.1.88,addr=192.168.60.2...
> > EAL: FATAL: already called initialization.
> > EAL: already called initialization.
> > [ERROR] [DPDK] Error with EAL initialization
> > [ERROR] [UHD] Device discovery error: RuntimeError: Error with EAL initialization
> > EAL: FATAL: already called initialization.
> > EAL: already called initialization.
> > [ERROR] [DPDK] Error with EAL initialization
> > [ERROR] [X300] X300 Network discovery error RuntimeError: Error with EAL initialization
> > Error: LookupError: KeyError: No devices found for ----->
> > Device Address:
> >     use_dpdk: 1
> >     mgmt_addr: 192.168.1.88
> >     addr: 192.168.60.2
> >
> >
> >
> > On Tue, Feb 2, 2021 at 11:53 AM Aaron Rossetto via USRP-users <[email protected]> wrote:
> >>
> >> On Mon, Feb 1, 2021 at 9:02 PM Rob Kossler via USRP-users
> >> <[email protected]> wrote:
> >>
> >> > Has anyone successfully used DPDK with Ubuntu 20.04, UHD 4.0, Intel XL710 NIC, and N310 (or X310)?
> >>
> >> If I remember correctly, I believe DPDK tries to dlopen() *everything*
> >> in the directory specified by the dpdk_driver parameter in the DPDK
> >> section of uhd.conf, leading to a lot of errors similar to yours
> >> ('Invalid ELF header' and the like). Having the correct collection of
> >> .so files in that directory is key.
> >>
> >> What's worked for me in the past when using DPDK with an Intel XL710
> >> is creating a directory (I used /usr/local/lib/dpdk-pmds) and copying
> >> a specific set of DPDK .so files into this directory:
> >> * librte_mempool_ring.so
> >> * librte_pdump.so (I think this one is optional--I had been trying to
> >> get packet dumps from DPDK a while back)
> >> * librte_pmd_i40e.so
> >> * librte_pmd_ixgbe.so (may be optional?)
> >> * librte_pmd_pcap.so (this one is also optional, I think)
> >> * librte_pmd_ring.so
> >>
> >> (Symlinking to the actual libraries wherever they get installed
> >> instead of copying them into the directory would probably work as
> >> well.)
> >>
> >> Then, make sure that the dpdk_driver key in the [use_dpdk=1] section
> >> of uhd.conf points to that directory:
> >> dpdk_driver = /usr/local/lib/dpdk-pmds
> >>
> >> Hopefully that will resolve the issue and get you a little further
> >> down the road.
> >>
> >> Best regards,
> >> Aaron
> >>
> >> _______________________________________________
> >> USRP-users mailing list
> >> [email protected]
> >> http://lists.ettus.com/mailman/listinfo/usrp-users_lists.ettus.com
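
For anyone trying to picture the link-status check Aaron describes in the quoted thread (poll every 250 milliseconds for up to the link status timeout, default 1 second), here is a minimal Python sketch of that polling pattern. It is not UHD's actual code: `wait_for_links` and the `links_up` callable are hypothetical names, and only the 250 ms interval and 1 s default come from Aaron's description.

```python
import time


def wait_for_links(links_up, timeout_s=1.0, interval_s=0.25, sleep=time.sleep):
    """Poll `links_up()` until it returns True or `timeout_s` has elapsed.

    Hypothetical illustration of interval polling with a timeout.
    `links_up` is any zero-argument callable that reports whether all
    links are up. `sleep` is injectable so the loop can be tested
    without real delays.
    """
    waited = 0.0
    while True:
        if links_up():
            return True
        if waited >= timeout_s:
            return False
        sleep(interval_s)
        waited += interval_s
```

Raising dpdk_link_timeout in uhd.conf corresponds to passing a larger `timeout_s` here: more polls are made before giving up, but nothing changes if the links never come up at all.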
_______________________________________________
USRP-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]
