[Kernel-packages] [Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data
I am having the same issue. Unfortunately the systems are already in production, so testing is limited, and the issue was not present at first (or we somehow did not notice it). I can confirm that I am seeing approx. 8% loss and sometimes huge latency (> 1 s). I tested both Ubuntu 18.04 and Ubuntu 20.04 with the standard kernel and Libvirt/OpenStack VMs. No difference.

Bond with LACP, bond with the first NIC as primary, and bond with the second NIC as primary: absolutely no difference. I changed almost all NIC parameters (offloading, ring size, queues, etc.), still the same result.

My last bet was to remove the bond completely and use only the first NIC interface. That worked, and now I have stable latency and no loss.

I don't know what exactly causes this. We already replaced the NIC with a Mellanox one and still had the same issue, so I am not sure whether this happens when the BCM NIC is present but NOT USED (it is installed on the mainboard). I can't do further testing now. Removing the bond was my solution, as @niveditasinghvi's solution did not work for me.

--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1853638

Title:
  BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be
  dropping data

Status in linux package in Ubuntu:
  Confirmed
Status in network-manager package in Ubuntu:
  Confirmed

Bug description:
  The BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device appears
  to be dropping data.

  Basically, we are dropping data, as you can see from the benchmark
  tool as follows:

  tcdforge@x310a:/usr/local/lib/lib/uhd/examples$ ./benchmark_rate --rx_rate 10e6 --tx_rate 10e6 --duration 300
  [INFO] [UHD] linux; GNU C++ version 5.4.0 20160609; Boost_105800; UHD_3.14.1.1-0-g98c7c986
  [WARNING] [UHD] Unable to set the thread priority. Performance may be negatively affected.
  Please see the general application notes in the manual for instructions.
  EnvironmentError: OSError: error in pthread_setschedparam
  [00:00:00.07] Creating the usrp device with: ...
  [INFO] [X300] X300 initialization sequence...
  [INFO] [X300] Maximum frame size: 1472 bytes.
  [INFO] [X300] Radio 1x clock: 200 MHz
  [INFO] [GPS] Found an internal GPSDO: LC_XO, Firmware Rev 0.929a
  [INFO] [0/DmaFIFO_0] Initializing block control (NOC ID: 0xF1F0D000)
  [INFO] [0/DmaFIFO_0] BIST passed (Throughput: 1308 MB/s)
  [INFO] [0/DmaFIFO_0] BIST passed (Throughput: 1316 MB/s)
  [INFO] [0/Radio_0] Initializing block control (NOC ID: 0x12AD1001)
  [INFO] [0/Radio_1] Initializing block control (NOC ID: 0x12AD1001)
  [INFO] [0/DDC_0] Initializing block control (NOC ID: 0xDDC0)
  [INFO] [0/DDC_1] Initializing block control (NOC ID: 0xDDC0)
  [INFO] [0/DUC_0] Initializing block control (NOC ID: 0xD0C0)
  [INFO] [0/DUC_1] Initializing block control (NOC ID: 0xD0C0)
  Using Device: Single USRP:
    Device: X-Series Device
    Mboard 0: X310
    RX Channel: 0
      RX DSP: 0
      RX Dboard: A
      RX Subdev: SBX-120 RX
    RX Channel: 1
      RX DSP: 0
      RX Dboard: B
      RX Subdev: SBX-120 RX
    TX Channel: 0
      TX DSP: 0
      TX Dboard: A
      TX Subdev: SBX-120 TX
    TX Channel: 1
      TX DSP: 0
      TX Dboard: B
      TX Subdev: SBX-120 TX

  [00:00:04.305374] Setting device timestamp to 0...
  [WARNING] [UHD] Unable to set the thread priority. Performance may be negatively affected.
  Please see the general application notes in the manual for instructions.
  EnvironmentError: OSError: error in pthread_setschedparam
  [00:00:04.310990] Testing receive rate 10.00 Msps on 1 channels
  [WARNING] [UHD] Unable to set the thread priority. Performance may be negatively affected.
  Please see the general application notes in the manual for instructions.
  EnvironmentError: OSError: error in pthread_setschedparam
  [00:00:04.318356] Testing transmit rate 10.00 Msps on 1 channels
  [00:00:06.693119] Detected Rx sequence error.
  D[00:00:09.402843] Detected Rx sequence error.
  DD[00:00:40.927978] Detected Rx sequence error.
  D[00:01:44.982243] Detected Rx sequence error.
  D[00:02:11.400692] Detected Rx sequence error.
  D[00:02:14.805292] Detected Rx sequence error.
  D[00:02:41.875596] Detected Rx sequence error.
  D[00:03:06.927743] Detected Rx sequence error.
  D[00:03:47.967891] Detected Rx sequence error.
  D[00:03:58.233659] Detected Rx sequence error.
  D[00:03:58.876588] Detected Rx sequence error.
  D[00:04:03.139770] Detected Rx sequence error.
  D[00:04:45.287465] Detected Rx sequence error.
  D[00:04:56.425845] Detected Rx sequence error.
  D[00:04:57.929209] Detected Rx sequence error.
  [00:05:04.529548] Benchmark complete.

  Benchmark rate summary:
    Num received samples: 2995435936
    Num dropped samples: 4622800
    Num overruns detected: 0
    Num transm
[Kernel-packages] [Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data
The issue we have reported is easily avoided by specifying the primary port to be the active interface of the bond.

On netplan-using systems: Add the directive "primary: $interface" (e.g. "primary: p94s0f0") to the "parameters:" section of the netplan config file.
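To confirm that the workaround took effect, one can check which slave the bond reports as primary and as currently active. The following is a minimal sketch, assuming the usual "Primary Slave:" and "Currently Active Slave:" lines of /proc/net/bonding/bond0; the function names are made up for illustration, and the sample text is only a fragment in that format.

```python
import re

def parse_bond_status(text):
    """Return (primary, active) slave names from bonding status text.

    A value of "None" in the file is mapped to Python None.
    """
    def field(label):
        m = re.search(rf"^{re.escape(label)}:\s*(\S+)", text, re.MULTILINE)
        if not m or m.group(1) == "None":
            return None
        return m.group(1)

    return field("Primary Slave"), field("Currently Active Slave")

def active_is_expected(text, expected):
    """True when the currently active slave is the interface we expect."""
    _, active = parse_bond_status(text)
    return active == expected

# Illustrative fragment in the /proc/net/bonding/<bond> format.
SAMPLE = """\
Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: enp1s0f1np1
MII Status: up
"""

print(parse_bond_status(SAMPLE))
print(active_is_expected(SAMPLE, "enp1s0f0np0"))
```

On a live system the text would come from reading /proc/net/bonding/bond0 (the bond name is an assumption) instead of SAMPLE.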
[Kernel-packages] [Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data
Hello, diarmuid,

Re: original issue report, were you able to resolve your issue? Please let us know.

--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1853638

Title:
  BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be
  dropping data

Status in linux package in Ubuntu:
  Confirmed
Status in network-manager package in Ubuntu:
  Confirmed

Bug description:
  The BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device appears
  to be dropping data.

  Basically, we are dropping data, as you can see from the benchmark
  tool as follows:

  tcdforge@x310a:/usr/local/lib/lib/uhd/examples$ ./benchmark_rate --rx_rate 10e6 --tx_rate 10e6 --duration 300
  [INFO] [UHD] linux; GNU C++ version 5.4.0 20160609; Boost_105800; UHD_3.14.1.1-0-g98c7c986
  [WARNING] [UHD] Unable to set the thread priority. Performance may be negatively affected.
  Please see the general application notes in the manual for instructions.
  EnvironmentError: OSError: error in pthread_setschedparam
  [00:00:00.07] Creating the usrp device with: ...
  [INFO] [X300] X300 initialization sequence...
  [INFO] [X300] Maximum frame size: 1472 bytes.
  [INFO] [X300] Radio 1x clock: 200 MHz
  [INFO] [GPS] Found an internal GPSDO: LC_XO, Firmware Rev 0.929a
  [INFO] [0/DmaFIFO_0] Initializing block control (NOC ID: 0xF1F0D000)
  [INFO] [0/DmaFIFO_0] BIST passed (Throughput: 1308 MB/s)
  [INFO] [0/DmaFIFO_0] BIST passed (Throughput: 1316 MB/s)
  [INFO] [0/Radio_0] Initializing block control (NOC ID: 0x12AD1001)
  [INFO] [0/Radio_1] Initializing block control (NOC ID: 0x12AD1001)
  [INFO] [0/DDC_0] Initializing block control (NOC ID: 0xDDC0)
  [INFO] [0/DDC_1] Initializing block control (NOC ID: 0xDDC0)
  [INFO] [0/DUC_0] Initializing block control (NOC ID: 0xD0C0)
  [INFO] [0/DUC_1] Initializing block control (NOC ID: 0xD0C0)
  Using Device: Single USRP:
    Device: X-Series Device
    Mboard 0: X310
    RX Channel: 0
      RX DSP: 0
      RX Dboard: A
      RX Subdev: SBX-120 RX
    RX Channel: 1
      RX DSP: 0
      RX Dboard: B
      RX Subdev: SBX-120 RX
    TX Channel: 0
      TX DSP: 0
      TX Dboard: A
      TX Subdev: SBX-120 TX
    TX Channel: 1
      TX DSP: 0
      TX Dboard: B
      TX Subdev: SBX-120 TX

  [00:00:04.305374] Setting device timestamp to 0...
  [WARNING] [UHD] Unable to set the thread priority. Performance may be negatively affected.
  Please see the general application notes in the manual for instructions.
  EnvironmentError: OSError: error in pthread_setschedparam
  [00:00:04.310990] Testing receive rate 10.00 Msps on 1 channels
  [WARNING] [UHD] Unable to set the thread priority. Performance may be negatively affected.
  Please see the general application notes in the manual for instructions.
  EnvironmentError: OSError: error in pthread_setschedparam
  [00:00:04.318356] Testing transmit rate 10.00 Msps on 1 channels
  [00:00:06.693119] Detected Rx sequence error.
  D[00:00:09.402843] Detected Rx sequence error.
  DD[00:00:40.927978] Detected Rx sequence error.
  D[00:01:44.982243] Detected Rx sequence error.
  D[00:02:11.400692] Detected Rx sequence error.
  D[00:02:14.805292] Detected Rx sequence error.
  D[00:02:41.875596] Detected Rx sequence error.
  D[00:03:06.927743] Detected Rx sequence error.
  D[00:03:47.967891] Detected Rx sequence error.
  D[00:03:58.233659] Detected Rx sequence error.
  D[00:03:58.876588] Detected Rx sequence error.
  D[00:04:03.139770] Detected Rx sequence error.
  D[00:04:45.287465] Detected Rx sequence error.
  D[00:04:56.425845] Detected Rx sequence error.
  D[00:04:57.929209] Detected Rx sequence error.
  [00:05:04.529548] Benchmark complete.

  Benchmark rate summary:
    Num received samples: 2995435936
    Num dropped samples: 4622800
    Num overruns detected: 0
    Num transmitted samples: 3008276544
    Num sequence errors (Tx): 0
    Num sequence errors (Rx): 15
    Num underruns detected: 0
    Num late commands: 0
    Num timeouts (Tx): 0
    Num timeouts (Rx): 0

  Done!

  tcdforge@x310a:/usr/local/lib/lib/uhd/examples$

  In this particular case description, the nodes are USRP x310s.
  However, we have the same issue with N210 nodes dropping samples
  when connected to the BCM57416 NetXtreme-E Dual-Media 10G RDMA
  Ethernet device.

  There is no problem with the USRPs themselves, as we have tested
  them with normal 1G network cards and have no dropped samples.

  Personally I think it's something to do with the 10G network card,
  possibly an Ubuntu driver issue???

  Note, Dell have said there is no hardware problem with the 10G
  interfaces.

  I have followed the troubleshooting information on this link to try
  to determine the problem:
  https://files.ettus.com/manual/page_usrp_x3x0_config.html

  - There is no firew
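For a sense of scale, the benchmark summary above can be turned into a drop fraction. This is a rough sketch, assuming that dropped samples are not already included in the received count (the log does not say either way); the function names are illustrative only.

```python
import re

# Counters copied from the benchmark rate summary quoted in the bug description.
SUMMARY = """\
Num received samples: 2995435936
Num dropped samples: 4622800
Num overruns detected: 0
Num transmitted samples: 3008276544
Num sequence errors (Tx): 0
Num sequence errors (Rx): 15
"""

def parse_summary(text):
    """Map each 'Num ...' label in the summary to its integer value."""
    stats = {}
    for line in text.splitlines():
        m = re.match(r"\s*(Num [^:]+):\s*(\d+)\s*$", line)
        if m:
            stats[m.group(1)] = int(m.group(2))
    return stats

def rx_drop_fraction(stats):
    """Dropped / (received + dropped); assumes the two counts are disjoint."""
    received = stats["Num received samples"]
    dropped = stats["Num dropped samples"]
    total = received + dropped
    return dropped / total if total else 0.0

stats = parse_summary(SUMMARY)
print(f"Rx drop fraction: {rx_drop_fraction(stats):.4%}")
```

Under that assumption the receive-side loss in this run is on the order of 0.15% of samples, spread over 15 Rx sequence errors.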
[Kernel-packages] [Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data
We are closing this LP bug for now, as we aren't able to reproduce it in-house and cannot get access to a live testing repro environment at this time.

Here is what we know:

- There seems to be different performance for some tests when the NIC is configured in active-backup bonding mode, depending on whether the active interface is the primary port or the secondary port, i.e.:

    Primary port:   enp94s0f0    // when this is the active, works fine
    Secondary port: enp94s0f1d1  // when this is the active, more drops

- Switch info: 2 x Fortigate 1024D switches; each machine is connected to both.

- NIC info:

    root@u072:~# lspci | grep BCM57416
    01:00.0 Ethernet controller: Broadcom Inc. and subsidiaries BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet Controller (rev 01)

    # ethtool -i enp1s0f0np0
    driver: bnxt_en
    version: 1.10.0
    firmware-version: 214.0.253.1/pkg 21.40.25.31

- Our attempt at a reproducer (initially reported in the production env via graphical monitoring):

    mtr --no-dns --report --report-cycles 60 --udp -s 1428 $DEST

    good system = ~ 0% drops
    bad systems = ~ 8% drops

We are not seeing drops in the NIC stats, nor UDP kernel drops, so it's not clear where the packets are being dropped: whether they are being dropped silently somewhere (?), or whether that's a red herring and an mtr test issue, and what's seen in production is something else.

If someone can reproduce this, or something similar, or if we manage to, we will re-open this bug or file a new one.
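The reproducer above can be scripted so that "good" and "bad" systems are compared the same way each time. The sketch below builds the mtr argument list quoted in this message and pulls the Loss% column out of a report. It assumes the usual `mtr --report` row layout (`N.|-- host loss% ...`); the sample report, its addresses, and the 1% threshold are invented for illustration and are not taken from the bug.

```python
import re

def build_mtr_command(dest):
    """Argument list for the UDP reproducer quoted in this message."""
    return ["mtr", "--no-dns", "--report", "--report-cycles", "60",
            "--udp", "-s", "1428", dest]

# Matches report rows such as "  1.|-- 10.0.0.2   8.3%   60 ...".
HOP_ROW = re.compile(r"^\s*\d+\.\|--\s+(\S+)\s+([\d.]+)%", re.MULTILINE)

def lossy_hops(report, threshold_pct=1.0):
    """Return (host, loss_pct) for every hop at or above the threshold."""
    return [(host, float(loss))
            for host, loss in HOP_ROW.findall(report)
            if float(loss) >= threshold_pct]

# Hypothetical report text, only to exercise the parser.
SAMPLE_REPORT = """\
HOST: u072                Loss%   Snt   Last   Avg  Best  Wrst StDev
  1.|-- 10.0.0.1           0.0%    60    0.2   0.2   0.1   0.4   0.0
  2.|-- 10.0.0.2           8.3%    60    0.3   0.3   0.2   0.9   0.1
"""

print(build_mtr_command("192.0.2.10"))
print(lossy_hops(SAMPLE_REPORT))
```

On a real host the report text would come from running the command (for example with `subprocess.run(..., capture_output=True, text=True)`), once per active interface.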
[Kernel-packages] [Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data
Regarding your question about LLDP and IPv6...

The default Ubuntu 18.04.3 configuration has an IPv6-enabled kernel, but the interface only has the default link-local address configured. I've seen it do router solicitation on link state changes and periodically thereafter. I think I recall seeing somewhere above that you were going to try without IPv6. Have you seen different behavior with IPv6 disabled?

I don't expect to see LLDP because I have the two NICs wired directly to each other, back to back. I'm not running any LLDP daemon on the host.
[Kernel-packages] [Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data
The tpa_aborts shouldn't be a concern. They merely indicate that a TCP flow could not be aggregated. That could have a performance impact, of course, but it should manifest as counted drops somewhere if that were the case. Importantly, the tpa_aborts only apply to TCP traffic, but you see the problem for ICMP and UDP too. Note that the tpa_aborts also appear when the primary is the active interface and things are working as expected.

A difference in the magnitude of tpa_aborts from one test run to another may be a clue about something else that's happening, though I'm not sure that we are comparing apples to apples with respect to the ethtool -S dumps posted thus far (when were they captured relative to the test runs, which interface was active at the time, etc.?).
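One way to make the ethtool -S dumps comparable is to capture one snapshot immediately before and one immediately after each test run, note which interface was active, and look only at the counters that moved. A minimal sketch, assuming the usual `name: value` line layout of `ethtool -S` output; the counter names and values in the samples are hypothetical, as is the keyword filter.

```python
def parse_ethtool_stats(text):
    """Parse `ethtool -S` style output into {counter_name: int}."""
    stats = {}
    for line in text.splitlines():
        # Split on the last colon so names that contain ':' still parse.
        name, sep, value = line.rpartition(":")
        value = value.strip()
        if sep and value.isdigit():
            stats[name.strip()] = int(value)
    return stats

def changed_counters(before, after,
                     keywords=("drop", "discard", "err", "abort")):
    """Return {name: delta} for matching counters that changed."""
    delta = {}
    for name, new in after.items():
        diff = new - before.get(name, 0)
        if diff and any(k in name.lower() for k in keywords):
            delta[name] = diff
    return delta

# Hypothetical snapshots taken right before and right after one test run.
BEFORE = """\
NIC statistics:
     rx_ucast_packets: 1000
     tpa_aborts: 5
     rx_total_discard_pkts: 0
"""
AFTER = """\
NIC statistics:
     rx_ucast_packets: 61000
     tpa_aborts: 12
     rx_total_discard_pkts: 0
"""

print(changed_counters(parse_ethtool_stats(BEFORE), parse_ethtool_stats(AFTER)))
```

Recording the active slave alongside each pair of snapshots would address the question of which interface the counters belong to.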
[Kernel-packages] [Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data
Edwin,

Do you happen to notice any IPv6 or LLDP or other link-local traffic on the interfaces (including the backup interface)?

The MTR loss % is purely a count of packets transmitted versus responses received, so for that UDP MTR test, this is saying that UDP packets were lost, somewhere.

The NIC does not show any drops in the ethtool -S stats, but I'm hunting down which are the right pair of before/afters. Other than the tpa_abort counts, there were no errors that I saw. I can't tell what the tpa_abort means for the frame: is it purely a failure to coalesce, or does it end up dropping packets at some point in that functionality? I'm assuming not, as whatever the reason, those would be counted as drops, I hope, and printed in the interface stats.

I'll attach all the stats here once I get them sorted out. I thought I had a clean diff of before and after from the tester, but after looking through it, I don't think the file I have is from before/after the mtr test, as there was negligible UDP traffic. I'll try to get clarification from the reporter.

Note that when the primary= option is used to configure which interface is primary, and the primary port is used as the active interface for the bond, no problems are seen (and that works deterministically to set the correct active interface).
[Kernel-packages] [Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data
I have tried, unsuccessfully, to reproduce this issue internally. Details of my setup below.

1) I have a pair of Dell R210 servers racked (u072 and u073 below), each with a BCM57416 installed:

root@u072:~# lspci | grep BCM57416
01:00.0 Ethernet controller: Broadcom Inc. and subsidiaries BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet Controller (rev 01)
01:00.1 Ethernet controller: Broadcom Inc. and subsidiaries BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet Controller (rev 01)

2) I've matched the firmware version to one that Nivedita reported in a bad system:

root@u072:~# ethtool -i enp1s0f0np0
driver: bnxt_en
version: 1.10.0
firmware-version: 214.0.253.1/pkg 21.40.25.31
expansion-rom-version:
bus-info: 0000:01:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: no
supports-priv-flags: no

3) Matched Ubuntu release and kernel version:

root@u072:~# lsb_release -dr
Description:    Ubuntu 18.04.3 LTS
Release:        18.04
root@u072:~# uname -a
Linux u072 5.0.0-37-generic #40~18.04.1-Ubuntu SMP Thu Nov 14 12:06:39 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

4) Configured the interface into an active-backup bond:

root@u072:~# cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: enp1s0f1np1
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

Slave Interface: enp1s0f1np1
MII Status: up
Speed: 10000 Mbps
Duplex: full
Link Failure Count: 1
Permanent HW addr: 00:0a:f7:a7:10:61
Slave queue ID: 0

Slave Interface: enp1s0f0np0
MII Status: up
Speed: 10000 Mbps
Duplex: full
Link Failure Count: 1
Permanent HW addr: 00:0a:f7:a7:10:60
Slave queue ID: 0

5) Run the provided mtr and netperf test cases with the 1st port selected as active:

root@u072:~# ip l set enp1s0f1np1 down
root@u072:~# ip l set enp1s0f1np1 up
root@u072:~# cat /proc/net/bonding/bond0 | grep Active
Currently Active Slave: enp1s0f0np0

a) initiated on u072:

root@u072:~# mtr --no-dns --report --report-cycles 60 192.168.1.2
Start: 2020-02-13T20:48:01+0000
HOST: u072             Loss%   Snt   Last   Avg  Best  Wrst StDev
  1.|-- 192.168.1.2     0.0%    60    0.2   0.2   0.2   0.2   0.0

root@u072:~# netperf -t TCP_RR -H 192.168.1.2 -- -r 1,1
MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.1.2 () port 0 AF_INET : demo : first burst 0
Local /Remote
Socket Size   Request  Resp.   Elapsed  Trans.
Send   Recv   Size     Size    Time     Rate
bytes  Bytes  bytes    bytes   secs.    per sec

16384  131072 1        1       10.00    29040.91
16384  87380

root@u072:~# netperf -t TCP_RR -H 192.168.1.2 -- -r 64,64
MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.1.2 () port 0 AF_INET : demo : first burst 0
Local /Remote
Socket Size   Request  Resp.   Elapsed  Trans.
Send   Recv   Size     Size    Time     Rate
bytes  Bytes  bytes    bytes   secs.    per sec

16384  131072 64       64      10.00    28633.36
16384  87380

root@u072:~# netperf -t TCP_RR -H 192.168.1.2 -- -r 128,8192
MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.1.2 () port 0 AF_INET : demo : first burst 0
Local /Remote
Socket Size   Request  Resp.   Elapsed  Trans.
Send   Recv   Size     Size    Time     Rate
bytes  Bytes  bytes    bytes   secs.    per sec

16384  131072 128      8192    10.00    17469.30
16384  87380

b) initiated on u073:

root@u073:~# mtr --no-dns --report --report-cycles 60 192.168.1.1
Start: 2020-02-13T20:53:37+0000
HOST: u073             Loss%   Snt   Last   Avg  Best  Wrst StDev
  1.|-- 192.168.1.1     0.0%    60    0.1   0.1   0.1   0.2   0.0

root@u073:~# netperf -t TCP_RR -H 192.168.1.1 -- -r 1,1
MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.1.1 () port 0 AF_INET : demo : first burst 0
Local /Remote
Socket Size   Request  Resp.   Elapsed  Trans.
Send   Recv   Size     Size    Time     Rate
bytes  Bytes  bytes    bytes   secs.    per sec

16384  87380  1        1       10.00    28514.93
16384  131072

root@u073:~# netperf -t TCP_RR -H 192.168.1.1 -- -r 64,64
MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.1.1 () port 0 AF_INET : demo : first burst 0
Local /Remote
Socket Size   Request  Resp.   Elapsed  Trans.
Send   Recv   Size     Size    Time     Rate
bytes  Bytes  bytes    bytes   secs.    per sec

16384  87380  64       64      10.00    27405.88
16384  131072

root@u073:~# netperf -t TCP_RR -H 192.168.1.1 -- -r 128,8192
MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.1.1 () port 0 AF_INET : demo : first burst 0
Local /Remote
Socket Size   Request  Resp.   Elapsed  Trans.
Send   Recv   Size     Size    Time     Rate
b
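A note on reading the TCP_RR figures above: with a single transaction outstanding at a time (netperf reports "first burst 0"), the reciprocal of the transaction rate approximates the mean request/response round-trip time. The following is a minimal Python sketch of that arithmetic, using the rates copied from the runs above. The helper name mean_rtt_us is hypothetical, and the result is only an average, so it would not reveal occasional latency spikes of the kind reported elsewhere in this bug.

```python
def mean_rtt_us(transactions_per_sec: float) -> float:
    """Approximate mean round-trip time in microseconds for a TCP_RR run
    that keeps a single transaction outstanding at a time."""
    if transactions_per_sec <= 0:
        raise ValueError("transaction rate must be positive")
    return 1_000_000.0 / transactions_per_sec


# Transaction rates (per second) copied from the netperf runs above.
RATES = {
    "u072 -r 1,1": 29040.91,
    "u072 -r 64,64": 28633.36,
    "u072 -r 128,8192": 17469.30,
    "u073 -r 1,1": 28514.93,
    "u073 -r 64,64": 27405.88,
}

if __name__ == "__main__":
    for label, rate in RATES.items():
        print(f"{label}: ~{mean_rtt_us(rate):.1f} us per transaction")
```

On these numbers every run averages well under 0.1 ms per transaction, which is consistent with the mtr report showing no loss on this setup.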
[Kernel-packages] [Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data
Additional observations.

MAAS is being used to deploy the system and configure the bond interface and settings. MAAS allows you to specify which interface is the primary, with the other being the backup, for the active-backup bonding mode. However, this does not appear to be working: MAAS is not passing along a primary primitive in the netplan YAML, for instance, or otherwise causing the setting to be honored (still need to confirm).

MAAS allows you to enter a MAC address for the bond interface; if none is supplied, it defaults to the MAC address of the "primary" interface as configured. MAAS then populates /etc/netplan/50-cloud-init.yaml, including a macaddress: line with that default, and netplan passes it along to systemd-networkd.

In the absence of a primary= directive, however, the bonding driver uses as the active interface whichever interface is attached to the bond first (i.e., whichever finishes attaching to the bond interface first). The bonding driver does use the supplied MAC address as an override.

So let's say the active interface was configured in MAAS to be f0, and its MAC is used as the MAC address of the bond, but f1 (the second port of the NIC) actually gets attached to the bond first and is used as the active interface. We then have a situation where f0 = backup, f1 = active, and bond0 is using the MAC of f0. While this should work, there is a potential for problems depending on the circumstances.

It's likely this has nothing to do with our current issue, but it is noted here for completeness. Will see if we can test/confirm.

-- 
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
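The situation described in the MAAS observations above (bond0 carrying the MAC of one port while the other port is the active slave) can be checked from the bonding status text. Below is a minimal Python sketch, not a tool used in this bug. The function names are hypothetical, the sample is trimmed from the /proc/net/bonding/bond0 output posted in this thread, and the bond MAC passed in is an assumption for illustration (the real one was scrubbed from the netplan file).

```python
def parse_bonding_status(text):
    """Return (active_slave, {slave_name: permanent_hw_addr}) parsed from
    the text of /proc/net/bonding/<bond>."""
    active = None
    slaves = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Currently Active Slave:"):
            active = line.split(":", 1)[1].strip()
        elif line.startswith("Slave Interface:"):
            current = line.split(":", 1)[1].strip()
        elif line.startswith("Permanent HW addr:") and current is not None:
            slaves[current] = line.split(":", 1)[1].strip().lower()
    return active, slaves


def bond_mac_matches_active(text, bond_mac):
    """True if the bond's MAC equals the active slave's permanent MAC."""
    active, slaves = parse_bonding_status(text)
    return slaves.get(active) == bond_mac.lower()


# Trimmed from the bonding status posted in this thread.
SAMPLE = """\
Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: enp94s0f1d1
Slave Interface: enp94s0f1d1
Permanent HW addr: 4c:d9:8f:48:08:da
Slave Interface: enp94s0f0
Permanent HW addr: 4c:d9:8f:48:08:d9
"""

if __name__ == "__main__":
    # ASSUMPTION: bond0 was given the MAC of the first port (f0).
    assumed_bond_mac = "4c:d9:8f:48:08:d9"
    active, _ = parse_bonding_status(SAMPLE)
    print("active slave:", active)
    print("bond MAC matches active slave:",
          bond_mac_matches_active(SAMPLE, assumed_bond_mac))
```

On a live system the text would come from reading /proc/net/bonding/bond0 and the bond MAC from /sys/class/net/bond0/address.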
[Kernel-packages] [Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data
Edwin, let me know if you can get in touch with me via the contact email on my Launchpad page. Thanks for all the help! -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1853638 Title: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data Status in linux package in Ubuntu: Confirmed Status in network-manager package in Ubuntu: Confirmed Bug description: The issue appears to be with the BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data Basically, we are dropping data, as you can see from the benchmark tool as follows: tcdforge@x310a:/usr/local/lib/lib/uhd/examples$ ./benchmark_rate --rx_rate 10e6 --tx_rate 10e6 --duration 300 [INFO] [UHD] linux; GNU C++ version 5.4.0 20160609; Boost_105800; UHD_3.14.1.1-0-g98c7c986 [WARNING] [UHD] Unable to set the thread priority. Performance may be negatively affected. Please see the general application notes in the manual for instructions. EnvironmentError: OSError: error in pthread_setschedparam [00:00:00.07] Creating the usrp device with: ... [INFO] [X300] X300 initialization sequence... [INFO] [X300] Maximum frame size: 1472 bytes. 
[INFO] [X300] Radio 1x clock: 200 MHz [INFO] [GPS] Found an internal GPSDO: LC_XO, Firmware Rev 0.929a [INFO] [0/DmaFIFO_0] Initializing block control (NOC ID: 0xF1F0D000) [INFO] [0/DmaFIFO_0] BIST passed (Throughput: 1308 MB/s) [INFO] [0/DmaFIFO_0] BIST passed (Throughput: 1316 MB/s) [INFO] [0/Radio_0] Initializing block control (NOC ID: 0x12AD1001) [INFO] [0/Radio_1] Initializing block control (NOC ID: 0x12AD1001) [INFO] [0/DDC_0] Initializing block control (NOC ID: 0xDDC0) [INFO] [0/DDC_1] Initializing block control (NOC ID: 0xDDC0) [INFO] [0/DUC_0] Initializing block control (NOC ID: 0xD0C0) [INFO] [0/DUC_1] Initializing block control (NOC ID: 0xD0C0) Using Device: Single USRP: Device: X-Series Device Mboard 0: X310 RX Channel: 0 RX DSP: 0 RX Dboard: A RX Subdev: SBX-120 RX RX Channel: 1 RX DSP: 0 RX Dboard: B RX Subdev: SBX-120 RX TX Channel: 0 TX DSP: 0 TX Dboard: A TX Subdev: SBX-120 TX TX Channel: 1 TX DSP: 0 TX Dboard: B TX Subdev: SBX-120 TX [00:00:04.305374] Setting device timestamp to 0... [WARNING] [UHD] Unable to set the thread priority. Performance may be negatively affected. Please see the general application notes in the manual for instructions. EnvironmentError: OSError: error in pthread_setschedparam [00:00:04.310990] Testing receive rate 10.00 Msps on 1 channels [WARNING] [UHD] Unable to set the thread priority. Performance may be negatively affected. Please see the general application notes in the manual for instructions. EnvironmentError: OSError: error in pthread_setschedparam [00:00:04.318356] Testing transmit rate 10.00 Msps on 1 channels [00:00:06.693119] Detected Rx sequence error. D[00:00:09.402843] Detected Rx sequence error. DD[00:00:40.927978] Detected Rx sequence error. D[00:01:44.982243] Detected Rx sequence error. D[00:02:11.400692] Detected Rx sequence error. D[00:02:14.805292] Detected Rx sequence error. D[00:02:41.875596] Detected Rx sequence error. D[00:03:06.927743] Detected Rx sequence error. 
D[00:03:47.967891] Detected Rx sequence error.
D[00:03:58.233659] Detected Rx sequence error.
D[00:03:58.876588] Detected Rx sequence error.
D[00:04:03.139770] Detected Rx sequence error.
D[00:04:45.287465] Detected Rx sequence error.
D[00:04:56.425845] Detected Rx sequence error.
D[00:04:57.929209] Detected Rx sequence error.
[00:05:04.529548] Benchmark complete.

Benchmark rate summary:
  Num received samples:     2995435936
  Num dropped samples:      4622800
  Num overruns detected:    0
  Num transmitted samples:  3008276544
  Num sequence errors (Tx): 0
  Num sequence errors (Rx): 15
  Num underruns detected:   0
  Num late commands:        0
  Num timeouts (Tx):        0
  Num timeouts (Rx):        0

Done!

tcdforge@x310a:/usr/local/lib/lib/uhd/examples$

In this particular case description, the nodes are USRP X310s. However, we see the same issue with N210 nodes: they drop samples when connected to the BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device.

There is no problem with the USRPs themselves, as we have tested them with normal 1G network cards and have no dropped samples.

Personally, I think it's something to do with the 10G network card, possibly the Ubuntu driver.

Note: Dell have said there is no hardware problem with the 10G interfaces.

I have followed the troubleshooting information at this link to try to determine the problem: https://files.ettus.com/manual/page_usrp_x3x0_config.html -
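To put the benchmark summary above in proportion, the drop rate can be worked out from the reported counters. This is a minimal Python sketch under two stated assumptions: that "dropped samples" are not already included in "received samples", and that the 10 Msps rate from the --rx_rate 10e6 argument applies for the whole run. The function names are hypothetical.

```python
import re

# Counters copied from the benchmark_rate summary above.
SUMMARY = """\
Num received samples: 2995435936
Num dropped samples: 4622800
Num overruns detected:0
Num transmitted samples: 3008276544
Num sequence errors (Tx): 0
Num sequence errors (Rx): 15
"""


def parse_summary(text):
    """Map each 'Num <name>: <value>' counter to an integer."""
    return {name.strip(): int(value)
            for name, value in re.findall(r"Num ([^:\n]+):\s*(\d+)", text)}


def drop_fraction(counters):
    """Dropped samples as a fraction of all samples the host should have seen.
    ASSUMPTION: dropped samples are not counted in 'received samples'."""
    received = counters["received samples"]
    dropped = counters["dropped samples"]
    return dropped / (received + dropped)


def mean_gap_seconds(counters, sample_rate_hz=10e6):
    """Average stream time lost per Rx sequence error at the given rate."""
    errors = counters["sequence errors (Rx)"]
    return counters["dropped samples"] / errors / sample_rate_hz


if __name__ == "__main__":
    c = parse_summary(SUMMARY)
    print(f"drop fraction: {drop_fraction(c):.4%}")
    print(f"mean gap per Rx sequence error: {mean_gap_seconds(c) * 1e3:.1f} ms")
```

Read this way, the loss is a small fraction of the stream overall, but each sequence error corresponds to a burst of lost samples rather than a single packet.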
[Kernel-packages] [Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data
** Attachment added: "ethtool -S for inactive interface enp94s0f0"
   https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1853638/+attachment/5327556/+files/ethtool-S-enp94s0f0
[Kernel-packages] [Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data
ethtool-enp94s0f0
--
Settings for enp94s0f0:
    Supported ports: [ FIBRE ]
    Supported link modes: 10000baseT/Full
    Supported pause frame use: Symmetric Receive-only
    Supports auto-negotiation: Yes
    Supported FEC modes: Not reported
    Advertised link modes: Not reported
    Advertised pause frame use: No
    Advertised auto-negotiation: No
    Advertised FEC modes: Not reported
    Speed: 10000Mb/s
    Duplex: Full
    Port: FIBRE
    PHYAD: 1
    Transceiver: internal
    Auto-negotiation: off
    Supports Wake-on: g
    Wake-on: d
    Current message level: 0x00000000 (0)
    Link detected: yes

ethtool-i-enp94s0f0
--
driver: bnxt_en
version: 1.10.0
firmware-version: 214.0.253.1/pkg 21.40.25.31
expansion-rom-version:
bus-info: 0000:5e:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: no
supports-priv-flags: no

ethtool-c-enp94s0f0
-
Coalesce parameters for enp94s0f0:
Adaptive RX: off  TX: off
stats-block-usecs: 100
sample-interval: 0
pkt-rate-low: 0
pkt-rate-high: 0

rx-usecs: 10
rx-frames: 15
rx-usecs-irq: 1
rx-frames-irq: 1

tx-usecs: 28
tx-frames: 30
tx-usecs-irq: 2
tx-frames-irq: 2

rx-usecs-low: 0
rx-frame-low: 0
tx-usecs-low: 0
tx-frame-low: 0

rx-usecs-high: 0
rx-frame-high: 0
tx-usecs-high: 0
tx-frame-high: 0

ethtool-g-enp94s0f0
Ring parameters for enp94s0f0:
Pre-set maximums:
RX:        2047
RX Mini:   0
RX Jumbo:  8191
TX:        2047
Current hardware settings:
RX:        511
RX Mini:   0
RX Jumbo:  2044
TX:        511

ethtool-k-enp94s0f0
-
Features for enp94s0f0:
rx-checksumming: on
tx-checksumming: on
    tx-checksum-ipv4: on
    tx-checksum-ip-generic: off [fixed]
    tx-checksum-ipv6: on
    tx-checksum-fcoe-crc: off [fixed]
    tx-checksum-sctp: off [fixed]
scatter-gather: on
    tx-scatter-gather: on
    tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: on
    tx-tcp-segmentation: on
    tx-tcp-ecn-segmentation: off [fixed]
    tx-tcp-mangleid-segmentation: off
    tx-tcp6-segmentation: on
udp-fragmentation-offload: off
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off
rx-vlan-offload: on
tx-vlan-offload: on
ntuple-filters: on
receive-hashing: on
highdma: on [fixed]
rx-vlan-filter: off [fixed]
vlan-challenged: off [fixed]
tx-lockless: off [fixed]
netns-local: off [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: off [fixed]
tx-gre-segmentation: on
tx-gre-csum-segmentation: on
tx-ipxip4-segmentation: on
tx-ipxip6-segmentation: off [fixed]
tx-udp_tnl-segmentation: on
tx-udp_tnl-csum-segmentation: on
tx-gso-partial: on
tx-sctp-segmentation: off [fixed]
tx-esp-segmentation: off [fixed]
tx-udp-segmentation: off [fixed]
fcoe-mtu: off [fixed]
tx-nocache-copy: off
loopback: off [fixed]
rx-fcs: off [fixed]
rx-all: off [fixed]
tx-vlan-stag-hw-insert: on
rx-vlan-stag-hw-parse: on
rx-vlan-stag-filter: off [fixed]
l2-fwd-offload: off [fixed]
hw-tc-offload: on
esp-hw-offload: off [fixed]
esp-tx-csum-hw-offload: off [fixed]
rx-udp_tunnel-port-offload: on
tls-hw-tx-offload: off [fixed]
tls-hw-rx-offload: off [fixed]
rx-gro-hw: on
tls-hw-record: off [fixed]
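The ring parameters in the ethtool -g output above show the current RX/TX ring sizes below the pre-set maximums. As a convenience when comparing systems, the two sections can be paired up programmatically. The following is a minimal Python sketch with a hypothetical function name, run against the values posted above. It only reports the numbers and is not evidence that ring size is the cause, since another reporter on this bug saw no change after adjusting ring sizes.

```python
def parse_ring_params(text):
    """Split `ethtool -g` output into its 'max' and 'current' sections,
    mapping each ring name to its integer size."""
    rings = {"max": {}, "current": {}}
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Pre-set maximums"):
            section = "max"
        elif line.startswith("Current hardware settings"):
            section = "current"
        elif section is not None and ":" in line:
            key, _, value = line.partition(":")
            value = value.strip()
            if value.isdigit():  # skips non-numeric values such as "n/a"
                rings[section][key.strip()] = int(value)
    return rings


# Values copied from the ethtool -g output above.
SAMPLE = """\
Ring parameters for enp94s0f0:
Pre-set maximums:
RX:        2047
RX Mini:   0
RX Jumbo:  8191
TX:        2047
Current hardware settings:
RX:        511
RX Mini:   0
RX Jumbo:  2044
TX:        511
"""

if __name__ == "__main__":
    rings = parse_ring_params(SAMPLE)
    for name, maximum in rings["max"].items():
        current = rings["current"].get(name)
        print(f"{name}: current {current} of max {maximum}")
```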
[Kernel-packages] [Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data
"Bad" System/NIC:

NIC: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet Controller
System: Dell
Kernel: 5.3.0-28-generic #30~18.04.1-Ubuntu
(Note: this issue has been seen on prior kernels as well; we upgraded to the latest to see if various problems were resolved.)

Attaching stats/config files from the NICs on this system (which is seeing the issue).
[Kernel-packages] [Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data
Good System/Good NIC (all configurations work)

Comparison NIC: NetXtreme II BCM57000 10 Gigabit Ethernet QLogic 57000
System: Dell
Kernel: 5.0.0-25-generic #26~18.04.1-Ubuntu

/proc/net/bonding/bond0
---
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: enp5s0f1
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

Slave Interface: enp5s0f1
MII Status: up
Speed: 10000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:00:00:00:73:e2
Slave queue ID: 0

Slave Interface: enp5s0f0
MII Status: up
Speed: 10000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:00:00:00:73:e0
Slave queue ID: 0

/etc/netplan/50-cloud-init.yaml

network:
    bonds:
        bond0:
            addresses:
            - 00.00.235.182/25
            gateway4: 00.00.235.129
            interfaces:
            - enp5s0f0
            - enp5s0f1
            macaddress: 00:00:00:00:73:e0
            mtu: 9000
            nameservers:
                addresses:
                - 00.00.235.172
                - 00.00.235.171
                search:
                - maas
            parameters:
                down-delay: 0
                gratuitious-arp: 1
                mii-monitor-interval: 100
                mode: active-backup
                transmit-hash-policy: layer2
                up-delay: 0
    ethernets:
        ...(snip)..
        enp5s0f0:
            match:
                macaddress: 00:00:00:00:73:e0
            mtu: 9000
            set-name: enp5s0f0
        enp5s0f1:
            match:
                macaddress: 00:00:00:00:73:e2
            mtu: 9000
            set-name: enp5s0f1
    version: 2
[Kernel-packages] [Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data
"Bad" Configuration for active-backup mode:

$ cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: enp94s0f1d1
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0
Peer Notification Delay (ms): 0

Slave Interface: enp94s0f1d1
MII Status: up
Speed: 1 Mbps
Duplex: full
Link Failure Count: 2
Permanent HW addr: 4c:d9:8f:48:08:da
Slave queue ID: 0

Slave Interface: enp94s0f0
MII Status: up
Speed: 1 Mbps
Duplex: full
Link Failure Count: 2
Permanent HW addr: 4c:d9:8f:48:08:d9
Slave queue ID: 0

---
$ cat uname-rv
5.3.0-28-generic #30~18.04.1-Ubuntu SMP Fri Jan 17 06:14:09 UTC 2020

---
Scrubbed /etc/netplan/50-cloud-init.yaml:

network:
    bonds:
        bond0:
            addresses:
            - 0.0.235.177/25
            gateway4: 0.0.235.129
            interfaces:
            - enp94s0f0
            - enp94s0f1d1
            macaddress: 00:00:00:48:08:00
            mtu: 9000
            nameservers:
                addresses:
                - 0.0.235.171
                - 0.0.235.172
                search:
                - maas
            parameters:
                down-delay: 0
                gratuitious-arp: 1
                mii-monitor-interval: 100
                mode: active-backup
                transmit-hash-policy: layer2
                up-delay: 0
    ethernets:
        eno1:
            match:
                macaddress: 00:00:00:76:6e:ca
            mtu: 1500
            set-name: eno1
        eno2:
            match:
                macaddress: 00:00:00:76:6e:cb
            mtu: 1500
            set-name: eno2
        enp94s0f0:
            match:
                macaddress: 00:00:00:48:08:00
            mtu: 9000
            set-name: enp94s0f0
        enp94s0f1d1:
            match:
                macaddress: 00:00:00:48:08:da
            mtu: 9000
            set-name: enp94s0f1d1
    version: 2
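For comparing "good" and "bad" hosts, the /proc/net/bonding/bond0 output posted in this message can be reduced to the few fields that matter here (active slave, per-slave link failure count). The following is only a minimal sketch: the function name parse_bonding_status and the trimmed SAMPLE text are illustrative assumptions, and the parser simply splits each "Key: value" line at the first colon.

```python
def parse_bonding_status(text):
    """Split /proc/net/bonding/<bond> text into bond-level and per-slave fields."""
    bond = {}
    slaves = []
    current = bond  # fields before the first "Slave Interface" belong to the bond
    for raw in text.splitlines():
        line = raw.strip()
        if ":" not in line:
            continue
        # Split at the first colon only, so MAC addresses stay intact.
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Slave Interface":
            current = {"name": value}
            slaves.append(current)
        else:
            current[key] = value
    return bond, slaves


# Trimmed copy of the output posted in this message.
SAMPLE = """\
Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: enp94s0f1d1
MII Status: up

Slave Interface: enp94s0f1d1
MII Status: up
Link Failure Count: 2
Permanent HW addr: 4c:d9:8f:48:08:da

Slave Interface: enp94s0f0
MII Status: up
Link Failure Count: 2
Permanent HW addr: 4c:d9:8f:48:08:d9
"""

bond, slaves = parse_bonding_status(SAMPLE)
print("active slave:", bond["Currently Active Slave"])
for slave in slaves:
    print(slave["name"], "link failures:", slave["Link Failure Count"])
```

On a live host the same function could be fed the contents of /proc/net/bonding/bond0 read from disk instead of SAMPLE.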
[Kernel-packages] [Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data
Hi Nivedita,

I have been away on PTO the last week and am picking this up again now. Please could you post the full bonding configuration?

Regards,
Edwin Peer

-- 
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1853638

Title:
  BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data

Status in linux package in Ubuntu:
  Confirmed
Status in network-manager package in Ubuntu:
  Confirmed

Bug description:
  The issue appears to be with the BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device, which seems to be dropping data.

  Basically, we are dropping data, as you can see from the benchmark tool output below:

  tcdforge@x310a:/usr/local/lib/lib/uhd/examples$ ./benchmark_rate --rx_rate 10e6 --tx_rate 10e6 --duration 300
  [INFO] [UHD] linux; GNU C++ version 5.4.0 20160609; Boost_105800; UHD_3.14.1.1-0-g98c7c986
  [WARNING] [UHD] Unable to set the thread priority. Performance may be negatively affected.
  Please see the general application notes in the manual for instructions.
  EnvironmentError: OSError: error in pthread_setschedparam
  [00:00:00.07] Creating the usrp device with: ...
  [INFO] [X300] X300 initialization sequence...
  [INFO] [X300] Maximum frame size: 1472 bytes.
  [INFO] [X300] Radio 1x clock: 200 MHz
  [INFO] [GPS] Found an internal GPSDO: LC_XO, Firmware Rev 0.929a
  [INFO] [0/DmaFIFO_0] Initializing block control (NOC ID: 0xF1F0D000)
  [INFO] [0/DmaFIFO_0] BIST passed (Throughput: 1308 MB/s)
  [INFO] [0/DmaFIFO_0] BIST passed (Throughput: 1316 MB/s)
  [INFO] [0/Radio_0] Initializing block control (NOC ID: 0x12AD1001)
  [INFO] [0/Radio_1] Initializing block control (NOC ID: 0x12AD1001)
  [INFO] [0/DDC_0] Initializing block control (NOC ID: 0xDDC0)
  [INFO] [0/DDC_1] Initializing block control (NOC ID: 0xDDC0)
  [INFO] [0/DUC_0] Initializing block control (NOC ID: 0xD0C0)
  [INFO] [0/DUC_1] Initializing block control (NOC ID: 0xD0C0)
  Using Device: Single USRP:
    Device: X-Series Device
    Mboard 0: X310
    RX Channel: 0
      RX DSP: 0
      RX Dboard: A
      RX Subdev: SBX-120 RX
    RX Channel: 1
      RX DSP: 0
      RX Dboard: B
      RX Subdev: SBX-120 RX
    TX Channel: 0
      TX DSP: 0
      TX Dboard: A
      TX Subdev: SBX-120 TX
    TX Channel: 1
      TX DSP: 0
      TX Dboard: B
      TX Subdev: SBX-120 TX

  [00:00:04.305374] Setting device timestamp to 0...
  [WARNING] [UHD] Unable to set the thread priority. Performance may be negatively affected.
  Please see the general application notes in the manual for instructions.
  EnvironmentError: OSError: error in pthread_setschedparam
  [00:00:04.310990] Testing receive rate 10.00 Msps on 1 channels
  [WARNING] [UHD] Unable to set the thread priority. Performance may be negatively affected.
  Please see the general application notes in the manual for instructions.
  EnvironmentError: OSError: error in pthread_setschedparam
  [00:00:04.318356] Testing transmit rate 10.00 Msps on 1 channels
  [00:00:06.693119] Detected Rx sequence error.
  D[00:00:09.402843] Detected Rx sequence error.
  DD[00:00:40.927978] Detected Rx sequence error.
  D[00:01:44.982243] Detected Rx sequence error.
  D[00:02:11.400692] Detected Rx sequence error.
  D[00:02:14.805292] Detected Rx sequence error.
  D[00:02:41.875596] Detected Rx sequence error.
  D[00:03:06.927743] Detected Rx sequence error.
  D[00:03:47.967891] Detected Rx sequence error.
  D[00:03:58.233659] Detected Rx sequence error.
  D[00:03:58.876588] Detected Rx sequence error.
  D[00:04:03.139770] Detected Rx sequence error.
  D[00:04:45.287465] Detected Rx sequence error.
  D[00:04:56.425845] Detected Rx sequence error.
  D[00:04:57.929209] Detected Rx sequence error.
  [00:05:04.529548] Benchmark complete.

  Benchmark rate summary:
    Num received samples:     2995435936
    Num dropped samples:      4622800
    Num overruns detected:    0
    Num transmitted samples:  3008276544
    Num sequence errors (Tx): 0
    Num sequence errors (Rx): 15
    Num underruns detected:   0
    Num late commands:        0
    Num timeouts (Tx):        0
    Num timeouts (Rx):        0

  Done!

  tcdforge@x310a:/usr/local/lib/lib/uhd/examples$

  In this particular case description, the nodes are USRP X310s. However, we have the same issue with N210 nodes dropping samples when connected to the BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device.

  There is no problem with the USRPs themselves, as we have tested them with normal 1G network cards and have no dropped samples.

  Personally, I think it's something to do with the 10G network card, possibly an Ubuntu driver issue?

  Note: Dell have said there is no hardware problem with the 10G interfaces.

  I have followed the troubleshooting information at this link to try to determine the problem: https://files.ettus.c
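To put the benchmark summary in the bug description into proportion, the arithmetic can be worked through directly from the reported counters. This is a rough sketch under two assumptions that are not stated in the report: that received plus dropped samples approximates everything the radio produced during the run, and that the dropped samples are spread evenly over the 15 Rx sequence errors (they may well not be).

```python
# Values copied from the benchmark_rate command line and summary above.
RX_RATE_SPS = 10e6            # --rx_rate 10e6
DURATION_S = 300              # --duration 300
RECEIVED = 2_995_435_936      # Num received samples
DROPPED = 4_622_800           # Num dropped samples
RX_SEQ_ERRORS = 15            # Num sequence errors (Rx)

# Assumption: received + dropped is the total the radio sent to the host.
produced = RECEIVED + DROPPED
nominal = RX_RATE_SPS * DURATION_S
drop_fraction = DROPPED / produced

# Assumption: every sequence error lost the same number of samples.
samples_per_error = DROPPED / RX_SEQ_ERRORS
gap_ms = samples_per_error / RX_RATE_SPS * 1e3

print(f"produced samples:  {produced} (nominal {nominal:.0f})")
print(f"drop fraction:     {drop_fraction:.4%}")
print(f"samples per error: {samples_per_error:.0f}")
print(f"average gap:       {gap_ms:.1f} ms")
```

Under those assumptions the overall loss is a fraction of a percent of the stream, but each sequence error corresponds to a gap of tens of milliseconds, which is a burst of lost packets rather than an isolated drop.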
[Kernel-packages] [Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data
We have narrowed it down to a flaw in a specific configuration setting on this NIC, so we're comparing the good and bad configurations now.

Primary port:   enp94s0f0
Secondary port: enp94s0f1d1

A] Good config for fault-tolerance (active-backup) bonding mode:
   -- Primary port = active interface; Secondary port = backup

B] Bad config for fault-tolerance (active-backup) bonding mode:
   -- Primary port = backup interface; Secondary port = active

We are consistently able to reproduce a drop rate difference with UDP packets for the above good/bad cases:

Good Case UDP MTR Test Result
-----------------------------
mtr --no-dns --report --report-cycles 60 --udp -s 1428 $DEST
Start: 2020-02-10T10:14:01+
HOST: hostname            Loss%   Snt   Last   Avg  Best  Wrst StDev
  1.|-- nn.nn.nnn.nnn      0.0%    60    0.3   0.2   0.2   0.3   0.0

Bad Case UDP MTR Test Result
----------------------------
mtr --no-dns --report --report-cycles 60 --udp -s 1428 $DEST
Start: 2020-02-10T14:10:52+
HOST: hostname            Loss%   Snt   Last   Avg  Best  Wrst StDev
  1.|-- nn.nn.nnn.nnn      8.3%    60    0.3   0.3   0.2   0.4   0.0
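The mtr loss percentages in this message translate into concrete packet counts. The sketch below parses one hop row of an mtr report and converts Loss% back into lost packets. It assumes the row has exactly the columns shown in the report header (Loss%, Snt, Last, Avg, Best, Wrst, StDev) and that the sent count is 60, matching --report-cycles 60; the helper names parse_mtr_hop and lost_packets are illustrative.

```python
def parse_mtr_hop(line):
    """Parse one hop row of `mtr --report` output into a dict."""
    # Expected fields: hop marker, host, Loss%, Snt, Last, Avg, Best, Wrst, StDev
    _hop, host, loss, snt, last, avg, best, wrst, stdev = line.split()
    return {
        "host": host,
        "loss_pct": float(loss.rstrip("%")),
        "sent": int(snt),
        "last_ms": float(last),
        "avg_ms": float(avg),
        "best_ms": float(best),
        "worst_ms": float(wrst),
        "stdev_ms": float(stdev),
    }


def lost_packets(row):
    """Convert the rounded Loss% column back to a whole number of packets."""
    return round(row["loss_pct"] / 100 * row["sent"])


# Hop rows from the good and bad cases in this message.
GOOD = "1.|-- nn.nn.nnn.nnn 0.0% 60 0.3 0.2 0.2 0.3 0.0"
BAD = "1.|-- nn.nn.nnn.nnn 8.3% 60 0.3 0.3 0.2 0.4 0.0"

good_row = parse_mtr_hop(GOOD)
bad_row = parse_mtr_hop(BAD)
print("good case lost:", lost_packets(good_row), "of", good_row["sent"])
print("bad case lost: ", lost_packets(bad_row), "of", bad_row["sent"])
```

With only 60 probes per run, a single lost packet moves the loss figure by about 1.7 percentage points, so longer runs would give a steadier estimate.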
[Kernel-packages] [Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data
The second port on the NIC definitely works as the active interface in an active-backup bonding configuration on the other NICs. At the moment, as far as we know, only this particular NIC is seeing the problem.
[Kernel-packages] [Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data
Hello Edwin,

Here is more information on the issue we are seeing with respect to dropped packets and other connectivity issues with this NIC.

The problem is *only* seen when the second port on the NIC is chosen as the active interface of an active-backup configuration.

So on the "bad" system with the interfaces:

  enp94s0f0   -> when chosen as active, all OK
  enp94s0f1d1 -> when chosen as active, not OK

I'll see if the reporters can confirm that on the "good" systems, there was no problem when the second interface is active.
[Kernel-packages] [Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data
Hey Edwin,

Sorry, I didn't see your last question. I'll try to confirm, but I've seen loss in both directions, and it's not yet clear whether it is significant. For example, TCP traffic is retransmitted, so it could be segments lost while outgoing or ACKs lost incoming:

  4407 retransmitted TCP segments
  130 TCP timeouts

These are from stats collected about 5 minutes apart, which isn't a sufficient sample size. We're trying to get a new collection of stats and logs using the netperf TCP_RR test.

Note that in our case we're more concerned about (and have more solid data on) latency issues than dropped packets (some of which I expect with heavy network testing). For example, netperf TCP_RR latency is about 70-78% of the older systems for 1,1 request/response byte sizes as well as 64/64, 100/200, and 128/8192 sizes.

I'll update here as soon as we have more data from the production environment.
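The retransmit count quoted in this message comes from comparing two snapshots of TCP statistics. One way to take such a delta is to read the Tcp lines of /proc/net/snmp twice and subtract. The sketch below assumes the usual two-line layout of that file (a line of counter names followed by a line of values); the snapshot values are made-up placeholders, chosen only so that the RetransSegs delta equals the 4407 segments reported here. The OutSegs figures in particular are hypothetical and not from the bug report.

```python
def parse_tcp_counters(snmp_text):
    """Return the Tcp counters from /proc/net/snmp text as a dict of ints."""
    tcp_lines = [line.split()[1:] for line in snmp_text.splitlines()
                 if line.startswith("Tcp:")]
    names, values = tcp_lines[0], tcp_lines[1]
    return {name: int(value) for name, value in zip(names, values)}


def retransmit_delta(before, after):
    """Segments sent and retransmitted between two snapshots, plus their ratio."""
    sent = after["OutSegs"] - before["OutSegs"]
    retrans = after["RetransSegs"] - before["RetransSegs"]
    ratio = retrans / sent if sent else 0.0
    return retrans, sent, ratio


HEADER = ("Tcp: RtoAlgorithm RtoMin RtoMax MaxConn ActiveOpens PassiveOpens "
          "AttemptFails EstabResets CurrEstab InSegs OutSegs RetransSegs "
          "InErrs OutRsts InCsumErrors\n")

# Placeholder snapshots (hypothetical values, taken "about 5 minutes apart").
BEFORE = HEADER + "Tcp: 1 200 120000 -1 10 5 0 0 3 900000 1000000 1000 0 2 0\n"
AFTER = HEADER + "Tcp: 1 200 120000 -1 12 6 0 0 3 2700000 3000000 5407 0 2 0\n"

retrans, sent, ratio = retransmit_delta(parse_tcp_counters(BEFORE),
                                        parse_tcp_counters(AFTER))
print(f"retransmitted {retrans} of {sent} segments ({ratio:.3%})")
```

Relating the retransmit count to the number of segments sent in the same interval makes it easier to judge whether a figure like 4407 is significant for the traffic volume involved.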
[Kernel-packages] [Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data
I don't think bnxt_en exposes the disable_tpa parameter. Be that as it may, I think the tpa_aborts may be a red herring. TPA aggregates TCP flows and you are seeing the issue with ICMP. In which direction(s) of traffic flow do you see the losses?
[Kernel-packages] [Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data
> NICs between systems? Are OS / kernel and driver
> versions the same on both systems?

Yes, identical distro release, kernel, and most of the software stack (I have not obtained and examined the full sw stack). Configuration of networking settings is also the same.
[Kernel-packages] [Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data
Thanks very much for helping on this, Edwin! Please let me know if there's anything specific you need.

I'm asking them to disable any IPv6 and LLDP traffic in their environment, then retest and collect information again.

Also, I'd like to disable TPA. Would this be at all useful?

    modprobe bnx disable_tpa=1
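A note on the TPA question, offered as a sketch and not as a confirmed answer. As far as I can tell, disable_tpa is a module parameter of the bnx2x driver (the BCM57810 generation), and there is no module called simply bnx. The BCM57416 is normally bound to bnxt_en, where, to my knowledge, receive aggregation is switched per interface through the ethtool offload flags and not through a module option. The interface name enp94s0f0 is taken from the ethtool attachment on this bug. The helper name is made up, and it only prints the commands so they can be reviewed before anyone runs them on a production box.

```shell
#!/bin/sh
# Sketch only. Assumptions: the interface is enp94s0f0 and the port is
# bound to bnxt_en. Check the first command's output before going further.

# tpa_off_cmds IFACE: print, without running, the commands that check the
# driver and switch off the receive aggregation offloads.
tpa_off_cmds() {
    iface="$1"
    echo "ethtool -i $iface"                 # confirm which driver is bound
    echo "ethtool -k $iface"                 # list current offload settings
    echo "ethtool -K $iface lro off"         # large receive offload
    echo "ethtool -K $iface rx-gro-hw off"   # hardware GRO, only on kernels that expose it
}

tpa_off_cmds enp94s0f0
```

The changes made with ethtool -K do not survive a reboot, which is convenient for a one-off retest.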
[Kernel-packages] [Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data
> There are more than one variable at play here.
> Does the problem follow the NIC if you swap the
> NICs between systems? Are OS / kernel and driver
> versions the same on both systems?

Unfortunately, I've not yet been able to get them to try permutations or swaps, as this is still a production system/environment. I'll try to obtain more information about it.
[Kernel-packages] [Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data
> The mtr packet loss is an interesting result. What mtr options did you use? Is this a UDP or ICMP test?

The mtr command was:

    mtr --no-dns --report --report-cycles 60 $IP_ADDR

so ICMP was going out.
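For repeating the mtr comparison between the good and bad systems, the loss figure can be pulled out of the report with a small filter. This is a sketch that assumes the usual mtr --report layout, in which each hop line starts with a field like "1.|--" followed by the host and the Loss% column. The function name is made up.

```shell
#!/bin/sh
# mtr_last_hop_loss: read an "mtr --report" on stdin and print the loss
# percentage of the last hop, without the trailing % sign.
mtr_last_hop_loss() {
    awk '/\|--/ { loss = $3 }            # hop lines: "N.|--  HOST  Loss% ..."
         END    { sub(/%/, "", loss); print loss }'
}

# Intended use, with the command from the comment above:
#   mtr --no-dns --report --report-cycles 60 "$IP_ADDR" | mtr_last_hop_loss
```

With 60 report cycles, each lost probe accounts for about 1.7 percentage points, so the 11.7% reported for the bad system corresponds to roughly 7 lost probes.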
[Kernel-packages] [Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data
> 3. mtr ping test
> ---
> GoodSystem..0.0% Loss; 0.2 Avg; 0.1 Best, 0.9 Worst, 0.1 StdDev
> BadSystem2...11.7% Loss; 0.1 Avg; 0.1 Best, 0.2 Worst, 0.0 StdDev

The mtr packet loss is an interesting result. What mtr options did you use? Is this a UDP or ICMP test?
[Kernel-packages] [Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data
With respect to one of these situations, this is the following system:

> Dell PowerEdge R440/0XP8V5, BIOS 2.2.11 06/14/2019
>
> Note that a similar system does not have any issues:
>
> Dell Inc. PowerEdge R430/0CN7X8, BIOS 2.3.4 11/08/2016
>
> So the NIC in the "bad" environment is:
>
> BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet Controller (rev 01)
> Product Name: Broadcom Adv. Dual 10G SFP+ Ethernet
>
> The NIC in the "good" environment is:
>
> Broadcom Inc. and subsidiaries NetXtreme II BCM57810
> 10 Gigabit Ethernet [14e4:1006]
> Product Name: QLogic 57810 10 Gigabit Ethernet

There is more than one variable at play here. Does the problem follow the NIC if you swap the NICs between systems? Are OS / kernel and driver versions the same on both systems?
[Kernel-packages] [Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data
Here is the Ettus benchmark tool:

https://kb.ettus.com/Verifying_the_Operation_of_the_USRP_Using_UHD_and_GNU_Radio

You would need an Ettus device to run those tests. Unfortunately, I can't test the affected node now as it is in production.
[Kernel-packages] [Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data
** Attachment added: "active interface ethtool-S"
   https://bugs.launchpad.net/ubuntu/+source/network-manager/+bug/1853638/+attachment/5324070/+files/ethtool-S-enp94s0f0
[Kernel-packages] [Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data
** Attachment added: "backup interface ethtool-S"
   https://bugs.launchpad.net/ubuntu/+source/network-manager/+bug/1853638/+attachment/5324071/+files/ethtool-S-enp94s0f1d1
[Kernel-packages] [Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data
Note that iperf was identical, whereas netperf and mtr showed differences (so the problem is possibly sporadic as well, not continuous).

1. iperf tcp test
   GoodSystem ... 9.84 Gbits/sec
   BadSystem1 ... 8.37 Gbits/sec
   BadSystem2 ... 9.85 Gbits/sec

2. iperf udp test
   GoodSystem ... 1.05 Mbits/sec
   BadSystem2 ... 1.05 Mbits/sec

3. mtr ping test
   GoodSystem ... 0.0% Loss; 0.2 Avg; 0.1 Best; 0.9 Worst; 0.1 StdDev
   BadSystem2 ... 11.7% Loss; 0.1 Avg; 0.1 Best; 0.2 Worst; 0.0 StdDev

4. netperf tcp_rr 1/1 bytes
   GoodSystem ... 17921.83 t/sec
   BadSystem1 ... 13912.45 t/sec
   BadSystem2 ... (no value reported)

5. netperf tcp_rr 64/64 bytes
   GoodSystem ... 16987.48 t/sec
   BadSystem1 ... 13355.93 t/sec
   BadSystem2 ... (no value reported)

6. netperf tcp_rr 128/8192 bytes
   GoodSystem ... 2396.45 t/sec
   BadSystem1 ... 1678.54 t/sec
   BadSystem2 ... (no value reported)
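To put the netperf tcp_rr figures in this message into proportion, here is a small Python sketch that computes how far BadSystem1 falls below GoodSystem for each request/response size. The numbers are the ones quoted in this thread; the dictionary layout and the helper name relative_drop are only illustrative.

```python
# Transactions per second quoted in this thread: (GoodSystem, BadSystem1).
results = {
    "tcp_rr 1/1": (17921.83, 13912.45),
    "tcp_rr 64/64": (16987.48, 13355.93),
    "tcp_rr 128/8192": (2396.45, 1678.54),
}


def relative_drop(good, bad):
    """Fraction by which the bad system's rate is below the good system's."""
    return (good - bad) / good


drops = {name: relative_drop(good, bad) for name, (good, bad) in results.items()}

for name, drop in drops.items():
    print(f"{name}: BadSystem1 is {drop:.1%} below GoodSystem")
```

Since tcp_rr is a request/response test, a lower transaction rate corresponds to a higher average round-trip time per transaction, which fits the latency complaints elsewhere in the thread.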
[Kernel-packages] [Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data
Hello, Edwin,

We have two separate users/customers filing reports, and I can answer for one of them. I'll also ask the original poster separately to reply.

For one of these situations, the affected system is:
  Dell PowerEdge R440/0XP8V5, BIOS 2.2.11 06/14/2019

Note that a similar system does not have any issues:
  Dell Inc. PowerEdge R430/0CN7X8, BIOS 2.3.4 11/08/2016

The NIC in the "bad" environment is:
  BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet Controller (rev 01)
  Product Name: Broadcom Adv. Dual 10G SFP+ Ethernet

The NIC in the "good" environment is:
  Broadcom Inc. and subsidiaries NetXtreme II BCM57810 10 Gigabit Ethernet [14e4:1006]
  Product Name: QLogic 57810 10 Gigabit Ethernet

I'll have to scrub some files and see what I can attach; apologies, I'll have them here by tomorrow.

Unfortunately, we don't have an easy reproducer. A single iperf and netperf test (both UDP and TCP) showed identical results from both the "good" and "bad" environments. What we have is an identical kernel, network configuration and stack, with the "bad" system showing double or triple the latency to the systems from a remote server.

I'll have more information for you shortly regarding the exact k8 cmd.
[Kernel-packages] [Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data
I am an engineer at Broadcom and have been assigned to investigate this issue. To that end, I have a few clarifying questions:

1a) What is the benchmark tool you are using, and could you provide a link to where I can get it?
 b) What kind of network traffic is it sending?

2a) In what units are the data rate parameters "--rx_rate 10e6 --tx_rate 10e6" specified?
 b) What data rate are you attempting to send? The report notes that the platform can't be the issue at 1G, but are you attempting to utilize 10G?

3) Perhaps stating the obvious here, but has anybody looked into the warning "[WARNING] [UHD] Unable to set the thread priority. Performance may be negatively affected."? It is probably related to this error: "EnvironmentError: OSError: error in pthread_setschedparam"

4a) I am personally unfamiliar with the USRP x310; could you provide some more information about it? Googling for it seems to indicate it is some kind of software-defined radio platform.
 b) Is there a way to get access to one to reproduce and diagnose this issue?
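As background for question 2, the benchmark log in the bug description reports the rates as "10.00 Msps", i.e. samples per second. The Python sketch below turns that into an approximate payload rate and computes the fraction of samples dropped in the quoted run. It rests on two assumptions that the thread does not confirm: each sample occupies 4 bytes on the wire (16-bit I plus 16-bit Q), and "Num received samples" does not already include the dropped ones.

```python
# Assumption (not stated in the bug report): 4 bytes per complex sample.
BYTES_PER_SAMPLE = 4


def payload_mbit_per_s(samples_per_s, bytes_per_sample=BYTES_PER_SAMPLE):
    """Approximate payload rate in Mbit/s, ignoring packet headers."""
    return samples_per_s * bytes_per_sample * 8 / 1e6


def dropped_fraction(received, dropped):
    """Dropped samples as a fraction of all samples the host should have seen.

    Assumes 'received' excludes the dropped samples.
    """
    return dropped / (received + dropped)


rx_rate = 10e6          # from "--rx_rate 10e6" / "10.00 Msps" in the log
received = 2995435936   # "Num received samples" in the summary
dropped = 4622800       # "Num dropped samples" in the summary

print(f"payload rate: about {payload_mbit_per_s(rx_rate):.0f} Mbit/s per direction")
print(f"dropped: {dropped_fraction(received, dropped):.3%} of samples")
```

Under those assumptions the stream is well below 1G line rate, which would be consistent with the reporter seeing no drops on 1G cards.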
[Kernel-packages] [Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data
Could you also please dump the ethtool statistics for the NIC?
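One way to capture those statistics in a form that is easy to compare later is sketched below in Python. It shells out to "ethtool -S <interface>" and parses the "name: value" lines into a dictionary. The function names and the sample text are made up for illustration; the counter values in the sample are not from this bug.

```python
import subprocess


def parse_ethtool_stats(text):
    """Parse 'ethtool -S' output into {counter name: integer value}."""
    stats = {}
    for line in text.splitlines():
        # Split on the last colon so per-ring names like "[0]: tpa_aborts" stay whole.
        name, sep, value = line.strip().rpartition(":")
        value = value.strip()
        if sep and value.isdigit():
            stats[name.strip()] = int(value)
    return stats


def dump_stats(interface):
    """Run 'ethtool -S' for one interface and return the parsed counters."""
    result = subprocess.run(
        ["ethtool", "-S", interface],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_ethtool_stats(result.stdout)


# Hypothetical sample text, only to show the parsing.
sample = """NIC statistics:
     [0]: rx_ucast_packets: 10
     [0]: tpa_aborts: 3
     rx_total_frames: 42
"""
print(parse_ethtool_stats(sample))
```

On the affected host this would be called as dump_stats("enp94s0f0") or dump_stats("enp94s0f1d1"), the two interface names that appear in the attachments on this bug.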
[Kernel-packages] [Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data
(active interface)
> cat ethtool-S-enp94s0f1d1 | grep abort
[0]: tpa_aborts: 19775497
[1]: tpa_aborts: 26758635
[2]: tpa_aborts: 12008147
[3]: tpa_aborts: 15829167
[4]: tpa_aborts: 25099500
[5]: tpa_aborts: 3292554
[6]: tpa_aborts: 2863692
[7]: tpa_aborts: 20224692

(backup interface)
> cat ethtool-S-enp94s0f0 | grep abort
[0]: tpa_aborts: 3158584
[1]: tpa_aborts: 1670319
[2]: tpa_aborts: 1749371
[3]: tpa_aborts: 1454301
[4]: tpa_aborts: 123020
[5]: tpa_aborts: 1403509
[6]: tpa_aborts: 1298383
[7]: tpa_aborts: 1858753

Netted out from the previous capture, there were:
*f0 = 2014 tpa_aborts
*d1 = 1118473 tpa_aborts

** Changed in: linux (Ubuntu)
   Status: Incomplete => Confirmed

** Changed in: linux (Ubuntu)
   Importance: Undecided => Critical
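The netted-out tpa_aborts figures in this message come from comparing two saved `ethtool -S` captures. A minimal sketch of that bookkeeping follows, assuming the captures contain per-ring lines of the form shown above. The helper names `ring_counters` and `netted_total` are made up for illustration and are not part of ethtool or bnxt_en.

```python
import re

# Matches per-ring lines such as "[0]: tpa_aborts: 19775497".
# Works whether the counters are on separate lines or run together on one line.
_COUNTER_RE = re.compile(r"\[(\d+)\]:\s*(\w+):\s*(\d+)")


def ring_counters(ethtool_text, name="tpa_aborts"):
    """Return {ring_index: value} for one per-ring counter in saved ethtool -S text."""
    return {
        int(ring): int(value)
        for ring, counter, value in _COUNTER_RE.findall(ethtool_text)
        if counter == name
    }


def netted_total(before_text, after_text, name="tpa_aborts"):
    """Sum the per-ring increase of a counter between two captures."""
    before = ring_counters(before_text, name)
    after = ring_counters(after_text, name)
    # Rings missing from the earlier capture are treated as starting at 0.
    return sum(after[ring] - before.get(ring, 0) for ring in after)


# Later capture for the active interface, copied from the message above.
ACTIVE_AFTER = """
[0]: tpa_aborts: 19775497
[1]: tpa_aborts: 26758635
[2]: tpa_aborts: 12008147
[3]: tpa_aborts: 15829167
[4]: tpa_aborts: 25099500
[5]: tpa_aborts: 3292554
[6]: tpa_aborts: 2863692
[7]: tpa_aborts: 20224692
"""

if __name__ == "__main__":
    per_ring = ring_counters(ACTIVE_AFTER)
    print("rings:", len(per_ring), "cumulative tpa_aborts:", sum(per_ring.values()))
```

The earlier capture is not reproduced in this thread, so the sketch only prints the cumulative total; with both files to hand, `netted_total(open(old).read(), open(new).read())` would give the increase over the test window.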
[Kernel-packages] [Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data
We suspect this is a device (hw/fw) issue rather than a NetworkManager or kernel (bnxt_en driver) issue. I've added the kernel to cover any driver impact (just in case, for now). This is really to eliminate all other causes and confirm whether the device is the root cause.

NIC Product Name: Broadcom Adv. Dual 10G SFP+ Ethernet
5e:00.1 Ethernet controller: Broadcom Inc. and subsidiaries BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet Controller (rev 01)

NIC Driver/FW
---
driver: bnxt_en
version: 1.10.0
firmware-version: 214.0.253.1/pkg 21.40.25.31
expansion-rom-version:
bus-info: :5e:00.1
supports-statistics: yes

Kernel - 5.0.0-37-generic #40~18.04.1-Ubuntu SMP Thu Nov 14 12:06:39 UTC 2019 (appears to be an issue on all kernel versions)

Environment Configuration - active-backup bonding mode (having the active-backup bond up *might* be the problem, but it might just be the device itself).

The exact same distro, kernel, applications and configuration work fine with a different NIC (Broadcom 10g bnx2x).

The tpa_abort counters rose by a total of 1118473 over the course of a 2 minute iperf test.

Hoping to get more information from other users seeing the same issue.
[Kernel-packages] [Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data
I have reports of the same device appearing to drop packets and incur a greater number of retransmissions under certain circumstances, which we're still trying to nail down. I'm using this bug for now until it is proven to be a different problem.

This is causing issues in a production environment.

** Changed in: network-manager (Ubuntu)
   Status: New => Confirmed

** Changed in: network-manager (Ubuntu)
   Importance: Undecided => Critical

** Tags added: sts

** Also affects: linux (Ubuntu)
   Importance: Undecided
   Status: New

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1853638

Title:
  BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data

Status in linux package in Ubuntu: New
Status in network-manager package in Ubuntu: Confirmed

Bug description:
  The BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data.

  Basically, we are dropping data, as you can see from the benchmark tool as follows:

  tcdforge@x310a:/usr/local/lib/lib/uhd/examples$ ./benchmark_rate --rx_rate 10e6 --tx_rate 10e6 --duration 300
  [INFO] [UHD] linux; GNU C++ version 5.4.0 20160609; Boost_105800; UHD_3.14.1.1-0-g98c7c986
  [WARNING] [UHD] Unable to set the thread priority. Performance may be negatively affected.
  Please see the general application notes in the manual for instructions.
  EnvironmentError: OSError: error in pthread_setschedparam
  [00:00:00.07] Creating the usrp device with: ...
  [INFO] [X300] X300 initialization sequence...
  [INFO] [X300] Maximum frame size: 1472 bytes.
  [INFO] [X300] Radio 1x clock: 200 MHz
  [INFO] [GPS] Found an internal GPSDO: LC_XO, Firmware Rev 0.929a
  [INFO] [0/DmaFIFO_0] Initializing block control (NOC ID: 0xF1F0D000)
  [INFO] [0/DmaFIFO_0] BIST passed (Throughput: 1308 MB/s)
  [INFO] [0/DmaFIFO_0] BIST passed (Throughput: 1316 MB/s)
  [INFO] [0/Radio_0] Initializing block control (NOC ID: 0x12AD1001)
  [INFO] [0/Radio_1] Initializing block control (NOC ID: 0x12AD1001)
  [INFO] [0/DDC_0] Initializing block control (NOC ID: 0xDDC0)
  [INFO] [0/DDC_1] Initializing block control (NOC ID: 0xDDC0)
  [INFO] [0/DUC_0] Initializing block control (NOC ID: 0xD0C0)
  [INFO] [0/DUC_1] Initializing block control (NOC ID: 0xD0C0)
  Using Device: Single USRP:
    Device: X-Series Device
    Mboard 0: X310
    RX Channel: 0
      RX DSP: 0
      RX Dboard: A
      RX Subdev: SBX-120 RX
    RX Channel: 1
      RX DSP: 0
      RX Dboard: B
      RX Subdev: SBX-120 RX
    TX Channel: 0
      TX DSP: 0
      TX Dboard: A
      TX Subdev: SBX-120 TX
    TX Channel: 1
      TX DSP: 0
      TX Dboard: B
      TX Subdev: SBX-120 TX

  [00:00:04.305374] Setting device timestamp to 0...
  [WARNING] [UHD] Unable to set the thread priority. Performance may be negatively affected.
  Please see the general application notes in the manual for instructions.
  EnvironmentError: OSError: error in pthread_setschedparam
  [00:00:04.310990] Testing receive rate 10.00 Msps on 1 channels
  [WARNING] [UHD] Unable to set the thread priority. Performance may be negatively affected.
  Please see the general application notes in the manual for instructions.
  EnvironmentError: OSError: error in pthread_setschedparam
  [00:00:04.318356] Testing transmit rate 10.00 Msps on 1 channels
  [00:00:06.693119] Detected Rx sequence error.
  D[00:00:09.402843] Detected Rx sequence error.
  DD[00:00:40.927978] Detected Rx sequence error.
  D[00:01:44.982243] Detected Rx sequence error.
  D[00:02:11.400692] Detected Rx sequence error.
  D[00:02:14.805292] Detected Rx sequence error.
  D[00:02:41.875596] Detected Rx sequence error.
  D[00:03:06.927743] Detected Rx sequence error.
  D[00:03:47.967891] Detected Rx sequence error.
  D[00:03:58.233659] Detected Rx sequence error.
  D[00:03:58.876588] Detected Rx sequence error.
  D[00:04:03.139770] Detected Rx sequence error.
  D[00:04:45.287465] Detected Rx sequence error.
  D[00:04:56.425845] Detected Rx sequence error.
  D[00:04:57.929209] Detected Rx sequence error.
  [00:05:04.529548] Benchmark complete.

  Benchmark rate summary:
    Num received samples: 2995435936
    Num dropped samples: 4622800
    Num overruns detected: 0
    Num transmitted samples: 3008276544
    Num sequence errors (Tx): 0
    Num sequence errors (Rx): 15
    Num underruns detected: 0
    Num late commands: 0
    Num timeouts (Tx): 0
    Num timeouts (Rx): 0

  Done!

  tcdforge@x310a:/usr/local/lib/lib/uhd/examples$

  In this particular case description, the nodes are USRP x310s. However, we have the same issue with N210 nodes dropping samples connected to the BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device. There is no problem with the USRPs themselves, as we