We are closing this LP bug for now, as we have not been able to reproduce the problem in-house and cannot get access to a live reproduction environment at this time.

Here is what we know:

- Some tests perform differently when the NIC is configured in active-backup bonding mode, depending on whether the active interface is the primary port or the secondary port, i.e.:

  Primary port:   enp94s0f0    // when this is the active interface, works fine
  Secondary port: enp94s0f1d1  // when this is the active interface, more drops

- Switch info: 2 x Fortigate 1024D switches; each machine is connected to both.

- NIC info:

  root@u072:~# lspci | grep BCM57416
  01:00.0 Ethernet controller: Broadcom Inc. and subsidiaries BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet Controller (rev 01)

  # ethtool -i enp1s0f0np0
  driver: bnxt_en
  version: 1.10.0
  firmware-version: 214.0.253.1/pkg 21.40.25.31

- Our attempt at a reproducer (the problem was initially reported in the production env via graphical monitoring):

  mtr --no-dns --report --report-cycles 60 --udp -s 1428 $DEST

  good system = ~ 0% drops
  bad systems = ~ 8% drops

We are not seeing drops in the NIC stats or in the kernel UDP counters, so it is not clear where the packets are being dropped: whether they are dropped silently somewhere, or whether this is a red herring caused by the mtr test itself and what is seen in production is something else.

If someone can reproduce this, or something similar, or if we manage to, we will re-open this bug or file a new one.

-- 
You received this bug notification because you are a member of Desktop Packages, which is subscribed to network-manager in Ubuntu.
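
For anyone who wants to repeat the kernel UDP counter check mentioned in the comment above, here is a minimal sketch in Python. It assumes the usual Linux /proc/net/snmp layout (a "Udp:" header line followed by a "Udp:" value line). The function names are hypothetical and not part of any tool referenced in this bug. It only shows counters the kernel itself tracks, so it cannot reveal drops that happen in the switch or silently in the NIC.

```python
"""Compare kernel UDP counters taken before and after a test run."""
from pathlib import Path
from typing import Dict


def parse_udp_counters(snmp_text: str) -> Dict[str, int]:
    """Parse the 'Udp:' header/value line pair from /proc/net/snmp text."""
    # "UdpLite:" lines do not match because the colon is part of the prefix.
    udp_lines = [line.split()[1:] for line in snmp_text.splitlines()
                 if line.startswith("Udp:")]
    if len(udp_lines) != 2:
        raise ValueError("expected one 'Udp:' header line and one value line")
    names, values = udp_lines
    return dict(zip(names, (int(v) for v in values)))


def counter_delta(before: Dict[str, int], after: Dict[str, int]) -> Dict[str, int]:
    """Return only the counters whose value changed between two snapshots."""
    return {name: after[name] - before[name]
            for name in before
            if name in after and after[name] != before[name]}


def read_udp_counters(path: str = "/proc/net/snmp") -> Dict[str, int]:
    """Read and parse the live counters (Linux only)."""
    return parse_udp_counters(Path(path).read_text())
```

One way to use it: call read_udp_counters() once before starting the mtr run and once after it finishes, then pass both snapshots to counter_delta(). A non-zero InErrors or RcvbufErrors delta would point at the receiving host's UDP stack; an empty or traffic-only delta is consistent with the drops happening elsewhere.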
https://bugs.launchpad.net/bugs/1853638

Title:
BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data

Status in linux package in Ubuntu:
Confirmed
Status in network-manager package in Ubuntu:
Confirmed

Bug description:
The BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device appears to be dropping data.

Basically, we are dropping data, as you can see from the benchmark tool output that follows:

tcdforge@x310a:/usr/local/lib/lib/uhd/examples$ ./benchmark_rate --rx_rate 10e6 --tx_rate 10e6 --duration 300
[INFO] [UHD] linux; GNU C++ version 5.4.0 20160609; Boost_105800; UHD_3.14.1.1-0-g98c7c986
[WARNING] [UHD] Unable to set the thread priority. Performance may be negatively affected.
Please see the general application notes in the manual for instructions.
EnvironmentError: OSError: error in pthread_setschedparam
[00:00:00.000007] Creating the usrp device with: ...
[INFO] [X300] X300 initialization sequence...
[INFO] [X300] Maximum frame size: 1472 bytes.
[INFO] [X300] Radio 1x clock: 200 MHz
[INFO] [GPS] Found an internal GPSDO: LC_XO, Firmware Rev 0.929a
[INFO] [0/DmaFIFO_0] Initializing block control (NOC ID: 0xF1F0D00000000000)
[INFO] [0/DmaFIFO_0] BIST passed (Throughput: 1308 MB/s)
[INFO] [0/DmaFIFO_0] BIST passed (Throughput: 1316 MB/s)
[INFO] [0/Radio_0] Initializing block control (NOC ID: 0x12AD100000000001)
[INFO] [0/Radio_1] Initializing block control (NOC ID: 0x12AD100000000001)
[INFO] [0/DDC_0] Initializing block control (NOC ID: 0xDDC0000000000000)
[INFO] [0/DDC_1] Initializing block control (NOC ID: 0xDDC0000000000000)
[INFO] [0/DUC_0] Initializing block control (NOC ID: 0xD0C0000000000000)
[INFO] [0/DUC_1] Initializing block control (NOC ID: 0xD0C0000000000000)
Using Device: Single USRP:
  Device: X-Series Device
  Mboard 0: X310
  RX Channel: 0
    RX DSP: 0
    RX Dboard: A
    RX Subdev: SBX-120 RX
  RX Channel: 1
    RX DSP: 0
    RX Dboard: B
    RX Subdev: SBX-120 RX
  TX Channel: 0
    TX DSP: 0
    TX Dboard: A
    TX Subdev: SBX-120 TX
  TX Channel: 1
    TX DSP: 0
    TX Dboard: B
    TX Subdev: SBX-120 TX
[00:00:04.305374] Setting device timestamp to 0...
[WARNING] [UHD] Unable to set the thread priority. Performance may be negatively affected.
Please see the general application notes in the manual for instructions.
EnvironmentError: OSError: error in pthread_setschedparam
[00:00:04.310990] Testing receive rate 10.000000 Msps on 1 channels
[WARNING] [UHD] Unable to set the thread priority. Performance may be negatively affected.
Please see the general application notes in the manual for instructions.
EnvironmentError: OSError: error in pthread_setschedparam
[00:00:04.318356] Testing transmit rate 10.000000 Msps on 1 channels
[00:00:06.693119] Detected Rx sequence error.
D[00:00:09.402843] Detected Rx sequence error.
DD[00:00:40.927978] Detected Rx sequence error.
D[00:01:44.982243] Detected Rx sequence error.
D[00:02:11.400692] Detected Rx sequence error.
D[00:02:14.805292] Detected Rx sequence error.
D[00:02:41.875596] Detected Rx sequence error.
D[00:03:06.927743] Detected Rx sequence error.
D[00:03:47.967891] Detected Rx sequence error.
D[00:03:58.233659] Detected Rx sequence error.
D[00:03:58.876588] Detected Rx sequence error.
D[00:04:03.139770] Detected Rx sequence error.
D[00:04:45.287465] Detected Rx sequence error.
D[00:04:56.425845] Detected Rx sequence error.
D[00:04:57.929209] Detected Rx sequence error.
[00:05:04.529548] Benchmark complete.

Benchmark rate summary:
  Num received samples: 2995435936
  Num dropped samples: 4622800
  Num overruns detected: 0
  Num transmitted samples: 3008276544
  Num sequence errors (Tx): 0
  Num sequence errors (Rx): 15
  Num underruns detected: 0
  Num late commands: 0
  Num timeouts (Tx): 0
  Num timeouts (Rx): 0

Done!

tcdforge@x310a:/usr/local/lib/lib/uhd/examples$

In this particular case, the nodes are USRP X310s. However, we have the same issue with N210 nodes dropping samples when connected to the BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device.

There is no problem with the USRPs themselves: we have tested them with normal 1G network cards and see no dropped samples.

Personally, I think it's something to do with the 10G network card, possibly an Ubuntu driver issue.

Note that Dell have said there is no hardware problem with the 10G interfaces.

I have followed the troubleshooting information at this link to try to determine the problem:
https://files.ettus.com/manual/page_usrp_x3x0_config.html

- There is no firewall on that port (disabled).
- I tried to change the CPU frequency/power settings but got "no or unknown cpufreq driver is active on this CPU".
- I also changed the cable connecting the USRPs to the 10G SR-IOV port to Cat6a, and I get the same issue.

This is from the VM with the connected USRP X310:

tcdforge@x310a:~$ lspci -nn | grep -i ethernet
00:03.0 Ethernet controller [0200]: Red Hat, Inc. Virtio network device [1af4:1000]
00:05.0 Ethernet controller [0200]: Broadcom Inc. and subsidiaries NetXtreme-E Ethernet Virtual Function [14e4:16dc]
tcdforge@x310a:~$

5e:00.0 Ethernet controller [0200]: Broadcom Inc. and subsidiaries BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet Controller [14e4:16d8] (rev 01)
        Subsystem: Dell BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet Controller [1028:1fea]
        Flags: bus master, fast devsel, latency 0, IRQ 50, NUMA node 0
        Memory at b9a10000 (64-bit, prefetchable) [size=64K]
        Memory at b9100000 (64-bit, prefetchable) [size=1M]
        Memory at b9aa2000 (64-bit, prefetchable) [size=8K]
        Expansion ROM at b9c00000 [disabled] [size=512K]
        Capabilities: <access denied>
        Kernel driver in use: bnxt_en
        Kernel modules: bnxt_en

We get this info from the server:

scamallra@rack9:~$ cpupower frequency-info
analyzing CPU 0:
  no or unknown cpufreq driver is active on this CPU
  CPUs which run at the same hardware frequency: Not Available
  CPUs which need to have their frequency coordinated by software: Not Available
  maximum transition latency: Cannot determine or is not supported.
Not Available
  available cpufreq governors: Not Available
  Unable to determine current policy
  current CPU frequency: Unable to call hardware
  current CPU frequency: Unable to call to kernel
  boost state support:
    Supported: yes
    Active: yes

lsb_release -rd
Description: Ubuntu 18.04.3 LTS
Release: 18.04

ProblemType: Bug
DistroRelease: Ubuntu 18.04
Package: network-manager 1.10.6-2ubuntu1.1
ProcVersionSignature: Ubuntu 4.15.0-70.79-generic 4.15.18
Uname: Linux 4.15.0-70-generic x86_64
ApportVersion: 2.20.9-0ubuntu7.9
Architecture: amd64
Date: Fri Nov 22 17:39:21 2019
NetworkManager.state:
 [main]
 NetworkingEnabled=true
 WirelessEnabled=true
 WWANEnabled=true
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=en_US.UTF-8
 SHELL=/bin/bash
RfKill: Error: [Errno 2] No such file or directory: 'rfkill': 'rfkill'
SourcePackage: network-manager
UpgradeStatus: No upgrade log present (probably fresh install)
nmcli-con: NAME UUID TYPE TIMESTAMP TIMESTAMP-REAL AUTOCONNECT AUTOCONNECT-PRIORITY READONLY DBUS-PATH ACTIVE DEVICE STATE ACTIVE-PATH SLAVE
nmcli-nm:
 RUNNING VERSION STATE STARTUP CONNECTIVITY NETWORKING WIFI-HW WIFI WWAN-HW WWAN
 running 1.10.6 connected started unknown enabled enabled enabled enabled enabled

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1853638/+subscriptions
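
To put the benchmark_rate summary from the bug description into perspective, here is a small Python sketch that pulls the "Num ...: value" pairs out of the summary text and computes the share of Rx samples that were dropped. It assumes the expected number of Rx samples equals received plus dropped; that is an assumption of this sketch, not something the tool states, though it is consistent with the requested 10e6 samples/s over 300 s (3e9 samples). The function names are hypothetical.

```python
"""Estimate the dropped-sample fraction from a benchmark_rate summary."""
import re
from typing import Dict


def parse_summary(text: str) -> Dict[str, int]:
    """Collect 'Num <label>: <integer>' pairs; works on wrapped or flattened text."""
    pairs = re.findall(r"Num ([^:]+):\s*(\d+)", text)
    return {label.strip(): int(value) for label, value in pairs}


def rx_drop_fraction(summary: Dict[str, int]) -> float:
    """Dropped / (received + dropped). The denominator is an assumption."""
    received = summary["received samples"]
    dropped = summary["dropped samples"]
    return dropped / (received + dropped)


# Values copied from the summary in the bug description.
SUMMARY_TEXT = """
Num received samples: 2995435936
Num dropped samples: 4622800
Num overruns detected: 0
Num transmitted samples: 3008276544
Num sequence errors (Tx): 0
Num sequence errors (Rx): 15
"""

summary = parse_summary(SUMMARY_TEXT)
print("Rx drop fraction: {:.4%}".format(rx_drop_fraction(summary)))
```

With the figures from this report the result is roughly 0.15% of Rx samples dropped, spread over 15 sequence errors. That is a much smaller loss rate than the ~8% seen in the mtr test from the closing comment, so the two observations may not share a cause.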

