[Touch-packages] [Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data
The issue we have reported is easily avoided by specifying the primary port to be the active interface of the bond.

On netplan-using systems: add the directive "primary: $interface" (e.g. "primary: enp94s0f0") to the "parameters:" section of the netplan config file.

-- 
You received this bug notification because you are a member of Ubuntu Touch seeded packages, which is subscribed to network-manager in Ubuntu.
https://bugs.launchpad.net/bugs/1853638

Title:
  BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data

Status in linux package in Ubuntu:
  Confirmed
Status in network-manager package in Ubuntu:
  Confirmed

Bug description:
  The BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device appears to be dropping data.

  Basically, we are dropping data, as you can see from the benchmark tool output below:

  tcdforge@x310a:/usr/local/lib/lib/uhd/examples$ ./benchmark_rate --rx_rate 10e6 --tx_rate 10e6 --duration 300

  [INFO] [UHD] linux; GNU C++ version 5.4.0 20160609; Boost_105800; UHD_3.14.1.1-0-g98c7c986
  [WARNING] [UHD] Unable to set the thread priority. Performance may be negatively affected.
  Please see the general application notes in the manual for instructions.
  EnvironmentError: OSError: error in pthread_setschedparam
  [00:00:00.07] Creating the usrp device with: ...
  [INFO] [X300] X300 initialization sequence...
  [INFO] [X300] Maximum frame size: 1472 bytes.
  [INFO] [X300] Radio 1x clock: 200 MHz
  [INFO] [GPS] Found an internal GPSDO: LC_XO, Firmware Rev 0.929a
  [INFO] [0/DmaFIFO_0] Initializing block control (NOC ID: 0xF1F0D000)
  [INFO] [0/DmaFIFO_0] BIST passed (Throughput: 1308 MB/s)
  [INFO] [0/DmaFIFO_0] BIST passed (Throughput: 1316 MB/s)
  [INFO] [0/Radio_0] Initializing block control (NOC ID: 0x12AD1001)
  [INFO] [0/Radio_1] Initializing block control (NOC ID: 0x12AD1001)
  [INFO] [0/DDC_0] Initializing block control (NOC ID: 0xDDC0)
  [INFO] [0/DDC_1] Initializing block control (NOC ID: 0xDDC0)
  [INFO] [0/DUC_0] Initializing block control (NOC ID: 0xD0C0)
  [INFO] [0/DUC_1] Initializing block control (NOC ID: 0xD0C0)
  Using Device: Single USRP:
    Device: X-Series Device
    Mboard 0: X310
    RX Channel: 0
      RX DSP: 0
      RX Dboard: A
      RX Subdev: SBX-120 RX
    RX Channel: 1
      RX DSP: 0
      RX Dboard: B
      RX Subdev: SBX-120 RX
    TX Channel: 0
      TX DSP: 0
      TX Dboard: A
      TX Subdev: SBX-120 TX
    TX Channel: 1
      TX DSP: 0
      TX Dboard: B
      TX Subdev: SBX-120 TX

  [00:00:04.305374] Setting device timestamp to 0...
  [WARNING] [UHD] Unable to set the thread priority. Performance may be negatively affected.
  Please see the general application notes in the manual for instructions.
  EnvironmentError: OSError: error in pthread_setschedparam
  [00:00:04.310990] Testing receive rate 10.00 Msps on 1 channels
  [WARNING] [UHD] Unable to set the thread priority. Performance may be negatively affected.
  Please see the general application notes in the manual for instructions.
  EnvironmentError: OSError: error in pthread_setschedparam
  [00:00:04.318356] Testing transmit rate 10.00 Msps on 1 channels
  [00:00:06.693119] Detected Rx sequence error.
  D[00:00:09.402843] Detected Rx sequence error.
  DD[00:00:40.927978] Detected Rx sequence error.
  D[00:01:44.982243] Detected Rx sequence error.
  D[00:02:11.400692] Detected Rx sequence error.
  D[00:02:14.805292] Detected Rx sequence error.
  D[00:02:41.875596] Detected Rx sequence error.
  D[00:03:06.927743] Detected Rx sequence error.
  D[00:03:47.967891] Detected Rx sequence error.
  D[00:03:58.233659] Detected Rx sequence error.
  D[00:03:58.876588] Detected Rx sequence error.
  D[00:04:03.139770] Detected Rx sequence error.
  D[00:04:45.287465] Detected Rx sequence error.
  D[00:04:56.425845] Detected Rx sequence error.
  D[00:04:57.929209] Detected Rx sequence error.
  [00:05:04.529548] Benchmark complete.

  Benchmark rate summary:
    Num received samples:     2995435936
    Num dropped samples:      4622800
    Num overruns detected:    0
    Num transmitted samples:  3008276544
    Num sequence errors (Tx): 0
    Num sequence errors (Rx): 15
    Num underruns detected:   0
    Num late commands:        0
    Num timeouts (Tx):        0
    Num timeouts (Rx):        0

  Done!

  tcdforge@x310a:/usr/local/lib/lib/uhd/examples$

  In this particular case description, the nodes are USRP x310s. However, we have the same issue with N210 nodes dropping samples when connected to the BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device.

  There is no problem with the USRPs themselves, as we have tested them with normal 1G network cards and have no dropped samples.

  Personally I think it's something to do with the 10G network card, possibly an Ubuntu driver?

  Note, Dell have said there is no hardware problem with the
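
To put the benchmark summary above in proportion, here is a minimal sketch (not part of the original report) that turns the reported counters into a loss fraction. The function name is hypothetical, and it assumes that the number of samples the host should have seen equals received plus dropped, which the report does not state explicitly.

```python
def drop_fraction(received: int, dropped: int) -> float:
    """Fraction of samples lost, ASSUMING expected = received + dropped."""
    expected = received + dropped
    if expected == 0:
        return 0.0
    return dropped / expected


# Figures copied from the "Benchmark rate summary" in the bug description.
NUM_RECEIVED = 2995435936
NUM_DROPPED = 4622800
NUM_RX_SEQ_ERRORS = 15

if __name__ == "__main__":
    frac = drop_fraction(NUM_RECEIVED, NUM_DROPPED)
    print(f"dropped fraction: {frac:.4%}")
    # Average size of each gap, if every drop is tied to one Rx sequence error.
    per_error = NUM_DROPPED / NUM_RX_SEQ_ERRORS
    print(f"mean samples lost per Rx sequence error: {per_error:.0f}")
```

Under that assumption the loss is roughly 0.15% of the stream, and received plus dropped comes to about 3.0e9 samples, which is in line with 10 Msps over the 300 second run.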
[Touch-packages] [Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data
Hello, diarmuid,

Re: original issue report, were you able to resolve your issue? Please let us know.

Bug description:
  [...] Note, Dell have said there is no hardware problem with the 10G interfaces

  I have followed the troubleshooting information on this link to try determine the problem:
[Touch-packages] [Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data
We are closing this LP bug for now as we aren't able to reproduce in-house, and we cannot get access to a live testing repro env at this time.

Here is what we know:

- There seems to be different performance for some tests when the NIC is configured with active-backup bonding mode, between the case when the active interface is the primary port and the case when the active interface is the secondary port, i.e.:

    Primary port:   enp94s0f0    // when this is the active, works fine
    Secondary port: enp94s0f1d1  // when this is the active, more drops

- Switch info: 2 x Fortigate 1024D switches, each machine is connected to both

- NIC info:

    root@u072:~# lspci | grep BCM57416
    01:00.0 Ethernet controller: Broadcom Inc. and subsidiaries BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet Controller (rev 01)

    # ethtool -i enp1s0f0np0
    driver: bnxt_en
    version: 1.10.0
    firmware-version: 214.0.253.1/pkg 21.40.25.31

- Our attempt at a reproducer (initially reported in production env via graphical monitoring):

    mtr --no-dns --report --report-cycles 60 --udp -s 1428 $DEST

    good system = ~ 0% drops
    bad systems = ~ 8% drops

We are not getting NIC stats drops, nor UDP kernel drops, so it's not clear where the packet is being dropped, whether it's being dropped silently somewhere (?), or whether that's a red herring and an mtr test issue, and what's seen in production is something else.

If someone can reproduce this, or something similar, or if we manage to, we will re-open this bug or file a new one.
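
Since the behaviour described above depends on which port the bond is actually using, it can help to confirm the active and primary slaves before running a test. The following is a rough sketch that parses the bonding driver's status text as exposed under /proc/net/bonding/. The bond name "bond0" and both function names are assumptions for illustration, not something taken from this bug.

```python
from typing import Dict, Optional


def parse_bond_status(text: str) -> Dict[str, Optional[str]]:
    """Extract mode, primary slave and active slave from bonding status text."""
    wanted = {
        "Bonding Mode": "mode",
        "Primary Slave": "primary",
        "Currently Active Slave": "active",
    }
    result: Dict[str, Optional[str]] = {"mode": None, "primary": None, "active": None}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue
        field = wanted.get(key.strip())
        if field is None or result[field] is not None:
            continue  # not a field we care about, or already seen
        value = value.strip()
        if field == "primary":
            # Drop a trailing "(primary_reselect ...)" annotation, if present.
            value = value.split(" (")[0]
        result[field] = value
    if result["primary"] == "None":
        result["primary"] = None  # the driver reports "None" when no primary is set
    return result


def active_is_primary(status: Dict[str, Optional[str]]) -> bool:
    """True only if a primary is configured and it is the active slave."""
    return status["primary"] is not None and status["primary"] == status["active"]


if __name__ == "__main__":
    path = "/proc/net/bonding/bond0"  # assumed bond name
    try:
        with open(path) as fh:
            status = parse_bond_status(fh.read())
        print(status, "active is primary:", active_is_primary(status))
    except OSError as exc:
        print(f"could not read {path}: {exc}")
```

Recording this alongside each mtr run would make it clearer whether a "bad" result coincides with the secondary port being active.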
[Touch-packages] [Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data
Edwin,

Do you happen to notice any IPv6, LLDP or other link-local traffic on the interfaces (including the backup interface)?

The MTR loss % is purely a count of packets transmitted versus responses received, so for that UDP MTR test, this is saying that UDP packets were lost somewhere. The NIC does not show any drops via the ethtool -S stats, but I'm still hunting down which files are the right before/after pair.

Other than the tpa_abort counts, there were no errors that I saw. I can't tell what tpa_abort means for the frame: is it purely a failure to coalesce, or does it end up dropping packets at some point in that functionality? I'm assuming not, since whatever the reason, I would hope those would be counted as drops and printed in the interface stats.

I'll attach all the stats here once I get them sorted out. I thought I had a clean diff of before and after from the tester, but after looking through it, I don't think the file I have is from before/after the mtr test, as there was negligible UDP traffic. I'll try to get clarification from the reporter.

Note that when primary= is used to configure which interface is primary, and the primary port is used as the active interface for the bond, no problems are seen (and that works deterministically to set the correct active interface).
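
The before/after comparison of `ethtool -S` counters described above can be sketched as follows. This is only an illustration under the assumption that the two snapshots were saved as plain `ethtool -S <iface>` output; the function names and the keyword list used to flag counters are hypothetical, and driver-specific counter names vary.

```python
import sys
from typing import Dict


def parse_ethtool_stats(text: str) -> Dict[str, int]:
    """Parse `ethtool -S` output into {counter_name: value}."""
    stats: Dict[str, int] = {}
    for line in text.splitlines():
        # Split on the LAST colon so per-ring names like "[0]: foo" stay intact.
        name, sep, value = line.rpartition(":")
        if not sep:
            continue
        try:
            stats[name.strip()] = int(value.strip())
        except ValueError:
            continue  # header line such as "NIC statistics:" or non-numeric field
    return stats


def changed_counters(before: Dict[str, int], after: Dict[str, int]) -> Dict[str, int]:
    """Return {name: delta} for every counter whose value changed."""
    deltas: Dict[str, int] = {}
    for name, new in after.items():
        delta = new - before.get(name, 0)
        if delta != 0:
            deltas[name] = delta
    return deltas


def suspicious(deltas: Dict[str, int],
               keywords=("drop", "discard", "err", "abort", "miss")) -> Dict[str, int]:
    """Subset of changed counters whose names hint at loss (keyword list is a guess)."""
    return {n: d for n, d in deltas.items() if any(k in n.lower() for k in keywords)}


if __name__ == "__main__":
    if len(sys.argv) == 3:
        with open(sys.argv[1]) as f_before, open(sys.argv[2]) as f_after:
            deltas = changed_counters(parse_ethtool_stats(f_before.read()),
                                      parse_ethtool_stats(f_after.read()))
        for name, delta in sorted(suspicious(deltas).items()):
            print(f"{name}: {delta:+d}")
    else:
        print("usage: script.py BEFORE_FILE AFTER_FILE")
```

Taking the snapshots immediately before and after a single mtr run, on both the active and the backup interface, keeps unrelated traffic out of the deltas.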
[Touch-packages] [Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data
Additional observations.

MAAS is being used to deploy the system and configure the bond interface and settings. MAAS allows you to specify which is the primary interface, with the other being the backup, for the active-backup bonding mode. However, this does not appear to be working: it is not passing along a primary directive, for instance, in the netplan yaml, or otherwise resulting in this being honored (still need to confirm).

MAAS allows you to enter a MAC address for the bond interface, but if one is not supplied, by default it will use the MAC address of the "primary" interface, as configured. MAAS then populates /etc/netplan/50-cloud-init.yaml, including a macaddr= line with the default. netplan then passes that along to systemd-networkd.

The kernel bonding driver, however, in the absence of a primary= directive, will use as the active interface whichever interface is attached to the bond first (i.e., whichever completes getting attached to the bond interface first). It will, however, use the MAC address supplied as an override.

So let's say the active interface was configured in MAAS to be f0, and its MAC is used as the MAC address of the bond, but f1 (the second port of the NIC) actually gets attached to the bond first and is used as the active interface by the bond. We then have a situation where f0 = backup, f1 = active, and bond0 is using the MAC of f0. While this should work, there is a potential for problems depending on the circumstances.

It's likely this has nothing to do with our current issue, but it is noted here for completeness. Will see if we can test/confirm.
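
The gap described above, an active-backup bond deployed without a primary directive, could be caught with a small check over the parsed netplan configuration. The sketch below works on a plain dictionary mirroring the netplan YAML layout (network, bonds, parameters, mode, primary); the function name, the bond name "bond0" and the example configuration are illustrative assumptions, and loading the YAML file itself is left out because it needs a third-party parser.

```python
from typing import Any, Dict, List


def bonds_missing_primary(config: Dict[str, Any]) -> List[str]:
    """Names of active-backup bonds that do not pin a primary interface."""
    bonds = (config.get("network") or {}).get("bonds") or {}
    missing: List[str] = []
    for name, bond in bonds.items():
        params = (bond or {}).get("parameters") or {}
        if params.get("mode") == "active-backup" and not params.get("primary"):
            missing.append(name)
    return missing


# Hypothetical config, shaped like a parsed netplan file with the workaround applied.
EXAMPLE = {
    "network": {
        "version": 2,
        "ethernets": {"enp94s0f0": {}, "enp94s0f1d1": {}},
        "bonds": {
            "bond0": {
                "interfaces": ["enp94s0f0", "enp94s0f1d1"],
                "parameters": {"mode": "active-backup", "primary": "enp94s0f0"},
            }
        },
    }
}

if __name__ == "__main__":
    problems = bonds_missing_primary(EXAMPLE)
    print("bonds without a primary:", problems or "none")
```

Running such a check against the file MAAS generates would show directly whether the primary selection made in MAAS reaches the netplan configuration.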
[Touch-packages] [Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data
ethtool-enp94s0f0

Settings for enp94s0f0:
        Supported ports: [ FIBRE ]
        Supported link modes: 1baseT/Full
        Supported pause frame use: Symmetric Receive-only
        Supports auto-negotiation: Yes
        Supported FEC modes: Not reported
        Advertised link modes: Not reported
        Advertised pause frame use: No
        Advertised auto-negotiation: No
        Advertised FEC modes: Not reported
        Speed: 1Mb/s
        Duplex: Full
        Port: FIBRE
        PHYAD: 1
        Transceiver: internal
        Auto-negotiation: off
        Supports Wake-on: g
        Wake-on: d
        Current message level: 0x (0)
        Link detected: yes

ethtool-i-enp94s0f0

driver: bnxt_en
version: 1.10.0
firmware-version: 214.0.253.1/pkg 21.40.25.31
expansion-rom-version:
bus-info: :5e:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: no
supports-priv-flags: no

ethtool-c-enp94s0f0

Coalesce parameters for enp94s0f0:
Adaptive RX: off  TX: off
stats-block-usecs: 100
sample-interval: 0
pkt-rate-low: 0
pkt-rate-high: 0

rx-usecs: 10
rx-frames: 15
rx-usecs-irq: 1
rx-frames-irq: 1

tx-usecs: 28
tx-frames: 30
tx-usecs-irq: 2
tx-frames-irq: 2

rx-usecs-low: 0
rx-frame-low: 0
tx-usecs-low: 0
tx-frame-low: 0

rx-usecs-high: 0
rx-frame-high: 0
tx-usecs-high: 0
tx-frame-high: 0

ethtool-g-enp94s0f0

Ring parameters for enp94s0f0:
Pre-set maximums:
RX:             2047
RX Mini:        0
RX Jumbo:       8191
TX:             2047
Current hardware settings:
RX:             511
RX Mini:        0
RX Jumbo:       2044
TX:             511

ethtool-k-enp94s0f0

Features for enp94s0f0:
rx-checksumming: on
tx-checksumming: on
        tx-checksum-ipv4: on
        tx-checksum-ip-generic: off [fixed]
        tx-checksum-ipv6: on
        tx-checksum-fcoe-crc: off [fixed]
        tx-checksum-sctp: off [fixed]
scatter-gather: on
        tx-scatter-gather: on
        tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: on
        tx-tcp-segmentation: on
        tx-tcp-ecn-segmentation: off [fixed]
        tx-tcp-mangleid-segmentation: off
        tx-tcp6-segmentation: on
udp-fragmentation-offload: off
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off
rx-vlan-offload: on
tx-vlan-offload: on
ntuple-filters: on
receive-hashing: on
highdma: on [fixed]
rx-vlan-filter: off [fixed]
vlan-challenged: off [fixed]
tx-lockless: off [fixed]
netns-local: off [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: off [fixed]
tx-gre-segmentation: on
tx-gre-csum-segmentation: on
tx-ipxip4-segmentation: on
tx-ipxip6-segmentation: off [fixed]
tx-udp_tnl-segmentation: on
tx-udp_tnl-csum-segmentation: on
tx-gso-partial: on
tx-sctp-segmentation: off [fixed]
tx-esp-segmentation: off [fixed]
tx-udp-segmentation: off [fixed]
fcoe-mtu: off [fixed]
tx-nocache-copy: off
loopback: off [fixed]
rx-fcs: off [fixed]
rx-all: off [fixed]
tx-vlan-stag-hw-insert: on
rx-vlan-stag-hw-parse: on
rx-vlan-stag-filter: off [fixed]
l2-fwd-offload: off [fixed]
hw-tc-offload: on
esp-hw-offload: off [fixed]
esp-tx-csum-hw-offload: off [fixed]
rx-udp_tunnel-port-offload: on
tls-hw-tx-offload: off [fixed]
tls-hw-rx-offload: off [fixed]
rx-gro-hw: on
tls-hw-record: off [fixed]
[Touch-packages] [Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data
** Attachment added: "ethtool -S for inactive interface enp94s0f0"
   https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1853638/+attachment/5327556/+files/ethtool-S-enp94s0f0
[Touch-packages] [Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data
Edwin, let me know if you can get in touch with me via the contact email on my Launchpad page. Thanks for all the help!
[Touch-packages] [Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data
"Bad" System/NIC:

NIC: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet Controller
System: Dell
Kernel: 5.3.0-28-generic #30~18.04.1-Ubuntu
(Note: this issue has been seen on prior kernels as well; we upgraded to the latest to see if various problems were resolved.)

Attaching stats/config files from the NICs on this system (seeing the issue).
[Touch-packages] [Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data
Good System/Good NIC (all configurations work)

Comparison NIC: NetXtreme II BCM57000 10 Gigabit Ethernet QLogic 57000
System: Dell
Kernel: 5.0.0-25-generic #26~18.04.1-Ubuntu

/proc/net/bonding/bond0
---
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: enp5s0f1
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

Slave Interface: enp5s0f1
MII Status: up
Speed: 1 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:00:00:00:73:e2
Slave queue ID: 0

Slave Interface: enp5s0f0
MII Status: up
Speed: 1 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:00:00:00:73:e0
Slave queue ID: 0

/etc/netplan/50-cloud-init.yaml
---
network:
    bonds:
        bond0:
            addresses:
            - 00.00.235.182/25
            gateway4: 00.00.235.129
            interfaces:
            - enp5s0f0
            - enp5s0f1
            macaddress: 00:00:00:00:73:e0
            mtu: 9000
            nameservers:
                addresses:
                - 00.00.235.172
                - 00.00.235.171
                search:
                - maas
            parameters:
                down-delay: 0
                gratuitious-arp: 1
                mii-monitor-interval: 100
                mode: active-backup
                transmit-hash-policy: layer2
                up-delay: 0
    ethernets:
        ...(snip)..
        enp5s0f0:
            match:
                macaddress: 00:00:00:00:73:e0
            mtu: 9000
            set-name: enp5s0f0
        enp5s0f1:
            match:
                macaddress: 00:00:00:00:73:e2
            mtu: 9000
            set-name: enp5s0f1
    version: 2
[Touch-packages] [Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data
"Bad" Configuration for active-backup mode:

$ cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: enp94s0f1d1
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0
Peer Notification Delay (ms): 0

Slave Interface: enp94s0f1d1
MII Status: up
Speed: 1 Mbps
Duplex: full
Link Failure Count: 2
Permanent HW addr: 4c:d9:8f:48:08:da
Slave queue ID: 0

Slave Interface: enp94s0f0
MII Status: up
Speed: 1 Mbps
Duplex: full
Link Failure Count: 2
Permanent HW addr: 4c:d9:8f:48:08:d9
Slave queue ID: 0

---
$ cat uname-rv
5.3.0-28-generic #30~18.04.1-Ubuntu SMP Fri Jan 17 06:14:09 UTC 2020

---
Scrubbed /etc/netplan/50-cloud-init.yaml:

network:
    bonds:
        bond0:
            addresses:
            - 0.0.235.177/25
            gateway4: 0.0.235.129
            interfaces:
            - enp94s0f0
            - enp94s0f1d1
            macaddress: 00:00:00:48:08:00
            mtu: 9000
            nameservers:
                addresses:
                - 0.0.235.171
                - 0.0.235.172
                search:
                - maas
            parameters:
                down-delay: 0
                gratuitious-arp: 1
                mii-monitor-interval: 100
                mode: active-backup
                transmit-hash-policy: layer2
                up-delay: 0
    ethernets:
        eno1:
            match:
                macaddress: 00:00:00:76:6e:ca
            mtu: 1500
            set-name: eno1
        eno2:
            match:
                macaddress: 00:00:00:76:6e:cb
            mtu: 1500
            set-name: eno2
        enp94s0f0:
            match:
                macaddress: 00:00:00:48:08:00
            mtu: 9000
            set-name: enp94s0f0
        enp94s0f1d1:
            match:
                macaddress: 00:00:00:48:08:da
            mtu: 9000
            set-name: enp94s0f1d1
    version: 2
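A quick way to check which port a bond is actually using is to parse the bonding status text. The following is a minimal sketch, assuming the status format shown in this report; the function names `parse_bond_status` and `active_is_primary` are hypothetical helpers, and the sample text is an excerpt of the "bad" system output quoted in this thread.

```python
# Sketch: parse /proc/net/bonding/<bond> style text into dictionaries.
# Helper names are illustrative, not part of any existing tool.

SAMPLE = """\
Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: enp94s0f1d1
MII Status: up

Slave Interface: enp94s0f1d1
MII Status: up
Link Failure Count: 2
Permanent HW addr: 4c:d9:8f:48:08:da

Slave Interface: enp94s0f0
MII Status: up
Link Failure Count: 2
Permanent HW addr: 4c:d9:8f:48:08:d9
"""


def parse_bond_status(text):
    """Return (bond_fields, slaves) from bonding-driver status text."""
    bond = {}
    slaves = []
    current = bond  # fields before the first "Slave Interface" belong to the bond
    for raw in text.splitlines():
        line = raw.strip()
        if ":" not in line:
            continue
        # Split on the first colon only, so MAC addresses stay intact.
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Slave Interface":
            current = {"name": value}
            slaves.append(current)
        else:
            current[key] = value
    return bond, slaves


def active_is_primary(bond):
    """True only if a primary is configured and it is the active slave."""
    primary = bond.get("Primary Slave", "None")
    # Assumption: the driver may append detail after the name, so keep the first token.
    primary_name = primary.split()[0] if primary else "None"
    return primary_name != "None" and primary_name == bond.get("Currently Active Slave")


bond, slaves = parse_bond_status(SAMPLE)
print("active:", bond["Currently Active Slave"])
print("primary configured and active:", active_is_primary(bond))
```

On a live system the same function could be fed the contents of /proc/net/bonding/bond0 instead of `SAMPLE`.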
[Touch-packages] [Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data
We have narrowed it down to a flaw in a specific configuration setting on this NIC, so we're comparing the good and bad configurations now.

Primary port: enp94s0f0
Secondary port: enp94s0f1d1

A] Good config for fault-tolerance (active-backup) bonding mode:
-- Primary port = active interface; Secondary port = backup

B] Bad config for fault-tolerance (active-backup) bonding mode:
-- Primary port = backup interface; Secondary port = active

We are consistently able to reproduce a difference in UDP packet drop rate between the good and bad cases above:

Good Case UDP MTR Test Result
---
mtr --no-dns --report --report-cycles 60 --udp -s 1428 $DEST
Start: 2020-02-10T10:14:01+
HOST: hostname            Loss%   Snt   Last   Avg  Best  Wrst StDev
  1.|-- nn.nn.nnn.nnn      0.0%    60    0.3   0.2   0.2   0.3   0.0

Bad Case UDP MTR Test Result
---
mtr --no-dns --report --report-cycles 60 --udp -s 1428 $DEST
Start: 2020-02-10T14:10:52+
HOST: hostname            Loss%   Snt   Last   Avg  Best  Wrst StDev
  1.|-- nn.nn.nnn.nnn      8.3%    60    0.3   0.3   0.2   0.4   0.0
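To put the two mtr results side by side, the hop rows can be parsed and the loss percentage turned back into a packet count. This is a rough sketch, assuming the usual `mtr --report` column order (Loss%, Snt, Last, Avg, Best, Wrst, StDev) and assuming the fused fields in the report as archived (for example `8.3%600.3`) stand for a loss of 8.3% over 60 probes followed by a last latency of 0.3 ms. `parse_mtr_row` and `estimated_lost` are hypothetical helper names.

```python
# Sketch: parse one hop row of `mtr --report` output.
# Column spacing in the sample rows is an assumption (the archived text is fused).

GOOD_ROW = "  1.|-- nn.nn.nnn.nnn  0.0%    60    0.3   0.2   0.2   0.3   0.0"
BAD_ROW = "  1.|-- nn.nn.nnn.nnn  8.3%    60    0.3   0.3   0.2   0.4   0.0"


def parse_mtr_row(row):
    """Return the numeric columns of a single mtr report row as a dict."""
    fields = row.split()
    # fields[0] is the hop marker ("1.|--"), fields[1] the host.
    last, avg, best, worst, stdev = (float(x) for x in fields[4:9])
    return {
        "host": fields[1],
        "loss_pct": float(fields[2].rstrip("%")),
        "sent": int(fields[3]),
        "last_ms": last,
        "avg_ms": avg,
        "best_ms": best,
        "worst_ms": worst,
        "stdev_ms": stdev,
    }


def estimated_lost(parsed):
    """Approximate number of lost probes implied by the rounded loss percentage."""
    return round(parsed["loss_pct"] / 100 * parsed["sent"])


good = parse_mtr_row(GOOD_ROW)
bad = parse_mtr_row(BAD_ROW)
print("good case lost probes:", estimated_lost(good))
print("bad case lost probes:", estimated_lost(bad))
```

Under these assumptions the bad case corresponds to about 5 of 60 probes lost, against none in the good case.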
[Touch-packages] [Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data
The second port on the NIC definitely works as the active interface in an active-backup bonding configuration on the other NICs. At the moment, it's only this particular NIC that is seeing this problem that we know of.
[Touch-packages] [Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data
Hello Edwin,

Here is more information on the issue we are seeing with dropped packets and other connectivity problems on this NIC.

The problem is *only* seen when the second port on the NIC is chosen as the active interface of an active-backup configuration.

So on the "bad" system with the interfaces:

enp94s0f0 -> when chosen as active, all OK
enp94s0f1d1 -> when chosen as active, not OK

I'll see if the reporters can confirm that on the "good" systems there was no problem when the second interface is active.
[Touch-packages] [Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data
Hey Edwin, sorry, I didn't see your last question. I'll try to confirm, but I've seen loss in both directions, and it's not yet clear whether it is significant. For example, TCP traffic is retransmitted, so the loss could be segments dropped while outgoing or ACKs dropped while incoming:

    4407 retransmitted TCP segments
    130 TCP timeouts

These come from stats collected about 5 minutes apart, which isn't a sufficient sample size. We're trying to get a new collection of stats and logs using the netperf TCP_RR test.

Note that in our case we're more concerned about (and have more solid data on) latency issues than dropped packets, some of which I expect with heavy network testing. For example, the netperf TCP_RR transaction rate is about 70-78% of that of the older systems for 1/1 request/response byte sizes as well as 64/64, 100/200, and 128/8192 sizes.

I'll update here as soon as we have more data from the production environment.
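Since the retransmission figures above come from two snapshots taken a few minutes apart, the comparison can be sketched as a counter delta. This is a minimal sketch, assuming the snapshots are copies of `/proc/net/snmp`; the function names are hypothetical and not part of any tool mentioned in this thread.

```python
def parse_tcp_counters(snmp_text):
    """Return the Tcp counters from /proc/net/snmp-style text as a dict."""
    tcp_lines = [line.split() for line in snmp_text.splitlines()
                 if line.startswith("Tcp:")]
    if len(tcp_lines) < 2:
        raise ValueError("expected a Tcp: header line and a Tcp: value line")
    # The first Tcp: line holds the field names, the second the values.
    names, values = tcp_lines[0][1:], tcp_lines[1][1:]
    return {name: int(value) for name, value in zip(names, values)}


def counter_delta(before, after, name):
    """Difference of one counter between two snapshots."""
    return after[name] - before[name]


def retransmit_ratio(before, after):
    """Retransmitted segments per segment sent during the interval."""
    sent = counter_delta(before, after, "OutSegs")
    retrans = counter_delta(before, after, "RetransSegs")
    return retrans / sent if sent else 0.0
```

To use it, read `/proc/net/snmp` once before and once after a test run and pass both texts through `parse_tcp_counters`. A longer interval, or one that brackets a single netperf run, gives a more meaningful ratio than a raw count.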
[Touch-packages] [Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data
> NICs between systems? Are OS / kernel and driver
> versions the same on both systems?

Yes, identical distro release, kernel, and most of the software stack (I have not obtained and examined the full sw stack). Configuration of networking settings is also the same.
[Touch-packages] [Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data
Thanks very much for helping on this, Edwin! Please let me know if there's anything specific you need.

I'm asking them to disable any IPv6 and LLDP traffic in their environment, then retest and collect information again.

Also, I'd like to disable TPA. Would this be at all useful?

    modprobe bnx disable_tpa=1
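Whether the driver in use accepts a `disable_tpa` module parameter is not confirmed anywhere in this thread. One alternative way to test without receive aggregation, assuming the driver exposes it through the standard `lro` and `gro` offload flags, is to switch those flags off with `ethtool -K`. The sketch below only builds the command; the helper names are hypothetical.

```python
import subprocess


def offload_off_command(interface, features=("lro", "gro")):
    """Build an `ethtool -K` command that turns the given offloads off."""
    cmd = ["ethtool", "-K", interface]
    for feature in features:
        cmd += [feature, "off"]
    return cmd


def disable_aggregation(interface, dry_run=True):
    """Return the command; run it (needs root) only when dry_run is False."""
    cmd = offload_off_command(interface)
    if not dry_run:
        subprocess.run(cmd, check=True)
    return cmd
```

Running `ethtool -k <interface>` before and after shows whether the flags actually changed, since a driver can mark some of them as fixed.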
[Touch-packages] [Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data
> There are more than one variable at play here.
> Does the problem follow the NIC if you swap the
> NICs between systems? Are OS / kernel and driver
> versions the same on both systems?

Unfortunately, I haven't yet been able to get them to try permutations or switches, as this is still a production system/environment. I'll try to obtain more information about it.
[Touch-packages] [Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data
> The mtr packet loss is an interesting result. What mtr options did you use? Is this a UDP or ICMP test?

The mtr command was:

    mtr --no-dns --report --report-cycles 60 $IP_ADDR

so ICMP was going out.
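To repeat the same mtr report over both ICMP and UDP, the invocation can be wrapped in a small helper. This is a sketch only: the function name is hypothetical, and it assumes an mtr build that supports the `--udp` option.

```python
def mtr_report_command(target, cycles=60, udp=False):
    """Build the mtr report command; ICMP echo is mtr's default probe type."""
    cmd = ["mtr", "--no-dns", "--report", "--report-cycles", str(cycles)]
    if udp:
        cmd.append("--udp")  # send UDP datagrams instead of ICMP echo
    cmd.append(target)
    return cmd
```

Running the two variants back to back against the same address would show whether the loss is specific to ICMP or also affects UDP probes.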
[Touch-packages] [Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data
** Attachment added: "backup interface ethtool-S"
   https://bugs.launchpad.net/ubuntu/+source/network-manager/+bug/1853638/+attachment/5324071/+files/ethtool-S-enp94s0f1d1
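The attached `ethtool -S` dumps are long, so a quick way to compare the active and backup interfaces is to pull out only the non-zero counters whose names suggest loss. A minimal sketch, assuming the usual `name: value` layout of `ethtool -S` output; the keyword list and function names are hypothetical, and real counter names vary by driver.

```python
import re

# Substrings that usually mark loss or error counters (an assumption).
SUSPECT = re.compile(r"drop|discard|err|miss", re.IGNORECASE)


def parse_ethtool_stats(text):
    """Parse `ethtool -S` output into a {counter_name: int} dict."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.rpartition(":")
        if not sep:
            continue
        try:
            stats[name.strip()] = int(value)
        except ValueError:
            continue  # header line such as "NIC statistics:" or non-numeric
    return stats


def suspicious_counters(stats):
    """Keep only non-zero counters whose name hints at loss or errors."""
    return {name: value for name, value in stats.items()
            if value and SUSPECT.search(name)}
```

Applying `suspicious_counters` to both attachments and comparing the results narrows the dump down to the counters worth reading by hand.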
[Touch-packages] [Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data
** Attachment added: "active interface ethtool-S"
   https://bugs.launchpad.net/ubuntu/+source/network-manager/+bug/1853638/+attachment/5324070/+files/ethtool-S-enp94s0f0

Bug description (continued):

  D[00:03:47.967891] Detected Rx sequence error.
  D[00:03:58.233659] Detected Rx sequence error.
  D[00:03:58.876588] Detected Rx sequence error.
  D[00:04:03.139770] Detected Rx sequence error.
  D[00:04:45.287465] Detected Rx sequence error.
  D[00:04:56.425845] Detected Rx sequence error.
  D[00:04:57.929209] Detected Rx sequence error.
  [00:05:04.529548] Benchmark complete.

  Benchmark rate summary:
    Num received samples:     2995435936
    Num dropped samples:      4622800
    Num overruns detected:    0
    Num transmitted samples:  3008276544
    Num sequence errors (Tx): 0
    Num sequence errors (Rx): 15
    Num underruns detected:   0
    Num late commands:        0
    Num timeouts (Tx):        0
    Num timeouts (Rx):        0

  Done!

  tcdforge@x310a:/usr/local/lib/lib/uhd/examples$

  In this particular case description, the nodes are USRP x310s. However, we have the same issue with N210 nodes dropping samples connected to the BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device. There is no problem with the USRPs themselves, as we have tested them with normal 1G network cards and have no dropped samples.

  Personally I think its something to do with the 10G network card, possibly on a ubuntu driver???

  Note, Dell have said there is no hardware problem with the 10G interfaces

  I have followed the troubleshooting information on this link to try determine
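The benchmark rate summary quoted in the bug description can be turned into a loss fraction with a few lines of parsing. This is a sketch under the assumption that every dropped sample is accounted for in "Num dropped samples"; the function names are hypothetical.

```python
import re

SUMMARY = """\
Num received samples: 2995435936
Num dropped samples: 4622800
Num overruns detected:0
Num transmitted samples: 3008276544
Num sequence errors (Tx): 0
Num sequence errors (Rx): 15
"""


def parse_summary(text):
    """Map each 'Num <name>: <count>' entry to an integer."""
    return {match.group(1).strip(): int(match.group(2))
            for match in re.finditer(r"Num ([^:\n]+):\s*(\d+)", text)}


def drop_fraction(counts):
    """Dropped samples as a fraction of received plus dropped samples."""
    received = counts["received samples"]
    dropped = counts["dropped samples"]
    return dropped / (received + dropped)


counts = parse_summary(SUMMARY)
print(f"dropped {drop_fraction(counts):.4%} of expected samples")
```

Received plus dropped samples add up to 3000058736, close to the 3e9 samples expected from 10 Msps over 300 seconds, so the 15 Rx sequence errors account for a small but non-zero share of the stream.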
[Touch-packages] [Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data
Note that the iperf results were identical, whereas netperf and mtr showed differences (so the problem is possibly sporadic rather than continuous).

1. iperf tcp test
   GoodSystem.9.84 Gbits/sec
   BadSystem18.37 Gbits/sec
   BadSystem2...9.85 Gbits/sec

2. iperf udp test
   GoodSystem.1.05 Mbits/sec
   BadSystem2...1.05 Mbits/sec

3. mtr ping test
   GoodSystem..0.0% Loss; 0.2 Avg; 0.1 Best, 0.9 Worst, 0.1 StdDev
   BadSystem2...11.7% Loss; 0.1 Avg; 0.1 Best, 0.2 Worst, 0.0 StdDev

4. netperf tcp_rr 1/1 bytes
   GoodSystem..17921.83 t/sec
   BadSystem1.13912.45 t/sec
   BadSystem2

5. netperf tcp_rr 64/64 bytes
   GoodSystem..16987.48 t/sec
   BadSystem1.13355.93 t/sec
   BadSystem2

6. netperf tcp_rr 128/8192 bytes
   GoodSystem..2396.45 t/sec
   BadSystem1.1678.54 t/sec
   BadSystem2

--
https://bugs.launchpad.net/bugs/1853638
Status in linux package in Ubuntu: Confirmed
Status in network-manager package in Ubuntu: Confirmed
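To put the netperf tcp_rr figures from this message side by side, the relative shortfall of BadSystem1 against GoodSystem can be worked out with a few lines of Python. This is only a sketch: the dictionary and function names are hypothetical, and the numbers are the transaction rates copied from the message (no BadSystem2 values were reported for these tests).

```python
# Transaction rates (t/sec) copied from the netperf tcp_rr results in the message.
# Each entry is (GoodSystem, BadSystem1); BadSystem2 values were not reported.
NETPERF_TCP_RR = {
    "1/1 bytes": (17921.83, 13912.45),
    "64/64 bytes": (16987.48, 13355.93),
    "128/8192 bytes": (2396.45, 1678.54),
}


def relative_drop(good: float, bad: float) -> float:
    """Return how far `bad` falls below `good`, as a percentage of `good`."""
    return (good - bad) / good * 100.0


def summarize(results: dict) -> dict:
    """Map each test name to the percentage shortfall of the bad system."""
    return {name: relative_drop(good, bad) for name, (good, bad) in results.items()}


if __name__ == "__main__":
    for name, pct in summarize(NETPERF_TCP_RR).items():
        print(f"tcp_rr {name}: BadSystem1 is {pct:.1f}% below GoodSystem")
```

Request/response tests such as tcp_rr are sensitive to per-packet latency, which may be why they show a gap that the bulk iperf throughput test does not.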
[Touch-packages] [Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data
Hello, Edwin,

We have two separate users/customers filing reports, and I can answer for one of them. I'll separately ask the original poster to reply as well.

For one of these situations, the affected system is:
  Dell PowerEdge R440/0XP8V5, BIOS 2.2.11 06/14/2019

Note that a similar system does not have any issues:
  Dell Inc. PowerEdge R430/0CN7X8, BIOS 2.3.4 11/08/2016

The NIC in the "bad" environment is:
  BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet Controller (rev 01)
  Product Name: Broadcom Adv. Dual 10G SFP+ Ethernet

The NIC in the "good" environment is:
  Broadcom Inc. and subsidiaries NetXtreme II BCM57810 10 Gigabit Ethernet [14e4:1006]
  Product Name: QLogic 57810 10 Gigabit Ethernet

I'll have to scrub some files and see what I can attach; apologies, I'll have them here by tomorrow.

Unfortunately, we don't have an easy reproducer. A single iperf and netperf test (both UDP and TCP) showed identical results from both "good" and "bad" environments. What we have is an identical kernel, network configuration and stack, with the "bad" system showing double to triple the latency when reached from a remote server.

I'll have more information for you shortly regarding the exact k8 cmd.

--
https://bugs.launchpad.net/bugs/1853638
Status in linux package in Ubuntu: Confirmed
Status in network-manager package in Ubuntu: Confirmed
[Touch-packages] [Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data
(active interface)
> cat ethtool-S-enp94s0f1d1 | grep abort
     [0]: tpa_aborts: 19775497
     [1]: tpa_aborts: 26758635
     [2]: tpa_aborts: 12008147
     [3]: tpa_aborts: 15829167
     [4]: tpa_aborts: 25099500
     [5]: tpa_aborts: 3292554
     [6]: tpa_aborts: 2863692
     [7]: tpa_aborts: 20224692

(backup interface)
> cat ethtool-S-enp94s0f0 | grep abort
     [0]: tpa_aborts: 3158584
     [1]: tpa_aborts: 1670319
     [2]: tpa_aborts: 1749371
     [3]: tpa_aborts: 1454301
     [4]: tpa_aborts: 123020
     [5]: tpa_aborts: 1403509
     [6]: tpa_aborts: 1298383
     [7]: tpa_aborts: 1858753

Netted out against the previous capture, there were:
  *f0 = 2014 tpa_aborts
  *d1 = 1118473 tpa_aborts

** Changed in: linux (Ubuntu)
   Status: Incomplete => Confirmed

** Changed in: linux (Ubuntu)
   Importance: Undecided => Critical

--
https://bugs.launchpad.net/bugs/1853638
Status in linux package in Ubuntu: Confirmed
Status in network-manager package in Ubuntu: Confirmed
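The per-ring tpa_aborts counters quoted in this message can be totalled, and two captures can be netted against each other, with a short script. The sketch below assumes the saved `ethtool -S` output has the `[N]: tpa_aborts: VALUE` layout shown in the message; the function names are hypothetical. The earlier capture used for the reported net figures is not included in the message, so the delta helper is shown without real "before" data.

```python
import re

# Counter lines copied from the message (active interface enp94s0f1d1).
ACTIVE_SAMPLE = """\
[0]: tpa_aborts: 19775497
[1]: tpa_aborts: 26758635
[2]: tpa_aborts: 12008147
[3]: tpa_aborts: 15829167
[4]: tpa_aborts: 25099500
[5]: tpa_aborts: 3292554
[6]: tpa_aborts: 2863692
[7]: tpa_aborts: 20224692
"""

# Counter lines copied from the message (backup interface enp94s0f0).
BACKUP_SAMPLE = """\
[0]: tpa_aborts: 3158584
[1]: tpa_aborts: 1670319
[2]: tpa_aborts: 1749371
[3]: tpa_aborts: 1454301
[4]: tpa_aborts: 123020
[5]: tpa_aborts: 1403509
[6]: tpa_aborts: 1298383
[7]: tpa_aborts: 1858753
"""

# Assumed line layout: "[ring]: tpa_aborts: count", one counter per line.
_TPA_RE = re.compile(r"^\s*\[(\d+)\]:\s*tpa_aborts:\s*(\d+)\s*$", re.MULTILINE)


def parse_tpa_aborts(text: str) -> dict:
    """Return {ring_index: tpa_aborts} for every matching line in `text`."""
    return {int(ring): int(count) for ring, count in _TPA_RE.findall(text)}


def total_tpa_aborts(text: str) -> int:
    """Sum tpa_aborts over all rings in one capture."""
    return sum(parse_tpa_aborts(text).values())


def delta_tpa_aborts(before: str, after: str) -> int:
    """Net increase in tpa_aborts between two captures of the same interface."""
    return total_tpa_aborts(after) - total_tpa_aborts(before)


if __name__ == "__main__":
    print("active total:", total_tpa_aborts(ACTIVE_SAMPLE))
    print("backup total:", total_tpa_aborts(BACKUP_SAMPLE))
```

Comparing deltas over a fixed test window, rather than the lifetime totals, is what makes the active and backup interfaces comparable.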
[Touch-packages] [Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data
We suspect this is a device (hw/fw) issue rather than a NetworkManager or kernel (bnxt_en driver) issue. I've added the kernel to cover possible driver impact (just in case, for now). This is really to eliminate all other causes and confirm whether the device is the root cause.

NIC
  Product Name: Broadcom Adv. Dual 10G SFP+ Ethernet
  5e:00.1 Ethernet controller: Broadcom Inc. and subsidiaries BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet Controller (rev 01)

NIC Driver/FW
  driver: bnxt_en
  version: 1.10.0
  firmware-version: 214.0.253.1/pkg 21.40.25.31
  expansion-rom-version:
  bus-info: :5e:00.1
  supports-statistics: yes

Kernel
  5.0.0-37-generic #40~18.04.1-Ubuntu SMP Thu Nov 14 12:06:39 UTC 2019
  (appears to be an issue on all kernel versions)

Environment Configuration
  active-backup bonding mode (having the active backup up *might* potentially be the problem, but it might just be the device itself)

The exact same distro, kernel, applications and configuration work fine with a different NIC (Broadcom 10g bnx2x).

There were quite a few tpa_abort counts (1118473 in total) over the course of a 2 minute iperf test.

Hoping to get more information from other users seeing the same issue.

--
https://bugs.launchpad.net/bugs/1853638
Status in linux package in Ubuntu: Confirmed
Status in network-manager package in Ubuntu: Confirmed
[Touch-packages] [Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data
I have reports of the same device appearing to drop packets and incur a greater number of retransmissions under certain circumstances, which we're still trying to nail down. I'm using this bug for now until it is proven to be a different problem. This is causing issues in a production environment.

** Changed in: network-manager (Ubuntu)
   Status: New => Confirmed

** Changed in: network-manager (Ubuntu)
   Importance: Undecided => Critical

** Tags added: sts

** Also affects: linux (Ubuntu)
   Importance: Undecided
   Status: New

--
https://bugs.launchpad.net/bugs/1853638
Status in linux package in Ubuntu: New
Status in network-manager package in Ubuntu: Confirmed

Bug description:
  The BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data.

  Basically, we are dropping data, as you can see from the benchmark tool output as follows:

  tcdforge@x310a:/usr/local/lib/lib/uhd/examples$ ./benchmark_rate --rx_rate 10e6 --tx_rate 10e6 --duration 300
  [INFO] [UHD] linux; GNU C++ version 5.4.0 20160609; Boost_105800; UHD_3.14.1.1-0-g98c7c986
  [WARNING] [UHD] Unable to set the thread priority. Performance may be negatively affected.
  Please see the general application notes in the manual for instructions.
  EnvironmentError: OSError: error in pthread_setschedparam
  [00:00:00.07] Creating the usrp device with: ...
  [INFO] [X300] X300 initialization sequence...
  [INFO] [X300] Maximum frame size: 1472 bytes.
  [INFO] [X300] Radio 1x clock: 200 MHz
  [INFO] [GPS] Found an internal GPSDO: LC_XO, Firmware Rev 0.929a
  [INFO] [0/DmaFIFO_0] Initializing block control (NOC ID: 0xF1F0D000)
  [INFO] [0/DmaFIFO_0] BIST passed (Throughput: 1308 MB/s)
  [INFO] [0/DmaFIFO_0] BIST passed (Throughput: 1316 MB/s)
  [INFO] [0/Radio_0] Initializing block control (NOC ID: 0x12AD1001)
  [INFO] [0/Radio_1] Initializing block control (NOC ID: 0x12AD1001)
  [INFO] [0/DDC_0] Initializing block control (NOC ID: 0xDDC0)
  [INFO] [0/DDC_1] Initializing block control (NOC ID: 0xDDC0)
  [INFO] [0/DUC_0] Initializing block control (NOC ID: 0xD0C0)
  [INFO] [0/DUC_1] Initializing block control (NOC ID: 0xD0C0)
  Using Device: Single USRP:
    Device: X-Series Device
    Mboard 0: X310
    RX Channel: 0
      RX DSP: 0
      RX Dboard: A
      RX Subdev: SBX-120 RX
    RX Channel: 1
      RX DSP: 0
      RX Dboard: B
      RX Subdev: SBX-120 RX
    TX Channel: 0
      TX DSP: 0
      TX Dboard: A
      TX Subdev: SBX-120 TX
    TX Channel: 1
      TX DSP: 0
      TX Dboard: B
      TX Subdev: SBX-120 TX
  [00:00:04.305374] Setting device timestamp to 0...
  [WARNING] [UHD] Unable to set the thread priority. Performance may be negatively affected.
  Please see the general application notes in the manual for instructions.
  EnvironmentError: OSError: error in pthread_setschedparam
  [00:00:04.310990] Testing receive rate 10.00 Msps on 1 channels
  [WARNING] [UHD] Unable to set the thread priority. Performance may be negatively affected.
  Please see the general application notes in the manual for instructions.
  EnvironmentError: OSError: error in pthread_setschedparam
  [00:00:04.318356] Testing transmit rate 10.00 Msps on 1 channels
  [00:00:06.693119] Detected Rx sequence error.
  D[00:00:09.402843] Detected Rx sequence error.
  DD[00:00:40.927978] Detected Rx sequence error.
  D[00:01:44.982243] Detected Rx sequence error.
  D[00:02:11.400692] Detected Rx sequence error.
  D[00:02:14.805292] Detected Rx sequence error.
  D[00:02:41.875596] Detected Rx sequence error.
  D[00:03:06.927743] Detected Rx sequence error.
  D[00:03:47.967891] Detected Rx sequence error.
  D[00:03:58.233659] Detected Rx sequence error.
  D[00:03:58.876588] Detected Rx sequence error.
  D[00:04:03.139770] Detected Rx sequence error.
  D[00:04:45.287465] Detected Rx sequence error.
  D[00:04:56.425845] Detected Rx sequence error.
  D[00:04:57.929209] Detected Rx sequence error.
  [00:05:04.529548] Benchmark complete.

  Benchmark rate summary:
    Num received samples: 2995435936
    Num dropped samples: 4622800
    Num overruns detected: 0
    Num transmitted samples: 3008276544
    Num sequence errors (Tx): 0
    Num sequence errors (Rx): 15
    Num underruns detected: 0
    Num late commands: 0
    Num timeouts (Tx): 0
    Num timeouts (Rx): 0

  Done!

  tcdforge@x310a:/usr/local/lib/lib/uhd/examples$

  In this particular case description, the nodes are USRP x310s. However, we have the same issue with N210 nodes dropping samples connected to the BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device. There is no problem with the
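For a sense of scale, the figures in the benchmark rate summary quoted in the bug description can be turned into a drop ratio. The following is a rough sketch: the constant and function names are hypothetical, and it assumes "Num received samples" does not already include the dropped samples, which the log does not state.

```python
# Values copied from the "Benchmark rate summary" in the bug description.
RECEIVED_SAMPLES = 2995435936
DROPPED_SAMPLES = 4622800
RX_SEQUENCE_ERRORS = 15


def drop_percentage(dropped: int, received: int) -> float:
    """Dropped samples as a percentage of received samples."""
    return dropped / received * 100.0


def samples_per_error(dropped: int, errors: int) -> float:
    """Average number of samples lost per detected Rx sequence error."""
    return dropped / errors


if __name__ == "__main__":
    pct = drop_percentage(DROPPED_SAMPLES, RECEIVED_SAMPLES)
    avg = samples_per_error(DROPPED_SAMPLES, RX_SEQUENCE_ERRORS)
    print(f"dropped/received: {pct:.3f}%")
    print(f"average samples lost per Rx sequence error: {avg:.0f}")
```

The overall ratio is small, but the loss arrives in bursts: each of the 15 sequence errors accounts for a large block of samples, which fits a picture of intermittent packet loss rather than a steady trickle.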
[Touch-packages] [Bug 1666203] Re: pam_tty_audit failed in pam_open_session
** Changed in: pam (Ubuntu Xenial)
   Importance: Undecided => High

** Changed in: pam (Ubuntu Bionic)
   Importance: Undecided => High

--
You received this bug notification because you are a member of Ubuntu Touch seeded packages, which is subscribed to pam in Ubuntu.
https://bugs.launchpad.net/bugs/1666203

Title:
  pam_tty_audit failed in pam_open_session

Status in pam package in Ubuntu: Fix Released
Status in pam source package in Xenial: Confirmed
Status in pam source package in Bionic: Confirmed
Status in pam package in Debian: Fix Released

Bug description:
  Dear Maintainer,

  I found a bug in pam_tty_audit. When pam_tty_audit is used with other PAM modules (e.g. pam_ldap), it fails in pam_open_session. The failure is triggered by the use of an uninitialized variable in pam_tty_audit.c::pam_open_session.

  * Environments

  Ubuntu 14.04.4 LTS
    linux-image-3.16.0-71-generic 3.16.0-71.92~14.04.1
    libpam-ldap:amd64 184-8.5ubuntu3
    libpam-modules:amd64 1.1.8-1ubuntu2.2

  Ubuntu 16.04.2 LTS
    linux-image-4.4.0-62-generic 4.4.0-62.83
    libpam-ldap:amd64 184-8.7ubuntu1
    libpam-modules:amd64 1.1.8-3.2ubuntu2

  * Reproduction method

  1. Install libpam-ldap.
  2. Add the following to the end of /etc/pam.d/common-session:
     session required pam_tty_audit.so enable=* open_only
  3. When logging in with ssh etc., pam_tty_audit fails and the login fails.

  * Solution (== 2018/04/16 Link updated ==)

  Apply the upstream patch:
  https://github.com/linux-pam/linux-pam/commit/c5f829931a22c65feffee16570efdae036524bee

  * Logs (on Ubuntu 14.04)

  -- auth.log --
  May 18 14:47:03 vm sshd[2272]: Accepted publickey for test from 10.99.0.1 port 51398 ssh2: RSA 8f:39:1c:3a:f4:9d:ca:99:67:fc:e3:fd:1e:0c:5b:a8
  May 18 14:47:03 vm sshd[2272]: pam_unix(sshd:session): session opened for user test by (uid=0)
  May 18 14:47:03 vm sshd[2272]: pam_tty_audit(sshd:session): error setting current audit status: Invalid argument
  May 18 14:47:03 vm sshd[2272]: error: PAM: pam_open_session(): Cannot make/remove an entry for the specified session
  May 18 14:47:03 vm sshd[2297]: Received disconnect from 10.99.0.1: 11: disconnected by user

  -- syslog --
  May 18 14:47:03 vm audispd: node=vm type=USER_ACCT msg=audit(1463550423.399:58): pid=2272 uid=0 auid=4294967295 ses=4294967295 msg='op=PAM:accounting acct="test" exe="/usr/sbin/sshd" hostname=10.99.0.1 addr=10.99.0.1 terminal=ssh res=success'
  May 18 14:47:03 vm audispd: node=vm type=CRED_ACQ msg=audit(1463550423.403:59): pid=2272 uid=0 auid=4294967295 ses=4294967295 msg='op=PAM:setcred acct="test" exe="/usr/sbin/sshd" hostname=10.99.0.1 addr=10.99.0.1 terminal=ssh res=success'
  May 18 14:47:03 vm audispd: node=vm type=LOGIN msg=audit(1463550423.403:60): pid=2272 uid=0 old-auid=4294967295 auid=20299 old-ses=4294967295 ses=3 res=1
  May 18 14:47:03 vm audispd: node=vm type=CONFIG_CHANGE msg=audit(1463550423.403:61): pid=2272 uid=0 auid=20299 ses=3 op=tty_set old-enabled=0 new-enabled=1 old-log_passwd=0 new-log_passwd=32743 res=0
  May 18 14:47:03 vm audispd: node=vm type=USER_START msg=audit(1463550423.447:62): pid=2272 uid=0 auid=20299 ses=3 msg='op=PAM:session_open acct="test" exe="/usr/sbin/sshd" hostname=10.99.0.1 addr=10.99.0.1 terminal=ssh res=failed'
  May 18 14:47:03 vm audispd: node=vm type=CRED_ACQ msg=audit(1463550423.447:63): pid=2297 uid=0 auid=20299 ses=3 msg='op=PAM:setcred acct="test" exe="/usr/sbin/sshd" hostname=10.99.0.1 addr=10.99.0.1 terminal=ssh res=success'
  May 18 14:47:03 vm audispd: node=vm type=CRED_DISP msg=audit(1463550423.451:64): pid=2272 uid=0 auid=20299 ses=3 msg='op=PAM:setcred acct="test" exe="/usr/sbin/sshd" hostname=10.99.0.1 addr=10.99.0.1 terminal=ssh res=success'

  Thanks and regards.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/pam/+bug/1666203/+subscriptions

--
Mailing list: https://launchpad.net/~touch-packages
Post to     : touch-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~touch-packages
More help   : https://help.launchpad.net/ListHelp
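The CONFIG_CHANGE record in the syslog excerpt of this bug shows `new-log_passwd=32743`, an odd value that is consistent with the uninitialized variable the reporter describes. A small Python sketch can pick such values out of an audit record. It assumes that the `enabled` and `log_passwd` fields are 0/1 flags (an assumption, not something the report states), and the helper names are hypothetical.

```python
import re

# CONFIG_CHANGE record copied from the syslog excerpt in the bug description.
RECORD = (
    "node=vm type=CONFIG_CHANGE msg=audit(1463550423.403:61): pid=2272 uid=0 "
    "auid=20299 ses=3 op=tty_set old-enabled=0 new-enabled=1 "
    "old-log_passwd=0 new-log_passwd=32743 res=0"
)

# Assumption for this sketch: these tty-audit fields are boolean flags.
FLAG_SUFFIXES = ("enabled", "log_passwd")


def parse_fields(record: str) -> dict:
    """Split a flat audit record into {key: value} string pairs."""
    return dict(re.findall(r"([\w-]+)=(\S+)", record))


def suspicious_flags(record: str) -> dict:
    """Return flag-like fields whose value is a number other than 0 or 1."""
    found = {}
    for key, value in parse_fields(record).items():
        if key.endswith(FLAG_SUFFIXES) and value.isdigit() and int(value) not in (0, 1):
            found[key] = int(value)
    return found


if __name__ == "__main__":
    print(suspicious_flags(RECORD))
```

The regular expression only handles flat `key=value` records like the one above; records with a quoted `msg='...'` payload would need extra handling.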
[Touch-packages] [Bug 1748147] Re: [SRU] debhelper support override from /etc/tmpfiles.d for systemd
Waiting on Xenial update for this...

--
You received this bug notification because you are a member of Ubuntu Touch seeded packages, which is subscribed to systemd in Ubuntu.
https://bugs.launchpad.net/bugs/1748147

Title:
  [SRU] debhelper support override from /etc/tmpfiles.d for systemd

Status in debhelper: Fix Released
Status in debhelper package in Ubuntu: Fix Released
Status in rsyslog package in Ubuntu: Invalid
Status in systemd package in Ubuntu: Fix Committed
Status in debhelper source package in Xenial: Won't Fix
Status in rsyslog source package in Xenial: Invalid
Status in systemd source package in Xenial: Confirmed
Status in debhelper source package in Artful: Won't Fix
Status in rsyslog source package in Artful: Invalid
Status in systemd source package in Artful: Confirmed
Status in debhelper source package in Bionic: Won't Fix
Status in rsyslog source package in Bionic: Invalid
Status in systemd source package in Bionic: Fix Committed

Bug description:
  [Impact]

  /var/log's permission goes back to 755 after upgrading systemd if rsyslog's configuration is present in /var/lib/tmpfiles.d/.

  Affected: X, A, B, C

  This is because the rsyslog package ships 00rsyslog.conf and copies it to /var/lib/tmpfiles.d/ at install time. After an upgrade, systemd only refreshes its own tmpfiles, so the configuration from 00rsyslog.conf no longer takes effect (the file itself is not removed). Running

  systemd-tmpfiles --create /var/lib/tmpfiles.d/00rsyslog.conf

  restores the permission to 775.

  [Test Case]

  1. deploy 16.04 vm
  2. check ll /var (775)
  3. apt install --reinstall systemd
  4. check ll /var (755)

  [Regression Potential]

  This fix changes debhelper's override process to use the absolute path to the file name. Other packages built with debhelper (e.g. systemd) should in theory be rebuilt with the new debhelper after patching; for now only systemd is affected, and the build itself is not affected.

  Also, packages such as rsyslog that use systemd's tmpfiles mechanism need to be changed to use /etc/tmpfiles.d/[SAME_FILENAME_IN_VAR_LIB_TMPFILES.D_FOR_OVERRIDING] instead of 00rsyslog.conf.

  [Others]

  For this issue, the following packages need to be fixed:
    debhelper
    systemd (rebuilding with new debhelper is needed)
    rsyslog (00rsyslog.conf renamed to var.conf and placed in /etc/tmpfiles.d, to support the override supported by debhelper)

  [Original description]

  Upgrading or reinstalling the systemd package when using rsyslogd results in bad permissions (0755 instead of 0775) being set on /var/log/. As a consequence of this, rsyslogd can no longer create new files within this directory, resulting in lost log messages.

  The default configuration of rsyslogd provided by Ubuntu runs the daemon as syslog:syslog and sets ownership of /var/log to syslog:adm with mode 0775.

  Systemd's default tmpfiles configuration sets /var/log to 0755 in /usr/lib/tmpfiles.d/var.conf; however, this is overridden in /usr/lib/tmpfiles.d/00rsyslog.conf, which is provided by package rsyslog.

  It looks as though an upgrade of the systemd package fails to take /usr/lib/tmpfiles.d/00rsyslog.conf into account, as demonstrated below. This results in /var/log receiving mode 0755 instead of the expected 0775:

  nick @ log2.be1.ams1:~ $ lsb_release -a
  No LSB modules are available.
  Distributor ID: Ubuntu
  Description: Ubuntu 16.04.3 LTS
  Release: 16.04
  Codename: xenial

  nick @ log2.be1.ams1:~ $ apt policy systemd
  systemd:
    Installed: 229-4ubuntu21.1
    Candidate: 229-4ubuntu21.1
    Version table:
   *** 229-4ubuntu21.1 500
          500 http://archive.ubuntu.com/ubuntu xenial-updates/main amd64 Packages
          500 http://security.ubuntu.com/ubuntu xenial-security/main amd64 Packages
          100 /var/lib/dpkg/status
       229-4ubuntu4 500
          500 http://archive.ubuntu.com/ubuntu xenial/main amd64 Packages

  nick @ log2.be1.ams1:~ $ apt policy rsyslog
  rsyslog:
    Installed: 8.16.0-1ubuntu3
    Candidate: 8.16.0-1ubuntu3
    Version table:
   *** 8.16.0-1ubuntu3 500
          500 http://archive.ubuntu.com/ubuntu xenial/main amd64 Packages
          100 /var/lib/dpkg/status

  nick @ log2.be1.ams1:~ $ grep -F /var/log /usr/lib/tmpfiles.d/var.conf
  d /var/log 0755 - - -
  f /var/log/wtmp 0664 root utmp -
  f /var/log/btmp 0600 root utmp -

  nick @ log2.be1.ams1:~ $ cat /usr/lib/tmpfiles.d/00rsyslog.conf
  # Override systemd's default tmpfiles.d/var.conf to make /var/log writable by
  # the syslog group, so that rsyslog can run as user.
  # See tmpfiles.d(5) for details.

  # Type Path Mode UID GID Age Argument
  d /var/log 0775 root syslog -

  nick @ log2.be1.ams1:~ $ ls -ld /var/log
  drwxrwxr-x 8 root syslog 4096 Feb 7 13:45 /var/log

  nick @ log2.be1.ams1:~ $ sudo apt install --reinstall systemd
  Reading package lists... Done
  Building dependency tree
  Reading state information... Done
  0 upgraded, 0 newly installed, 1 reinstalled, 0 to remove and
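The test case in this bug description comes down to comparing the mode of /var/log before and after reinstalling systemd. A minimal Python sketch of that check is shown below; the function names are hypothetical, and the expected mode 0775 is taken from the 00rsyslog.conf line quoted in the description.

```python
import os
import stat

# Mode that the rsyslog tmpfiles.d override is meant to enforce on /var/log.
EXPECTED_MODE = 0o775


def mode_of(path: str) -> int:
    """Return only the permission bits of `path` (e.g. 0o775)."""
    return stat.S_IMODE(os.stat(path).st_mode)


def has_expected_mode(path: str, expected: int = EXPECTED_MODE) -> bool:
    """True if the permission bits of `path` equal `expected`."""
    return mode_of(path) == expected


def group_writable(path: str) -> bool:
    """True if the owning group may write to `path`."""
    return bool(mode_of(path) & stat.S_IWGRP)


if __name__ == "__main__":
    path = "/var/log"
    print(f"{path}: mode {mode_of(path):04o}, "
          f"expected {EXPECTED_MODE:04o}, "
          f"group writable: {group_writable(path)}")
```

Running this before and after `apt install --reinstall systemd` would show whether the group write bit, which rsyslog relies on to create new log files, has been lost.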
[Touch-packages] [Bug 1748147] Re: [SRU] debhelper support override from /etc/tmpfiles.d for systemd
** Tags removed: verification-needed-bionic
** Tags added: verification-done-bionic

-- 
You received this bug notification because you are a member of Ubuntu Touch seeded packages, which is subscribed to systemd in Ubuntu.
https://bugs.launchpad.net/bugs/1748147

Title:
  [SRU] debhelper support override from /etc/tmpfiles.d for systemd

Status in debhelper:
  Fix Released
Status in debhelper package in Ubuntu:
  Fix Released
Status in rsyslog package in Ubuntu:
  Invalid
Status in systemd package in Ubuntu:
  Fix Committed
Status in debhelper source package in Xenial:
  Won't Fix
Status in rsyslog source package in Xenial:
  Invalid
Status in systemd source package in Xenial:
  Confirmed
Status in debhelper source package in Artful:
  Won't Fix
Status in rsyslog source package in Artful:
  Invalid
Status in systemd source package in Artful:
  Confirmed
Status in debhelper source package in Bionic:
  Won't Fix
Status in rsyslog source package in Bionic:
  Invalid
Status in systemd source package in Bionic:
  Fix Committed

Bug description:
  [Impact]

  The permissions of /var/log go back to 755 after upgrading systemd if
  rsyslog's configuration is present in /var/lib/tmpfiles.d/.

  Affected: X, A, B, C

  This is because the rsyslog package ships 00rsyslog.conf and copies it
  to /var/lib/tmpfiles.d/ on installation. After an upgrade, systemd only
  refreshes its own tmpfiles, so the settings from 00rsyslog.conf are no
  longer applied (the file itself is not removed).

  Running
  systemd-tmpfiles --create /var/lib/tmpfiles.d/00rsyslog.conf
  restores the permissions to 775.

  [Test Case]

  1. deploy a 16.04 VM
  2. check ll /var (775)
  3. apt install --reinstall systemd
  4. check ll /var (755)

  [Regression Potential]

  This fix changes debhelper's override process to use the absolute path
  to the file name. Any other packages that use debhelper in this way
  (e.g. systemd) should, in theory, be rebuilt with the new debhelper
  after patching. For now only systemd is affected, and building itself
  is not affected.

  Also, packages like rsyslog that use systemd's tmpfiles mechanism need
  to be changed to use
  /etc/tmpfiles.d/[SAME_FILENAME_IN_VAR_LIB_TMPFILES.D_FOR_OVERRIDING]
  instead of 00rsyslog.conf.

  [Others]

  For this issue, the following packages need to be fixed:

  debhelper
  systemd ( rebuilding with the new debhelper is needed )
  rsyslog ( 00rsyslog.conf renamed to var.conf and located in /etc/tmpfiles.d, to use the override supported by debhelper )

  [Original description]

  Upgrading or reinstalling the systemd package when using rsyslogd
  results in bad permissions (0755 instead of 0775) being set on
  /var/log/. As a consequence of this, rsyslogd can no longer create new
  files within this directory, resulting in lost log messages.

  The default configuration of rsyslogd provided by Ubuntu runs the
  daemon as syslog:syslog and sets ownership of /var/log to syslog:adm
  with mode 0775.

  Systemd's default tmpfiles configuration sets /var/log to 0755 in
  /usr/lib/tmpfiles.d/var.conf; however, this is overridden in
  /usr/lib/tmpfiles.d/00rsyslog.conf, which is provided by package
  rsyslog.

  It looks as though an upgrade of the systemd package fails to take
  /usr/lib/tmpfiles.d/00rsyslog.conf into account, as demonstrated
  below. This results in /var/log receiving mode 0755 instead of the
  expected 0775:

  nick @ log2.be1.ams1:~ $ lsb_release -a
  No LSB modules are available.
  Distributor ID: Ubuntu
  Description:    Ubuntu 16.04.3 LTS
  Release:        16.04
  Codename:       xenial

  nick @ log2.be1.ams1:~ $ apt policy systemd
  systemd:
    Installed: 229-4ubuntu21.1
    Candidate: 229-4ubuntu21.1
    Version table:
   *** 229-4ubuntu21.1 500
          500 http://archive.ubuntu.com/ubuntu xenial-updates/main amd64 Packages
          500 http://security.ubuntu.com/ubuntu xenial-security/main amd64 Packages
          100 /var/lib/dpkg/status
       229-4ubuntu4 500
          500 http://archive.ubuntu.com/ubuntu xenial/main amd64 Packages

  nick @ log2.be1.ams1:~ $ apt policy rsyslog
  rsyslog:
    Installed: 8.16.0-1ubuntu3
    Candidate: 8.16.0-1ubuntu3
    Version table:
   *** 8.16.0-1ubuntu3 500
          500 http://archive.ubuntu.com/ubuntu xenial/main amd64 Packages
          100 /var/lib/dpkg/status

  nick @ log2.be1.ams1:~ $ grep -F /var/log /usr/lib/tmpfiles.d/var.conf
  d /var/log 0755 - - -
  f /var/log/wtmp 0664 root utmp -
  f /var/log/btmp 0600 root utmp -

  nick @ log2.be1.ams1:~ $ cat /usr/lib/tmpfiles.d/00rsyslog.conf
  # Override systemd's default tmpfiles.d/var.conf to make /var/log writable by
  # the syslog group, so that rsyslog can run as user.
  # See tmpfiles.d(5) for details.

  # Type Path Mode UID GID Age Argument
  d /var/log 0775 root syslog -

  nick @ log2.be1.ams1:~ $ ls -ld /var/log
  drwxrwxr-x 8 root syslog 4096 Feb 7 13:45 /var/log

  nick @ log2.be1.ams1:~ $ sudo apt install --reinstall systemd
  Reading package lists... Done
  Building dependency tree
  Reading state information... Done
  0 upgraded, 0
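
The override behaviour described in the bug can be illustrated with a small toy model. This is only a sketch, not systemd's implementation: it assumes the rule from tmpfiles.d(5) that, when several files name the same path, the entry from the lexicographically earliest file name is applied, and that processing only an explicitly named file ignores all the others. The function name `effective_mode` and the in-memory `configs` dictionary are hypothetical, introduced purely for illustration.

```python
def effective_mode(configs, path):
    """Return the mode applied to `path` in this toy model.

    `configs` maps a tmpfiles.d file name to its lines. Files are visited
    in lexicographic order of their names and the first matching "d"
    entry wins (assumption taken from tmpfiles.d(5)).
    """
    for name in sorted(configs):
        for line in configs[name]:
            fields = line.split()
            if len(fields) >= 3 and fields[0] == "d" and fields[1] == path:
                return fields[2]
    return None


# The two entries quoted in the bug description.
configs = {
    "var.conf": ["d /var/log 0755 - - -"],
    "00rsyslog.conf": ["d /var/log 0775 root syslog -"],
}

# All files considered: 00rsyslog.conf sorts first, so its entry applies.
all_files = effective_mode(configs, "/var/log")

# Only var.conf considered, as when just systemd's own files are refreshed.
only_var = effective_mode({"var.conf": configs["var.conf"]}, "/var/log")

print(all_files, only_var)
```

In this model the full set of files yields the rsyslog mode, while processing var.conf on its own yields systemd's default, which matches the symptom reported after reinstalling systemd.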
[Touch-packages] [Bug 1748147] Re: [SRU] debhelper support override from /etc/tmpfiles.d for systemd
Tested, works.

Repro (Xenial):

# dpkg -l | grep systemd
ii systemd 229-4ubuntu21.2 amd64 system and service manager

/var# ll
drwxrwxr-x 8 root syslog 4096 Jul 9 06:25 log/ <-- 775

# apt install --reinstall systemd

/var# ll
drwxr-xr-x 8 root syslog 4096 Jul 9 06:25 log/ <-- 755

Bionic (Verified):

# dpkg -l | grep systemd
ii systemd 237-3ubuntu10.2 amd64 system and service manager

/var# ll
drwxrwxr-x 8 root syslog 4096 Jul 9 13:09 log/ <-- 775

# apt install --reinstall systemd
0 upgraded, 0 newly installed, 1 reinstalled, 0 to remove and 0 not upgraded.
Need to get 2895 kB of archives.
After this operation, 0 B of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu bionic-proposed/main amd64 systemd amd64 237-3ubuntu10.2 [2895 kB]
...

/var# ll
drwxrwxr-x 8 root syslog 4096 Jul 9 13:09 log/ <-- 775
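
The 775 and 755 annotations in the verification comment above are read off the permission strings printed by ll. As a minimal sketch of that mapping, the helper below converts the nine rwx characters into an octal mode. The name `mode_from_ls` is hypothetical, and the helper deliberately ignores setuid, setgid and sticky markers, which do not occur in the listings quoted in this bug.

```python
def mode_from_ls(perm):
    """Convert an ls-style string such as 'drwxrwxr-x' to octal digits.

    Only the nine rwx positions after the file-type character are read;
    special bits (s, t) are not handled in this sketch.
    """
    value = 0
    for ch in perm[1:10]:
        # Each position contributes one bit: set unless it is '-'.
        value = (value << 1) | int(ch != "-")
    return format(value, "03o")


before = mode_from_ls("drwxrwxr-x")  # listing before the reinstall
after = mode_from_ls("drwxr-xr-x")   # Xenial listing after the reinstall
print(before, after)
```

A check like this could be used to script the test case, comparing the mode of /var/log before and after `apt install --reinstall systemd`.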
[Touch-packages] [Bug 1748147] Re: [SRU] debhelper support override from /etc/tmpfiles.d for systemd
I will be testing the updated systemd and will update the bug here once validated.