[Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data

2020-05-11 Thread Nivedita Singhvi
The issue we have reported is easily avoided by specifying the primary port to be the active interface of the bond. On netplan-using systems: Add the directive "primary: $interface" (e.g. "primary: enp94s0f0") to the "parameters:" section of the netplan config file.
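
As a rough illustration of the workaround above, the following sketch renders a netplan stanza with the "primary:" directive placed under "parameters:". The bond name bond0, the file layout and the helper name render_bond_stanza are assumptions made for illustration; the interface names are the primary and secondary ports mentioned elsewhere in this thread. The configuration that MAAS or the installer generated on a real system may look different.

```python
def render_bond_stanza(bond, primary, backup):
    """Return netplan YAML text for an active-backup bond with an explicit primary.

    Hypothetical helper: it only formats text and does not touch the system.
    """
    lines = [
        "network:",
        "  version: 2",
        "  ethernets:",
        f"    {primary}: {{}}",
        f"    {backup}: {{}}",
        "  bonds:",
        f"    {bond}:",
        f"      interfaces: [{primary}, {backup}]",
        "      parameters:",
        "        mode: active-backup",
        # The directive from the workaround: pin the primary port.
        f"        primary: {primary}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # bond0 is an assumed bond name; the ports are the ones named in the thread.
    print(render_bond_stanza("bond0", "enp94s0f0", "enp94s0f1d1"), end="")
```

After editing the real config file, "netplan apply" activates it, and /proc/net/bonding/bond0 should then report the chosen port as "Primary Slave".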

[Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data

2020-05-11 Thread Nivedita Singhvi
Hello, diarmuid, Re: original issue report, were you able to resolve your issue? Please let us know. -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1853638 Title: BCM57416 NetXtreme-E Dual-Media

[Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data

2020-05-11 Thread Nivedita Singhvi
We are closing this LP bug for now because we can't reproduce the problem in-house and cannot get access to a live environment that reproduces it at this time. Here is what we know: - Performance seems to differ for some tests when the NIC is configured with active-backup bonding mode, between

[Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data

2020-02-14 Thread Edwin Peer
Regarding your question about LLDP and IPv6... The default Ubuntu 18.04.3 configuration has an IPv6 enabled kernel, but the interface only has the default link local address configured. I've seen it do router solicitation on link state changes and periodically thereafter. I think I recall seeing

[Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data

2020-02-14 Thread Edwin Peer
The tpa_aborts shouldn't be a concern. They merely indicate that a TCP flow could not be aggregated. That could have a performance impact, of course, but that should manifest as counted drops somewhere if this were the case. Importantly, the tpa_aborts only apply to TCP traffic, but you see the

[Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data

2020-02-14 Thread Nivedita Singhvi
Edwin, Do you happen to notice any IPv6 or LLDP or other link-local traffic on the interfaces? (including backup interface). The MTR loss % is purely a count of their packets transmitted and responses received, so for that UDP MTR test, this is saying that UDP packets were lost, somewhere.

[Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data

2020-02-13 Thread Edwin Peer
I have tried, unsuccessfully, to reproduce this issue internally. Details of my setup below. 1) I have a pair of Dell R210 servers racked (u072 and u073 below), each with a BCM57416 installed: root@u072:~# lspci | grep BCM57416 01:00.0 Ethernet controller: Broadcom Inc. and subsidiaries BCM57416

[Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data

2020-02-12 Thread Nivedita Singhvi
Additional observations. MAAS is being used to deploy the system and configure the bond interface and settings. MAAS allows you to specify which is the primary interface, with the other being the backup, for the active-backup bonding mode. However, it does not appear to be working - it's not

[Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data

2020-02-11 Thread Nivedita Singhvi
Edwin, let me know if you can get in touch with me via the contact email on my Launchpad page. Thanks for all the help!

[Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data

2020-02-11 Thread Nivedita Singhvi
** Attachment added: "ethtool -S for inactive interface enp94s0f0" https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1853638/+attachment/5327556/+files/ethtool-S-enp94s0f0

[Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data

2020-02-11 Thread Nivedita Singhvi
ethtool-enp94s0f0 -- Settings for enp94s0f0: Supported ports: [ FIBRE ] Supported link modes: 1baseT/Full Supported pause frame use: Symmetric Receive-only Supports auto-negotiation: Yes Supported FEC modes: Not reported

[Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data

2020-02-11 Thread Nivedita Singhvi
"Bad" System/NIC: NIC: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet Controller System: Dell Kernel: 5.3.0-28-generic #30~18.04.1-Ubuntu (Note, this issue has been seen on prior kernels as well, upgraded to latest to see if various problems were resolved) Attaching stats/config files from

[Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data

2020-02-11 Thread Nivedita Singhvi
Good System/Good NIC (all configurations work) Comparison NIC: NetXtreme II BCM57000 10 Gigabit Ethernet QLogic 57000 System: Dell Kernel: 5.0.0-25-generic #26~18.04.1-Ubuntu /proc/net/bonding/bond0 --- Ethernet

[Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data

2020-02-11 Thread Nivedita Singhvi
"Bad" Configuration for active-backup mode: $ cat /proc/net/bonding/bond0 Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011) Bonding Mode: fault-tolerance (active-backup) Primary Slave: None Currently Active Slave: enp94s0f1d1
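
The "Primary Slave: None" line above is what separates the bad configuration from the workaround. A minimal sketch of checking this automatically follows; it assumes the usual "Key: value" layout of /proc/net/bonding/bond0, and the function names are hypothetical. The sample text is the fragment quoted in this message.

```python
def parse_bond_status(text):
    """Extract the mode, primary and active slave from bonding status text."""
    wanted = {"Bonding Mode", "Primary Slave", "Currently Active Slave"}
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip()
        # Keep only the first occurrence of each bond-level key.
        if sep and key in wanted and key not in fields:
            fields[key] = value.strip()
    return fields


def primary_is_active(fields):
    """True only if a primary is configured and it is the active slave."""
    primary_field = fields.get("Primary Slave", "None").split()
    # The kernel may append extra text after the name, so keep the first token.
    primary = primary_field[0] if primary_field else "None"
    active = fields.get("Currently Active Slave", "")
    return primary != "None" and primary == active


# Status fragment as quoted in the thread.
SAMPLE = """Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: enp94s0f1d1
"""

if __name__ == "__main__":
    status = parse_bond_status(SAMPLE)
    print(status)
    print("primary is active:", primary_is_active(status))
```

On a live system the text would come from reading /proc/net/bonding/bond0 instead of SAMPLE.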

[Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data

2020-02-11 Thread Edwin Peer
Hi Nivedita, I have been away on PTO the last week and am picking this up again now. Please could you post the full bonding configuration? Regards, Edwin Peer

[Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data

2020-02-10 Thread Nivedita Singhvi
We have narrowed it down to a flaw in a specific configuration setting on this NIC, so we're comparing the good and bad configurations now. Primary port: enp94s0f0 Secondary port: enp94s0f1d1 A] Good config for fault-tolerance (active-backup) bonding mode:

[Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data

2020-02-10 Thread Nivedita Singhvi
The second port on the NIC definitely works as the active interface in an active-backup bonding configuration on the other NICs. At the moment, it's only this particular NIC that is seeing this problem that we know of.

[Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data

2020-02-09 Thread Nivedita Singhvi
Hello Edwin, Here is more information on the issue we are seeing wrt dropped packets and other connectivity issues with this NIC. The problem is *only* seen when the second port on the NIC is chosen as the active interface of an active-backup configuration. So on the "bad" system with the

[Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data

2020-02-04 Thread Nivedita Singhvi
Hey Edwin, sorry, I didn't see your last question. I'll try and confirm but I've seen loss in both directions but it's not clear whether that's significant enough or not yet. e.g., TCP traffic is retransmitted, so it could be segments lost while outgoing or acks lost incoming. 4407

[Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data

2020-01-31 Thread Edwin Peer
I don't think bnxt_en exposes the disable_tpa parameter. Be that as it may, I think the tpa_aborts may be a red herring. TPA aggregates TCP flows and you are seeing the issue with ICMP. In which direction(s) of traffic flow do you see the losses?

[Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data

2020-01-31 Thread Nivedita Singhvi
> NICs between systems? Are OS / kernel and driver > versions the same on both systems? Yes, identical distro release, kernel, and most of the software stack (I have not obtained and examined the full sw stack). Configuration of networking settings is also the same.

[Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data

2020-01-31 Thread Nivedita Singhvi
> There are more than one variable at play here. > Does the problem follow the NIC if you swap the > NICs between systems? Are OS / kernel and driver > versions the same on both systems? Unfortunately, I've not been able to get them to try permutations or switches, as yet, as this is still a

[Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data

2020-01-31 Thread Nivedita Singhvi
Thanks very much for helping on this, Edwin! Please let me know if there's anything specific you need. I'm asking them to disable any IPv6, LLDP traffic in their environment, and retest and collect information again. Also, I'd like to disable tpa, would this be at all useful: modprobe bnx

[Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data

2020-01-31 Thread Nivedita Singhvi
> The mtr packet loss is an interesting result. What mtr options did you use? Is this a UDP or ICMP test? The mtr command was: mtr --no-dns --report --report-cycles 60 $IP_ADDR so ICMP was going out.

[Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data

2020-01-30 Thread Edwin Peer
> 3. mtr ping test > --- > GoodSystem: 0.0% Loss; 0.2 Avg; 0.1 Best, 0.9 Worst, 0.1 StdDev > BadSystem2: 11.7% Loss; 0.1 Avg; 0.1 Best, 0.2 Worst, 0.0 StdDev The mtr packet loss is an interesting result. What mtr options did you use? Is this a UDP or ICMP

[Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data

2020-01-30 Thread Edwin Peer
With respect to one of these situations, this is the following system: > Dell PowerEdge R440/0XP8V5, BIOS 2.2.11 06/14/2019 > > Note that a similar system does not have any issues: > > Dell Inc. PowerEdge R430/0CN7X8, BIOS 2.3.4 11/08/2016 > > So the NIC in the "bad" environment is: > >

[Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data

2020-01-30 Thread diarmuid
Here is the Ettus benchmark tool: https://kb.ettus.com/Verifying_the_Operation_of_the_USRP_Using_UHD_and_GNU_Radio You would need an Ettus device to run those tests. Unfortunately, I can't test the affected node now as it is in production.

[Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data

2020-01-29 Thread Nivedita Singhvi
** Attachment added: "active interface ethtool-S" https://bugs.launchpad.net/ubuntu/+source/network-manager/+bug/1853638/+attachment/5324070/+files/ethtool-S-enp94s0f0

[Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data

2020-01-29 Thread Nivedita Singhvi
** Attachment added: "backup interface ethtool-S" https://bugs.launchpad.net/ubuntu/+source/network-manager/+bug/1853638/+attachment/5324071/+files/ethtool-S-enp94s0f1d1

[Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data

2020-01-29 Thread Nivedita Singhvi
Note that iperf was identical whereas netperf and mtr showed differences (so it's possibly sporadic as well, not continuous) 1. iperf tcp test -- GoodSystem: 9.84 Gbits/sec BadSystem1: 8.37 Gbits/sec BadSystem2: 9.85 Gbits/sec 2. iperf udp test

[Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data

2020-01-29 Thread Nivedita Singhvi
Hello, Edwin, We have two separate users/customers filing reports, and I can answer for one of them. I'll ask the original poster separately as well to reply. With respect to one of these situations, this is the following system: Dell PowerEdge R440/0XP8V5, BIOS 2.2.11 06/14/2019 Note that a

[Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data

2020-01-28 Thread Edwin Peer
I am an engineer at Broadcom and have been assigned to investigate this issue. To that end, I have a few clarifying questions: 1a) What is the benchmark tool you are using and could you provide a link to where I can get it? b) What kind of network traffic is it sending? 2a) In what units are

[Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data

2020-01-28 Thread Edwin Peer
Could you also please dump the ethtool statistics for the NIC?

[Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data

2020-01-17 Thread Nivedita Singhvi
(active interface) > cat ethtool-S-enp94s0f1d1 | grep abort [0]: tpa_aborts: 19775497 [1]: tpa_aborts: 26758635 [2]: tpa_aborts: 12008147 [3]: tpa_aborts: 15829167 [4]: tpa_aborts: 25099500 [5]: tpa_aborts: 3292554 [6]: tpa_aborts: 2863692 [7]: tpa_aborts:
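
For comparing the two ports, the per-ring counters above can be totalled with a short script. This is only a sketch: it assumes the "[ring]: tpa_aborts: count" layout shown in this message, the function name is hypothetical, and the sample holds only the rings whose values are visible here.

```python
import re

# Matches per-ring lines such as "[0]: tpa_aborts: 19775497".
TPA_ABORT_RE = re.compile(r"\[(\d+)\]:\s*tpa_aborts:\s*(\d+)")


def tpa_aborts_per_ring(ethtool_text):
    """Map ring index to tpa_aborts count for every complete entry found."""
    return {int(ring): int(count)
            for ring, count in TPA_ABORT_RE.findall(ethtool_text)}


# Values copied from the message above (rings 0 to 6 only).
SAMPLE = (
    "[0]: tpa_aborts: 19775497 [1]: tpa_aborts: 26758635 "
    "[2]: tpa_aborts: 12008147 [3]: tpa_aborts: 15829167 "
    "[4]: tpa_aborts: 25099500 [5]: tpa_aborts: 3292554 "
    "[6]: tpa_aborts: 2863692"
)

if __name__ == "__main__":
    counts = tpa_aborts_per_ring(SAMPLE)
    for ring, count in sorted(counts.items()):
        print(f"ring {ring}: {count}")
    print("total:", sum(counts.values()))
```

Running the same function on the saved "ethtool -S" output of each interface gives totals that can be compared between the active and backup ports.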

[Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data

2020-01-17 Thread Nivedita Singhvi
We suspect this is a device (hw/fw) issue rather than a NetworkManager or kernel (bnxt_en driver) issue. I've added the kernel for the driver impact (just in case, for now). This is really to eliminate all other causes and confirm whether the device is the root cause. NIC Product Name:

[Bug 1853638] Re: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet device seems to be dropping data

2020-01-17 Thread Nivedita Singhvi
I have reports of the same device appearing to drop packets and incur a greater number of retransmissions under certain circumstances which we're still trying to nail down. I'm using this bug for now until proven to be a different problem. This is causing issues in a production environment. **