I am having the same issue. Unfortunately, the systems are already in
production, so testing is limited. The issue was not present at first
(we are not sure why, or perhaps we simply didn't notice it).
I can confirm that I am seeing approx. 8% packet loss and, at times, latency above 1 s.
Both tested with Ubuntu 18.
The issue we have reported is easily avoided by specifying
the primary port to be the active interface of the bond.
On netplan-using systems:
Add the directive "primary: $interface" (e.g. "primary: enp94s0f0")
to the "parameters:" section of the netplan config file.
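For illustration only, here is a minimal sketch of where that directive sits. Python is used just to print the fragment. The bond name (bond0), the active-backup mode and the two interface names are taken from elsewhere in this thread; the helper name bond_stanza and the surrounding file layout are assumptions about a typical netplan file, not the reporter's actual configuration.

```python
def bond_stanza(bond, members, primary, mode="active-backup"):
    """Render a netplan 'bonds:' fragment with an explicit primary interface.

    Hypothetical helper for illustration; it only builds text and does not
    touch any file under /etc/netplan.
    """
    if primary not in members:
        raise ValueError("primary must be one of the bond members")
    lines = [
        "network:",
        "  version: 2",
        "  bonds:",
        f"    {bond}:",
        f"      interfaces: [{', '.join(members)}]",
        "      parameters:",
        f"        mode: {mode}",
        # The workaround from this thread: pin the primary interface.
        f"        primary: {primary}",
    ]
    return "\n".join(lines)


stanza = bond_stanza("bond0", ["enp94s0f0", "enp94s0f1d1"], "enp94s0f0")
print(stanza)
```

The member interfaces would still need their own entries under "ethernets:" in a complete file; that part is omitted here.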
Hello, diarmuid,
Re: original issue report, were you able to resolve your issue?
Please let us know.
--
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to network-manager in Ubuntu.
https://bugs.launchpad.net/bugs/1853638
Title:
We are closing this LP bug for now as we aren't able to reproduce
in-house, and we cannot get access to a live testing repro env
at this time.
Here is what we know:
- There seems to be different performance for some tests when
the NIC is configured with active-backup bonding mode, between
Regarding your question about LLDP and IPv6...
The default Ubuntu 18.04.3 configuration has an IPv6 enabled kernel, but
the interface only has the default link local address configured. I've
seen it do router solicitation on link state changes and periodically
thereafter. I think I recall seeing s
The tpa_aborts shouldn't be a concern. They merely indicate that a TCP
flow could not be aggregated. That could have a performance impact, of
course, but that should manifest as counted drops somewhere if this were
the case.
Importantly, the tpa_aborts only apply to TCP traffic, but you see the
pr
Edwin,
Do you happen to notice any IPv6 or LLDP or other link-local traffic
on the interfaces? (including backup interface).
The MTR loss % is purely a count of packets transmitted versus
responses received, so for that UDP MTR test, this is saying
that UDP packets were lost somewhere.
The
I have tried, unsuccessfully, to reproduce this issue internally.
Details of my setup below.
1) I have a pair of Dell R210 servers racked (u072 and u073 below), each
with a BCM57416 installed:
root@u072:~# lspci | grep BCM57416
01:00.0 Ethernet controller: Broadcom Inc. and subsidiaries BCM57416
Additional observations.
MAAS is being used to deploy the system and configure
the bond interface and settings.
MAAS allows you to specify which is the primary interface, with
the other being the backup, for the active-backup bonding mode.
However, it does not appear to be working -it's not passi
** Attachment added: "ethtool -S for inactive interface enp94s0f0"
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1853638/+attachment/5327556/+files/ethtool-S-enp94s0f0
ethtool-enp94s0f0
--
Settings for enp94s0f0:
Supported ports: [ FIBRE ]
Supported link modes: 1baseT/Full
Supported pause frame use: Symmetric Receive-only
Supports auto-negotiation: Yes
Supported FEC modes: Not reported
Adv
Edwin, let me know if you can get in touch with me via the contact email
on my Launchpad page. Thanks for all the help!
"Bad" System/NIC:
NIC: BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet Controller
System: Dell
Kernel: 5.3.0-28-generic #30~18.04.1-Ubuntu
(Note: this issue has been seen on prior kernels as well; we upgraded
to the latest to see whether various problems were resolved.)
Attaching stats/config files from
Good System/Good NIC (all configurations work) Comparison
NIC: NetXtreme II BCM57000 10 Gigabit Ethernet QLogic 57000
System: Dell
Kernel: 5.0.0-25-generic #26~18.04.1-Ubuntu
/proc/net/bonding/bond0
---
Ethernet Chan
"Bad" Configuration for active-backup mode:
$ cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)
Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: enp94s0f1d1
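A quick way to check for this state on a host is to read the same file and compare the two fields. The sketch below is only an illustration: bond_status is a hypothetical helper name, and the sample text is the excerpt shown above, not a full dump.

```python
def bond_status(text):
    """Extract 'Key: value' pairs from /proc/net/bonding/<bond> output.

    Only the first occurrence of each key is kept, so the bond-level
    fields win over the per-interface sections that repeat some keys.
    """
    status = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() not in status:
            status[key.strip()] = value.strip()
    return status


sample = """\
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)
Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: enp94s0f1d1
"""

status = bond_status(sample)
primary_unset = status["Primary Slave"] == "None"
print(status["Currently Active Slave"], "primary unset:", primary_unset)
```

On a live system the input would come from open("/proc/net/bonding/bond0").read() instead of the sample string.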
Hi Nivedita,
I have been away on PTO the last week and am picking this up again now.
Please could you post the full bonding configuration?
Regards,
Edwin Peer
We have narrowed it down to a flaw in a specific configuration setting
on this NIC, so we're comparing the good and bad configurations now.
Primary port: enp94s0f0
Secondary port: enp94s0f1d1
A] Good config for fault-tolerance (active-backup) bonding mode:
---
The second port on the NIC definitely works as the active
interface in an active-backup bonding configuration on the
other NICs.
At the moment, it's only this particular NIC that is seeing
this problem that we know of.
Hello Edwin,
Here is more information on the issue we are seeing wrt dropped
packets and other connectivity issues with this NIC.
The problem is *only* seen when the second port on the NIC is
chosen as the active interface of an active-backup configuration.
So on the "bad" system with the inte
Hey Edwin, sorry, I didn't see your last question.
I'll try to confirm, but I've seen loss in both
directions; it's not yet clear whether that's
significant.
e.g., TCP traffic is retransmitted, so it could be segments
lost while outgoing or acks lost incoming.
4407 retransmit
I don't think bnxt_en exposes the disable_tpa parameter. Be that as it
may, I think the tpa_aborts may be a red herring. TPA aggregates TCP
flows and you are seeing the issue with ICMP.
In which direction(s) of traffic flow do you see the losses?
> NICs between systems? Are OS / kernel and driver
> versions the same on both systems?
Yes, identical distro release, kernel, and most of the software
stack (I have not obtained and examined the full sw stack).
Configuration of networking settings is also the same.
> There are more than one variable at play here.
> Does the problem follow the NIC if you swap the
> NICs between systems? Are OS / kernel and driver
> versions the same on both systems?
Unfortunately, I've not been able to get them to try
permutations or switches, as yet, as this is still a
pr
Thanks very much for helping on this, Edwin! Please let me
know if there's anything specific you need.
I'm asking them to disable any IPv6, LLDP traffic in their environment,
and retest and collect information again.
Also, I'd like to disable tpa, would this be at all useful:
modprobe bnx disab
> The mtr packet loss is an interesting result. What mtr options did you
> use? Is this a UDP or ICMP test?
The mtr command was:
mtr --no-dns --report --report-cycles 60 $IP_ADDR
so ICMP was going out.
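As a rough sanity check on what a report like this means: in report mode mtr sends one probe per cycle to each hop, so with 60 cycles the loss column is simply lost probes over 60. The sketch below assumes 7 unanswered probes, which is a guess chosen to illustrate the arithmetic; the thread only reports percentages, not counts.

```python
def loss_percent(sent, received):
    """Packet loss as a percentage of probes sent."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    return 100.0 * (sent - received) / sent


# 60 report cycles, 53 replies seen (7 lost is an assumed figure).
pct = loss_percent(60, 53)
print(f"{pct:.1f}% loss")
```

With only 60 probes each lost packet moves the figure by about 1.7 points, so a longer run (a larger --report-cycles value) would give a steadier estimate.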
> 3. mtr ping test
> ---
> GoodSystem..0.0% Loss; 0.2 Avg; 0.1 Best, 0.9 Worst, 0.1 StdDev
> BadSystem2...11.7% Loss; 0.1 Avg; 0.1 Best, 0.2 Worst, 0.0 StdDev
The mtr packet loss is an interesting result. What mtr options did you
use? Is this a UDP or ICMP test?
> With respect to one of these situations, this is the following system:
> Dell PowerEdge R440/0XP8V5, BIOS 2.2.11 06/14/2019
>
> Note that a similar system does not have any issues:
>
> Dell Inc. PowerEdge R430/0CN7X8, BIOS 2.3.4 11/08/2016
>
> So the NIC in the "bad" environment is:
>
> BCM574
Here is the Ettus benchmark tool
https://kb.ettus.com/Verifying_the_Operation_of_the_USRP_Using_UHD_and_GNU_Radio
You would need an Ettus device to run those tests.
Unfortunately, I can't test the affected node now as it is in production.
** Attachment added: "backup interface ethtool-S"
https://bugs.launchpad.net/ubuntu/+source/network-manager/+bug/1853638/+attachment/5324071/+files/ethtool-S-enp94s0f1d1
** Attachment added: "active interface ethtool-S"
https://bugs.launchpad.net/ubuntu/+source/network-manager/+bug/1853638/+attachment/5324070/+files/ethtool-S-enp94s0f0
Note that the iperf results were identical, whereas netperf and mtr showed
differences (so the problem is possibly sporadic rather than continuous).
1. iperf tcp test
--
GoodSystem....9.84 Gbits/sec
BadSystem1....8.37 Gbits/sec
BadSystem2....9.85 Gbits/sec
2. iperf udp test
-
Hello, Edwin,
We have two separate users/customers filing reports, and I can answer for
one of them. I'll ask the original poster separately as well to reply.
With respect to one of these situations, this is the following system:
Dell PowerEdge R440/0XP8V5, BIOS 2.2.11 06/14/2019
Note that a si
I am an engineer at Broadcom and have been assigned to investigate this
issue. To that end, I have a few clarifying questions:
1a) What is the benchmark tool you are using and could you provide a
link to where I can get it?
b) What kind of network traffic is it sending?
2a) In what units are th
Could you also please dump the ethtool statistics for the NIC?
--
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to network-manager in Ubuntu.
https://bugs.launchpad.net/bugs/1853638
Title:
BCM57416 NetXtreme-E Dual-Media 10G RD
(active interface)
> cat ethtool-S-enp94s0f1d1 | grep abort
[0]: tpa_aborts: 19775497
[1]: tpa_aborts: 26758635
[2]: tpa_aborts: 12008147
[3]: tpa_aborts: 15829167
[4]: tpa_aborts: 25099500
[5]: tpa_aborts: 3292554
[6]: tpa_aborts: 2863692
[7]: tpa_aborts: 2
We suspect this is a device (hw/fw) issue rather than a NetworkManager
or kernel (bnxt_en driver) issue. I've added the kernel for the driver
impact (just in case, for now). This is really to eliminate all other
causes and confirm whether the device is the root cause.
NIC
Product Name: Bro
I have reports of the same device appearing to drop packets and incur
a greater number of retransmissions under certain circumstances, which
we're still trying to nail down.
I'm using this bug for now until proven to be a different problem.
This is causing issues in a production environment.
** Ch