Marking status on B/X/D as Incomplete.
(email below sent to kernel-team mailing list
as replies to both patch series above).
Please hold / don't apply this patch for now.
The reporter hit an apparently unrelated Oops on 3 of 40 nodes,
and it hasn't yet been possible to determine whether this patch
is at all related or at fault, because timing/deployment matters
prevent a methodical approach of reverting to an original kernel.
Since the patch is recent even in the mainline kernel, holding
it back for a bit seems the most prudent action for the LTS
releases; that also means dropping it for Disco, which would
require the patch too.
We'll follow up on this as far as is possible on the reporter's end.
** Changed in: linux (Ubuntu Xenial)
Status: In Progress => Incomplete
** Changed in: linux (Ubuntu Bionic)
Status: In Progress => Incomplete
** Changed in: linux (Ubuntu Disco)
Status: In Progress => Incomplete
--
https://bugs.launchpad.net/bugs/1840789
Title:
bnx2x: fatal hardware error/reboot/tx timeout with LLDP enabled
Status in linux package in Ubuntu:
Fix Released
Status in linux source package in Xenial:
Incomplete
Status in linux source package in Bionic:
Incomplete
Status in linux source package in Disco:
Incomplete
Status in linux source package in Eoan:
Fix Released
Bug description:
[Impact]
* The bnx2x driver may cause hardware faults (leading to
panic/reboot) and other misbehavior such as transmit
timeouts, after commit 3968d38917eb ("bnx2x: Fix
Multi-Cos.") was introduced.
* This issue has been observed by a user shortly
after starting docker & kubelet, with adapters:
- Broadcom NetXtreme II BCM57800 [14e4:168a] from Dell [1028:1f5c]
- Broadcom NetXtreme II BCM57840 [14e4:16a1] from Dell [1028:1f79]
* If options to ignore hardware faults are used
(erst_disable=1 hest_disable=1 ghes.disable=1)
the system doesn't panic/reboot; instead it goes
on to time out on adapter stats, then hits
transmit timeouts and spews adapter firmware
dumps, and the network interface is non-functional.
* The issue only happens when LLDP is enabled
on the network switches, and the crashdump shows
the bnx2x driver stuck waiting for firmware
to complete the stop traffic command in LLDP
handling. The workaround used is to disable LLDP
on the network switches/ports.
* Analysis of the driver and firmware dumps
didn't help significantly towards finding
the root cause.
* Upstream/mainline recently reverted the
patch, due to similar problem reports, while
looking for the root cause/proper fix.
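
To help check whether a node carries one of the adapters named
above, here is a minimal sketch that scans `lspci -nn` style
output for the two reported PCI vendor:device IDs. The function
name and the sample text are illustrative assumptions, not part
of the report. A match only means the hardware model is the same;
it does not show that the running kernel contains the commit.

```python
import re

# PCI vendor:device IDs quoted in the report (BCM57800 and BCM57840).
REPORTED_IDS = {"14e4:168a", "14e4:16a1"}

# Matches bracketed vendor:device pairs such as [14e4:168a].
# A bare class code such as [0200] has no colon and is ignored.
_ID_RE = re.compile(r"\[([0-9a-f]{4}):([0-9a-f]{4})\]", re.IGNORECASE)


def find_reported_adapters(lspci_text):
    """Return the lines of `lspci -nn` style text that mention a reported ID.

    Hypothetical helper for illustration only.
    """
    hits = []
    for line in lspci_text.splitlines():
        ids = {f"{vendor}:{device}".lower()
               for vendor, device in _ID_RE.findall(line)}
        if ids & REPORTED_IDS:
            hits.append(line.strip())
    return hits


if __name__ == "__main__":
    # Illustrative sample text; on a real node, pass the output of
    # `lspci -nn` instead.
    sample = (
        "01:00.0 Ethernet controller [0200]: NetXtreme II BCM57800 [14e4:168a]\n"
        "02:00.0 VGA compatible controller [0300]: Example device [1234:5678]\n"
    )
    for hit in find_reported_adapters(sample):
        print(hit)
```

The Dell subsystem IDs ([1028:1f5c], [1028:1f79]) are not checked
here, since plain `lspci -nn` output does not list them.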
[Test Case]
* No reproducible test case found outside
the user's systems/cluster, where it is
enough to start docker & kubelet & wait.
* The user verified test kernels for Xenial
and Bionic - the problem does not happen;
build-tested on Disco.
[Regression Potential]
* Users who make significant use of non-default
traffic classes (tc) / classes of service (cos) might
see performance changes (if any at all) in such
applications, but that is unclear for now.
* This is a recent revert upstream (v5.3-rc'ish),
so there's a chance things might change in this area.
* Nonetheless, the patch is authored by the driver
vendor, and made its way into stable kernels
(e.g., v5.2.8 which made Eoan/19.10 recently).