Ubuntu 19.04 reached end of life on January 23, 2020.  I am therefore
marking this bug 'wontfix' for that release.

** Changed in: linux (Ubuntu Disco)
       Status: Incomplete => Won't Fix

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1840789

Title:
  bnx2x: fatal hardware error/reboot/tx timeout with LLDP enabled

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Xenial:
  Incomplete
Status in linux source package in Bionic:
  Incomplete
Status in linux source package in Disco:
  Won't Fix
Status in linux source package in Eoan:
  Fix Released

Bug description:
  [Impact]

   * The bnx2x driver may cause hardware faults (leading to
     panic/reboot) and other behaviors as transmit timeouts,
     after commit 3968d38917eb ("bnx2x: Fix Multi-Cos.") is
     introduced.

   * This issue has been observed by an user shortly
     after starting docker & kubelet, with adapters:
     - Broadcom NetXtreme II BCM57800 [14e4:168a] from Dell [1028:1f5c]
     - Broadcom NetXtreme II BCM57840 [14e4:16a1] from Dell [1028:1f79]

   * If options to ignore hardware faults are used
     (erst_disable=1 hest_disable=1 ghes.disable=1)
     the system doesn't panic/reboot and continues
     on to timeout on adapter stats, then transmit
     timeouts, spewing some adapter firmware dumps,
     but the network interface is non-functional.

   * The issue only happened when LLDP is enabled
     on the network switches, and crashdump shows
     the bnx2x driver is stuck/waits for firmware
     to complete the stop traffic command in LLDP
     handling. Workaround used is to disable LLDP
     in the network switches/ports.

   * Analysis of the driver and firmware dumps
     didn't help significantly towards finding
     the root cause.

   * Upstream/mainline recently just reverted the
     patch, due to similar problem reports, while
     looking for the root cause/proper fix.

  [Test Case]

   * No reproducible test case found outside
     the user's systems/cluster, where it is
     enough to start docker & kubelet & wait.

   * The user verified test kernels for Xenial
     and Bionic - the problem does not happen;
     build-tested on Disco.

  [Regression Potential]

   * Users who significantly use/apply the non-default
     traffic class (tc) / class of service (cos) might
     possibly see performance changes (if any at all)
     in such applications, however that's unclear now.

   * This is a recent revert upstream (v5.3-rc'ish),
     so there's chance things might change in this area.

   * Nonetheless, the patch is authored by the driver
     vendor, and made its way into stable kernels
     (e.g., v5.2.8 which made Eoan/19.10 recently).

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1840789/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp

Reply via email to