Public bug reported:

[impact]

The i40e driver sometimes triggers a "Malicious Driver Detection" (MDD)
event in the NIC firmware, and the firmware responds by resetting the
NIC. The reset interrupts network traffic at least temporarily, and can
cause further problems, e.g. if the interface is part of a bond.

[fix]

The fix for this is currently unknown.  The "MDD event" is generated by
the i40e firmware and is undocumented, so there is no way to tell what
the i40e driver did to trigger it.

[test case]

The bug is unfortunately very difficult to reproduce on demand, but as
shown in the comments on this (and previous) bugs, some i40e users have
traffic that consistently reproduces the problem, although it usually
takes days or longer. A reproduction is easy to detect: network traffic
is interrupted and the system logs contain a message like:

i40e 0000:02:00.1: TX driver issue detected, PF reset issued
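
For anyone waiting days for a reproduction, a small script can watch for
that message instead of checking by hand. The following is only a minimal
sketch: the pattern is based solely on the log line quoted above (other
kernel versions may word it differently), and the function name
find_pf_resets and the script file name are illustrative, not part of
any existing tool.

```python
import re
import sys

# Assumption: the message has the exact wording quoted in this bug report.
# The PCI address (domain:bus:device.function) is captured for reporting.
PF_RESET_RE = re.compile(
    r"i40e (?P<pci>[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]): "
    r"TX driver issue detected, PF reset issued"
)


def find_pf_resets(lines):
    """Return the PCI address from each log line that reports a PF reset."""
    hits = []
    for line in lines:
        match = PF_RESET_RE.search(line)
        if match:
            hits.append(match.group("pci"))
    return hits


if __name__ == "__main__":
    # Reads log text from standard input and prints one PCI address per hit.
    for pci in find_pf_resets(sys.stdin):
        print(pci)
```

It could be fed from the kernel log, e.g. "dmesg | python3
find_pf_resets.py", and prints the PCI address of each interface that was
reset.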

[regression potential]

Unknown, since the specific fix is unknown.

[original description]

This is a continuation of bug 1713553 and then bug 1723127. A patch was
added in each of those bugs to attempt to fix this; the patches may have
reduced the frequency of the issue, but based on further reports they do
not appear to have fixed it.

See bug 1713553 and bug 1723127 for more details.

** Affects: linux (Ubuntu)
     Importance: Undecided
     Assignee: Dan Streetman (ddstreet)
         Status: In Progress

** Affects: linux (Ubuntu Xenial)
     Importance: Undecided
     Assignee: Dan Streetman (ddstreet)
         Status: In Progress

** Affects: linux (Ubuntu Bionic)
     Importance: Undecided
     Assignee: Dan Streetman (ddstreet)
         Status: In Progress

** Affects: linux (Ubuntu Cosmic)
     Importance: Undecided
     Assignee: Dan Streetman (ddstreet)
         Status: In Progress

** Also affects: linux (Ubuntu Cosmic)
   Importance: Undecided
       Status: New

** Also affects: linux (Ubuntu Bionic)
   Importance: Undecided
       Status: New

** Also affects: linux (Ubuntu Xenial)
   Importance: Undecided
       Status: New

** Changed in: linux (Ubuntu Xenial)
     Assignee: (unassigned) => Dan Streetman (ddstreet)

** Changed in: linux (Ubuntu Bionic)
     Assignee: (unassigned) => Dan Streetman (ddstreet)

** Changed in: linux (Ubuntu Cosmic)
     Assignee: (unassigned) => Dan Streetman (ddstreet)

** Changed in: linux (Ubuntu Cosmic)
       Status: New => In Progress

** Changed in: linux (Ubuntu Bionic)
       Status: New => In Progress

** Changed in: linux (Ubuntu Xenial)
       Status: New => In Progress

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1772675

Title:
  Intel i40e PF reset due to incorrect MDD detection
  (continues...again...)

Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Xenial:
  In Progress
Status in linux source package in Bionic:
  In Progress
Status in linux source package in Cosmic:
  In Progress


To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1772675/+subscriptions
