@stefan-n1, please move the discussion over to bug 1723127; no more
comments should be added to this bug.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1713553

Title:
  Intel i40e PF reset due to incorrect MDD detection

Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Xenial:
  Fix Released

Bug description:
  [Impact]

  Using an Intel i40e network device, under heavy traffic load with
  TSO enabled, the device will spontaneously reset itself and issue errors
  similar to the following:

  Jun 14 14:09:51 hostname kernel: [4253913.851053] i40e 0000:05:00.1: TX driver issue detected, PF reset issued
  Jun 14 14:09:53 hostname kernel: [4253915.476283] i40e 0000:05:00.1: TX driver issue detected, PF reset issued
  Jun 14 14:09:54 hostname kernel: [4253917.411264] i40e 0000:05:00.1: TX driver issue detected, PF reset issued

  This causes a full reset of the PF, which interrupts traffic flow.

  This was partially fixed by Xenial commit
  12f8cc59d5886b86372f45290166deca57a60d7a; however, one additional
  upstream commit is required to fully fix the issue:

  commit 841493a3f64395b60554afbcaa17f4350f90e764
  Author: Alexander Duyck <alexander.h.du...@intel.com>
  Date:   Tue Sep 6 18:05:04 2016 -0700

      i40e: Limit TX descriptor count in cases where frag size is greater than 16K

   This fix was never backported into the Xenial 4.4 kernel series, but
  is already present in the Xenial HWE (and Zesty) 4.10 kernel.

  [Testcase]

  In this case, the issue occurs at a customer site using i40e-based
  Intel network cards with SR-IOV enabled. Under heavy load, the card
  resets itself as described.

  [Regression Potential]

  As with any change to a network card driver, this may cause
  regressions with network I/O through i40e card(s).  However, this
  specific change only increases the likelihood that any specific large
  TSO tx will need to be linearized, which will avoid the PF reset.
  Linearizing a TSO tx that did not need to be linearized will not cause
  any failures; it may only decrease performance slightly.  However, this
  patch should only cause linearization when required to avoid the MDD
  detection and PF reset.
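
The linearization reasoning above can be sketched with a simplified
model. This is not the driver's code: the function names are
hypothetical, and the two constants (a limit of 8 descriptors per
packet and 16K - 1 bytes of data per descriptor) are assumed values
chosen for illustration. It only shows why counting fragments, rather
than descriptors, undercounts once a fragment is larger than 16K.

```python
# Simplified model only; NOT the i40e driver implementation.
# Both constants are assumptions made for this illustration.
MAX_DATA_PER_TXD = 16 * 1024 - 1  # assumed max bytes one TX descriptor can carry
MAX_BUFFER_TXD = 8                # assumed max descriptors allowed per packet


def descriptors_for_frag(size: int) -> int:
    """Descriptors needed for one fragment (ceiling division)."""
    return -(-size // MAX_DATA_PER_TXD)


def needs_linearize(frag_sizes: list[int]) -> bool:
    """True if the fragments together need more descriptors than allowed."""
    total = sum(descriptors_for_frag(s) for s in frag_sizes)
    return total > MAX_BUFFER_TXD


# Three 32K fragments look harmless if fragments are counted (3 <= 8),
# but under this model each one needs 3 descriptors, 9 in total.
small_frags = [4096] * 8
large_frags = [32768] * 3
print(needs_linearize(small_frags), needs_linearize(large_frags))
```

In this model, a packet flagged by needs_linearize() would be copied
into a single contiguous buffer before transmission, which costs some
performance but keeps the descriptor count within the limit.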

  [Other Info]

  The previous bug for this issue is bug 1700834.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1713553/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
