[Kernel-packages] [Bug 1854061] Re: bnx2x update in 18.04 hwe edge kernel

2019-11-26 Thread Björn Zettergren
*** This bug is a duplicate of bug 1852512 ***
https://bugs.launchpad.net/bugs/1852512

Alright!

I manually added the missing firmware files, and my network worked as
expected after boot. Looking forward to the updated package in the official
repo. Thanks!

Case closed.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-firmware in Ubuntu.
https://bugs.launchpad.net/bugs/1854061

Title:
  bnx2x update in 18.04 hwe edge kernel

Status in linux-firmware package in Ubuntu:
  New

Bug description:
  Hi,

  linux-image-generic-hwe-18.04-edge version 5.3.0.23.90 seems to require
  updated bnx2x firmware.
  Installing/configuring this kernel version complains:

  W: Possible missing firmware /lib/firmware/bnx2x/bnx2x-e2-7.13.11.0.fw for module bnx2x
  W: Possible missing firmware /lib/firmware/bnx2x/bnx2x-e1h-7.13.11.0.fw for module bnx2x
  W: Possible missing firmware /lib/firmware/bnx2x/bnx2x-e1-7.13.11.0.fw for module bnx2x

  linux-firmware version 1.173.12 does not contain these files, but does
  contain older versions:

  # ls -lhatr /lib/firmware/bnx2x/bnx2x-e*7.13*
  -rw-r--r-- 1 root root 314K Nov 17  2017 /lib/firmware/bnx2x/bnx2x-e2-7.13.1.0.fw
  -rw-r--r-- 1 root root 175K Nov 17  2017 /lib/firmware/bnx2x/bnx2x-e1h-7.13.1.0.fw
  -rw-r--r-- 1 root root 167K Nov 17  2017 /lib/firmware/bnx2x/bnx2x-e1-7.13.1.0.fw

  Please update the firmware for bnx2x, or is it the kernel that needs
  a bug report to reference the older version?

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-firmware/+bug/1854061/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 1854061] [NEW] bnx2x update in 18.04 hwe edge kernel

2019-11-26 Thread Björn Zettergren
Public bug reported:

Hi,

linux-image-generic-hwe-18.04-edge version 5.3.0.23.90 seems to require
updated bnx2x firmware.
Installing/configuring this kernel version complains:

W: Possible missing firmware /lib/firmware/bnx2x/bnx2x-e2-7.13.11.0.fw for module bnx2x
W: Possible missing firmware /lib/firmware/bnx2x/bnx2x-e1h-7.13.11.0.fw for module bnx2x
W: Possible missing firmware /lib/firmware/bnx2x/bnx2x-e1-7.13.11.0.fw for module bnx2x

linux-firmware version 1.173.12 does not contain these files, but does
contain older versions:

# ls -lhatr /lib/firmware/bnx2x/bnx2x-e*7.13*
-rw-r--r-- 1 root root 314K Nov 17  2017 /lib/firmware/bnx2x/bnx2x-e2-7.13.1.0.fw
-rw-r--r-- 1 root root 175K Nov 17  2017 /lib/firmware/bnx2x/bnx2x-e1h-7.13.1.0.fw
-rw-r--r-- 1 root root 167K Nov 17  2017 /lib/firmware/bnx2x/bnx2x-e1-7.13.1.0.fw

Please update the firmware for bnx2x, or is it the kernel that needs a
bug report to reference the older version?
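
The comparison in this report (files the module asks for versus files
present under /lib/firmware) could be scripted roughly as follows. This is
only a sketch: the function name missing_firmware is made up for
illustration, and it assumes the firmware files are stored uncompressed
under the given directory.

```shell
#!/bin/sh
# missing_firmware DIR NAME...
# Hypothetical helper: prints every firmware name (relative to DIR)
# that does not exist as a regular file below DIR.
missing_firmware() {
    fw_dir=$1
    shift
    for fw in "$@"; do
        [ -f "$fw_dir/$fw" ] || printf '%s\n' "$fw"
    done
}

# Possible use on the affected machine (assumes the bnx2x module for the
# running kernel is installed; modinfo lists the firmware names it declares):
#   missing_firmware /lib/firmware $(modinfo -F firmware bnx2x)
```

An empty result would mean that every firmware file the module declares is
present; any line printed names a file that still has to be installed.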

** Affects: linux-firmware (Ubuntu)
 Importance: Undecided
 Status: New



[Kernel-packages] [Bug 1723127] Re: Intel i40e PF reset due to incorrect MDD detection (continues...)

2017-12-06 Thread Björn Zettergren
Sorry for the delay, I've not forgotten about this, just been swamped
with other things. Will hopefully have time to do the tests next week.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1723127

Title:
  Intel i40e PF reset due to incorrect MDD detection (continues...)

Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Xenial:
  In Progress

Bug description:
  This is a continuation from bug 1713553; a patch was added in that bug
  to attempt to fix this, and it may have helped reduce the issue but
  appears not to have fixed it, based on more reports.

  The issue is the i40e driver, when TSO is enabled, sometimes sees the
  NIC firmware issue a "MDD event" where MDD is "Malicious Driver
  Detection".  This is vaguely defined in the i40e spec, but with no way
  to tell what the NIC actually saw that it didn't like.  So, the driver
  can do nothing but print an error message and reset the PF (or VF).
  Unfortunately, this resets the interface, which causes an interruption
  in network traffic flow while the PF is resetting.

  See bug 1713553 for more details.
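
  Since the description ties the resets to TSO being enabled, it can help
  to record the current TSO state of an interface when reporting. A minimal
  sketch, assuming the usual "tcp-segmentation-offload: on|off" line in the
  output of ethtool -k; the function name tso_state is hypothetical:

```shell
#!/bin/sh
# tso_state: reads `ethtool -k IFACE` output on stdin and prints the state
# ("on" or "off") of TCP segmentation offload.
tso_state() {
    awk -F': *' '
        # Match only the top-level feature line, not the indented sub-features.
        $1 == "tcp-segmentation-offload" {
            split($2, parts, " ")   # drop a trailing "[fixed]" marker, if any
            print parts[1]
        }
    '
}

# Possible use (the interface name is just an example):
#   ethtool -k p1p2 | tso_state
```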

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1723127/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 1723127] Re: Intel i40e PF reset due to incorrect MDD detection (continues...)

2017-11-01 Thread Björn Zettergren
No worries, we're not in a hurry.

I'd say we go with option #2. Please provide information on how to
proceed, and how to undo any changes we test :)



[Kernel-packages] [Bug 1723127] Re: Intel i40e PF reset due to incorrect MDD detection (continues...)

2017-10-19 Thread Björn Zettergren
Kernel v4.10 (4.10.0-041000-generic) has been running fine, without any
issues, for 24 hours. I'd say it's OK, as you suspected.



[Kernel-packages] [Bug 1723127] Re: Intel i40e PF reset due to incorrect MDD detection (continues...)

2017-10-18 Thread Björn Zettergren
> one bug at a time, please.

Absolutely! I just mentioned the "GRO implementation" because I wondered
if it might be related. I should have read up on it beforehand; that
would have told me that it wasn't.

I've tested the v4.4-wily kernel in the first link
(4.4.0-040400-generic), and it failed miserably directly after the
machine came online. I'm attaching a redacted syslog with the relevant
messages in it. One thing you'll note is that the i40e driver (1.3.x)
complains that the firmware is too new. This might be a problem(?), but
there's also a message just before the "TX driver issue detected":

i40e :02:00.1: FD filter programming failed due to incorrect filter parameters

See the attached file for more details.

We're currently running the second kernel, v4.10
(4.10.0-041000-generic), and it's running fine so far, but the machine
has only been up for 30 minutes. I'll let it run for 24 hours and report
back tomorrow, or as soon as the status changes, if at all.

** Attachment added: "redacted_i40e_syslog.txt"
   https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1723127/+attachment/4974593/+files/redacted_i40e_syslog.txt



[Kernel-packages] [Bug 1723127] Re: Intel i40e PF reset due to incorrect MDD detection (continues...)

2017-10-13 Thread Björn Zettergren
As of now, we've been running HWE 4.10 for a little more than 16 hours with
no problems so far. Previously we'd hit the problem within the hour.

There is, however, one new log message that we haven't seen before with
either the 1.4.x or the 2.0.x driver. It might be unrelated, though: we
can't see any particular performance issues in any of our
monitoring/graphs. The message is:

TCP: bond0.5: Driver has suspect GRO implementation, TCP performance may be compromised.

How do we proceed? :-)



[Kernel-packages] [Bug 1723127] Re: Intel i40e PF reset due to incorrect MDD detection (continues...)

2017-10-12 Thread Björn Zettergren
We've been using hwe-edge 4.11 for almost 24 hours without problems.
We'll test the regular hwe 4.10 also if you think that narrows the
bisect.



[Kernel-packages] [Bug 1713553] Re: Intel i40e PF reset due to incorrect MDD detection

2017-10-11 Thread Björn Zettergren
Hi,

Thanks for your efforts with this issue, however we're still
experiencing problems with the newest kernel. Sorry about missing the
patch-testing-window, we should have been there for you :)

After only 20 minutes of runtime with the new kernel, we saw the
following, and networking is basically useless:

[2.410644] i40e: Intel(R) Ethernet Connection XL710 Network Driver - version 1.4.25-k
[2.419791] i40e: Copyright (c) 2013 - 2014 Intel Corporation.
[2.483362] i40e :02:00.0: fw 5.40.47690 api 1.5 nvm 5.40 0x80002d35 18.0.16
[2.896678] i40e :02:00.0: MAC address: 3c:fd:fe:1a:b5:e0
[2.903768] i40e :02:00.0: SAN MAC: 3c:fd:fe:1a:b5:e1
[3.189818] i40e :02:00.0: PCI-Express: Speed 8.0GT/s Width x4
[3.193934] i40e :02:00.0: PCI-Express bandwidth available for this device may be insufficient for optimal performance.
[3.202198] i40e :02:00.0: Please move the device to a different PCI-e link with more lanes and/or higher transfer rate.
[3.241095] i40e :02:00.0: Features: PF-id[0] VFs: 64 VSIs: 2 QP: 4 RX: 1BUF RSS FD_ATR FD_SB NTUPLE DCB VxLAN Geneve PTP VEPA
[3.279202] i40e :02:00.1: fw 5.40.47690 api 1.5 nvm 5.40 0x80002d35 18.0.16
[3.531346] i40e :02:00.1: MAC address: 3c:fd:fe:1a:b5:e2
[3.539557] i40e :02:00.1: SAN MAC: 3c:fd:fe:1a:b5:e3
[3.761719] i40e :02:00.1: PCI-Express: Speed 8.0GT/s Width x4
[3.765721] i40e :02:00.1: PCI-Express bandwidth available for this device may be insufficient for optimal performance.
[3.773539] i40e :02:00.1: Please move the device to a different PCI-e link with more lanes and/or higher transfer rate.
[3.812022] i40e :02:00.1: Features: PF-id[1] VFs: 64 VSIs: 2 QP: 4 RX: 1BUF RSS FD_ATR FD_SB NTUPLE DCB VxLAN Geneve PTP VEPA
[3.855168] i40e :02:00.0 p1p1: renamed from eth2
[3.895278] i40e :02:00.1 p1p2: renamed from eth0
[7.205832] i40e :02:00.1 p1p2: already using mac address 3c:fd:fe:1a:b5:e2
[7.208378] i40e :02:00.1 p1p2: NIC Link is Up 10 Gbps Full Duplex, Flow Control: None
[7.208401] i40e :02:00.1 p1p2: adding 3c:fd:fe:1a:b5:e2 vid=0
[7.208453] i40e :02:00.0 p1p1: set new mac address 3c:fd:fe:1a:b5:e2
[7.217191] i40e :02:00.0 p1p1: NIC Link is Up 10 Gbps Full Duplex, Flow Control: None
[7.217215] i40e :02:00.0 p1p1: adding 3c:fd:fe:1a:b5:e2 vid=0
[7.240919] i40e :02:00.1 p1p2: set new mac address 3c:fd:fe:1a:b5:e0
[7.252720] i40e :02:00.0 p1p1: returning to hw mac address 3c:fd:fe:1a:b5:e0
[7.324791] i40e :02:00.1 p1p2: adding 3c:fd:fe:1a:b5:e0 vid=5
[7.324798] i40e :02:00.0 p1p1: adding 3c:fd:fe:1a:b5:e0 vid=5
[ 1109.574733] i40e :02:00.1: TX driver issue detected, PF reset issued
[ 1110.011152] i40e :02:00.1 p1p2: adding 3c:fd:fe:1a:b5:e0 vid=0
[ 1110.011155] i40e :02:00.1 p1p2: adding 3c:fd:fe:1a:b5:e0 vid=5
[ 1110.013749] i40e :02:00.1: TX driver issue detected, PF reset issued
[ 1110.013773] i40e :02:00.1 p1p2: speed changed to 0 for port p1p2
[ 1110.013954] bond0: link status up again after 0 ms for interface p1p2
[ 1110.983823] i40e :02:00.1 p1p2: adding 3c:fd:fe:1a:b5:e0 vid=0
[ 1110.983825] i40e :02:00.1 p1p2: adding 3c:fd:fe:1a:b5:e0 vid=5
[ 1110.985836] bond0: link status up again after 0 ms for interface p1p2
[ 1111.432231] i40e :02:00.0: TX driver issue detected, PF reset issued
[ 1111.981828] i40e :02:00.0 p1p1: adding 3c:fd:fe:1a:b5:e0 vid=0
[ 1111.981835] i40e :02:00.0 p1p1: adding 3c:fd:fe:1a:b5:e0 vid=5
[ 1111.984816] i40e :02:00.0: TX driver issue detected, PF reset issued
[ 1111.987007] bond0: link status up again after 0 ms for interface p1p1
[ 1112.981796] i40e :02:00.0 p1p1: adding 3c:fd:fe:1a:b5:e0 vid=0
[ 1112.981803] i40e :02:00.0 p1p1: adding 3c:fd:fe:1a:b5:e0 vid=5
[ 1112.985812] bond0: link status up again after 0 ms for interface p1p1
[ 1114.204548] i40e :02:00.1: TX driver issue detected, PF reset issued
[ 1114.983686] i40e :02:00.1 p1p2: adding 3c:fd:fe:1a:b5:e0 vid=0
[ 1114.983688] i40e :02:00.1 p1p2: adding 3c:fd:fe:1a:b5:e0 vid=5
[ 1114.985692] bond0: link status up again after 0 ms for interface p1p2
[ 1115.752686] i40e :02:00.1: TX driver issue detected, PF reset issued
[ 1116.985619] i40e :02:00.1 p1p2: adding 3c:fd:fe:1a:b5:e0 vid=0
[ 1116.985624] i40e :02:00.1 p1p2: adding 3c:fd:fe:1a:b5:e0 vid=5
[ 1116.988361] i40e :02:00.1 p1p2: speed changed to 0 for port p1p2
[ 1116.989607] bond0: link status up again after 0 ms for interface p1p2
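
To put a number on how often each port resets, the "PF reset issued" lines
in a log like the one above can be tallied per device. The following is a
rough sketch only; count_pf_resets is a made-up name, and it assumes the
device address is the field directly after the "i40e" tag, as in the log
lines shown here.

```shell
#!/bin/sh
# count_pf_resets: reads kernel log lines on stdin and prints, for each
# device address, how many "PF reset issued" messages were logged.
count_pf_resets() {
    awk '
        /TX driver issue detected, PF reset issued/ {
            for (i = 1; i < NF; i++) {
                if ($i == "i40e") {
                    dev = $(i + 1)
                    sub(/:$/, "", dev)   # strip the trailing colon
                    count[dev]++
                    break
                }
            }
        }
        END { for (dev in count) print dev, count[dev] }
    ' | sort
}

# Possible use:
#   dmesg | count_pf_resets
```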

# uname -a
Linux lb05 4.4.0-97-generic #120-Ubuntu SMP Tue Sep 19 17:28:18 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

# modinfo i40e
filename:   /lib/modules/4.4.0-97-generic/kernel/drivers/net/ethernet/intel/i40e/i40e.ko
version:    1.4.25-k

As a workaround we're using i40e driver v2.0.30 via dkms, which
works fine without any issues so far, but it would be nice to have this
proble

Re: [Kernel-packages] [Bug 1700834] Re: Intel i40e PF reset under load

2017-08-29 Thread Björn Zettergren
>
> There is one additional upstream commit required to fully fix this,
> please see bug 1713553.
>

Ah, nice! Thanks for pointing it out; I had not found it myself. I solved
my problems temporarily by adding the i40e 2.0.30 driver via dkms to my current
system and will follow the other bug (the i40e 2.1.26 driver leaked memory at an
alarming rate, but that's a story for a different bug report, maybe).

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1700834

Title:
  Intel i40e PF reset under load

Status in linux package in Ubuntu:
  Confirmed
Status in linux source package in Xenial:
  Fix Released

Bug description:
  SRU Justification:

  Impact:

  Using an Intel i40e network device, under heavy traffic load with
  TSO enabled, the device will spontaneously reset itself and issue errors
  similar to the following:

  Jun 14 14:09:51 hostname kernel: [4253913.851053] i40e :05:00.1: TX driver issue detected, PF reset issued
  Jun 14 14:09:53 hostname kernel: [4253915.476283] i40e :05:00.1: TX driver issue detected, PF reset issued
  Jun 14 14:09:54 hostname kernel: [4253917.411264] i40e :05:00.1: TX driver issue detected, PF reset issued

  This causes a full reset of the PF, which causes an interruption
  in traffic flow.

  In this case, these errors arise from a bug in the i40e device
  driver introduced by commit:

  commit 584a837e26408c66e87df87a022faa6a54c2b020
  Author: Alexander Duyck
  Date:   Wed Feb 17 11:02:50 2016 -0800

      i40e/i40evf: Rewrite logic for 8 descriptor per packet check

  This patch was added to the Xenial kernel beginning with version
  4.4.0-8.23.  This bug does not manifest on any other Ubuntu kernel series.

  Fix:

  This error is resolved upstream by:

  commit 3f3f7cb875c0f621485644d4fd7453b0d37f00e4
  Author: Alexander Duyck
  Date:   Wed Mar 30 16:15:37 2016 -0700

      i40e/i40evf: Limit TSO to 7 descriptors for payload instead of 8 per packet

  This fix was never backported into the Xenial 4.4 kernel series.

  Testcase:

  In this case, the issue occurs at a customer site using i40e based
  Intel network cards with SR-IOV enabled.  Under heavy load, the card will
  reset itself as described.  The customer has tested the 3f3f7cb875c patch
  in their environment and confirmed that it resolves the issue.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1700834/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 1700834] Re: Intel i40e PF reset under load

2017-08-26 Thread Björn Zettergren
I forgot to mention in my previous comment that this happens within the
hour (usually just a few minutes) of adding the server to production
loads. I can provide more information and test patches if necessary.



[Kernel-packages] [Bug 1700834] Re: Intel i40e PF reset under load

2017-08-25 Thread Björn Zettergren
I'm running Xenial with kernel 4.4.0-92.115 on a Dell R330 with an Intel X710 NIC.
Under load it fails with the message "TX driver issue detected, PF reset issued",
as the original report says.

To me it looks like this issue isn't completely solved, since I'm running
a more recent kernel than the one the "fix" was committed to.
