[Kernel-packages] [Bug 1937133] Re: devlink_port_split from ubuntu_kernel_selftests.net fails on hirsute (KeyError: 'flavour')

2023-05-15 Thread Olivier FAURAX
Is this bug fixed in the linux-nvidia-5.19/5.19.0-1010.10 kernel?

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1937133

Title:
  devlink_port_split from ubuntu_kernel_selftests.net fails on hirsute
  (KeyError: 'flavour')

Status in ubuntu-kernel-tests:
  In Progress
Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Hirsute:
  Won't Fix
Status in linux source package in Jammy:
  Fix Committed
Status in linux source package in Kinetic:
  Fix Committed
Status in linux source package in Lunar:
  Fix Released

Bug description:
  [Impact]
  On s390x LPAR instances, this test will fail with:
  #   File "linux/tools/testing/selftests/net/devlink_port_split.py", line 64, in get_if_names
  # if ports[port]['flavour'] == 'physical':
  # KeyError: 'flavour'
  not ok 1 selftests: net: devlink_port_split.py # exit=1

  This is because the mlx4 driver in use on this instance does not set
  attributes, so the output of `devlink -j port show` does not contain
  the "flavour" key.

  [Fix]
  * 3de66d08d3 selftests: net: devlink_port_split.py: skip test if no suitable device available

  This patch can be cherry-picked into our J/K/L kernels.
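
  The failure mode and a defensive lookup can be sketched as follows. This is only an illustration and not the upstream patch: the sample JSON is a hypothetical, simplified shape for `devlink -j port show` output, and the helper name `physical_if_names` is an assumption made for this example.

```python
import json

# Hypothetical sample shaped like `devlink -j port show` output.
# The second port carries no 'flavour' attribute, as reported for mlx4.
SAMPLE = json.loads("""
{
  "port": {
    "pci/0000:01:00.0/1": {"type": "eth", "netdev": "eth0", "flavour": "physical"},
    "pci/0001:00:00.0/1": {"type": "eth", "netdev": "eth1"}
  }
}
""")


def physical_if_names(ports):
    """Return netdev names of ports whose flavour is 'physical'.

    dict.get() returns None for a missing key, so ports without a
    'flavour' attribute are skipped instead of raising KeyError.
    """
    names = []
    for attrs in ports.values():
        if attrs.get("flavour") == "physical" and "netdev" in attrs:
            names.append(attrs["netdev"])
    return names


if __name__ == "__main__":
    names = physical_if_names(SAMPLE["port"])
    if not names:
        # Mirrors the intent of the fix: skip rather than fail.
        print("SKIP: no suitable device available")
    else:
        print(names)
```

  With an input where no port has a "flavour" key, the helper returns an empty list, which a test wrapper could treat as a SKIP condition.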

  [Test]
  Run the patched devlink_port_split.py on an s390x LPAR; it should no
  longer fail with KeyError: 'flavour' and should be marked as SKIP instead.

  [Where problems could occur]
  If this change is incorrect, it may affect the test result. However, it
  is limited to testing tools and has no actual impact on kernel functions.


  [Original Bug Report]
  Failing on hirsute/linux 5.11.0-26.28  host s2lp4

  Not a regression as this is also failing on 5.11.0-24.25

  17:16:32 DEBUG| [stdout] # selftests: net: devlink_port_split.py
  17:16:32 DEBUG| [stdout] # Traceback (most recent call last):
  17:16:32 DEBUG| [stdout] #   File "/home/ubuntu/autotest/client/tmp/ubuntu_kernel_selftests/src/linux/tools/testing/selftests/net/./devlink_port_split.py", line 283, in <module>
  17:16:32 DEBUG| [stdout] # main()
  17:16:32 DEBUG| [stdout] #   File "/home/ubuntu/autotest/client/tmp/ubuntu_kernel_selftests/src/linux/tools/testing/selftests/net/./devlink_port_split.py", line 256, in main
  17:16:32 DEBUG| [stdout] # ports = devlink_ports(dev)
  17:16:32 DEBUG| [stdout] #   File "/home/ubuntu/autotest/client/tmp/ubuntu_kernel_selftests/src/linux/tools/testing/selftests/net/./devlink_port_split.py", line 70, in __init__
  17:16:32 DEBUG| [stdout] # self.if_names = devlink_ports.get_if_names(dev)
  17:16:32 DEBUG| [stdout] #   File "/home/ubuntu/autotest/client/tmp/ubuntu_kernel_selftests/src/linux/tools/testing/selftests/net/./devlink_port_split.py", line 64, in get_if_names
  17:16:32 DEBUG| [stdout] # if ports[port]['flavour'] == 'physical':
  17:16:32 DEBUG| [stdout] # KeyError: 'flavour'
  17:16:32 DEBUG| [stdout] not ok 44 selftests: net: devlink_port_split.py # exit=1

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-kernel-tests/+bug/1937133/+subscriptions


-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 2013603] Re: Kernel livepatch ftrace graph fix

2023-05-15 Thread Olivier FAURAX
Is this bug fixed in the linux/5.15.0-72.79 kernel?

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2013603

Title:
  Kernel livepatch ftrace graph fix

Status in Ubuntu on IBM z Systems:
  Fix Released
Status in linux package in Ubuntu:
  Invalid
Status in linux source package in Jammy:
  Fix Committed

Bug description:
  [Impact]
  * Additional patch required to support Livepatch for s390x
  * Fixes Livepatch transition issues when using ftrace graph tracing

  [Test Case]
  * Compile test
  * Boot test
  * Test a Livepatch (patch to /proc/meminfo module)
  * Test Livepatch from ftrace graphed function (via https://github.com/SUSE/qa_test_klp/, klp_tc_10.sh)

  [Where things could go wrong]
  * Functionality already exists upstream; once the kernel is booted and Livepatch is tested, there should be no regressions

  [Other info]
  * Additional required patch was identified (https://github.com/dynup/kpatch/commit/324a43714b1227b5688e22966a5ee4414c8861d1) due to ftrace graph livepatch transition issue (https://github.com/SUSE/qa_test_klp/issues/17).

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-z-systems/+bug/2013603/+subscriptions




[Kernel-packages] [Bug 2013209] Re: expoline.o is packaged unconditionally for s390x

2023-05-10 Thread Olivier FAURAX
> While this works as expected on Jammy, ...

Do we need to test it for jammy, then?

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-hwe-5.15 in Ubuntu.
https://bugs.launchpad.net/bugs/2013209

Title:
  expoline.o is packaged unconditionally for s390x

Status in linux package in Ubuntu:
  Fix Released
Status in linux-hwe-5.15 package in Ubuntu:
  New
Status in linux-hwe-5.15 source package in Focal:
  Fix Released
Status in linux source package in Jammy:
  Fix Committed
Status in linux source package in Kinetic:
  Fix Committed
Status in linux source package in Lunar:
  Fix Released

Bug description:
  https://bugs.launchpad.net/bugs/1639924 enabled CONFIG_EXPOLINE_EXTERN
  for s390x in Jammy. While this works as expected on Jammy, it won't
  work on some derivatives of it: for example focal:hwe-5.15. On Focal,
  this config can't be enabled due to the GCC version it comes with.
  CONFIG_EXPOLINE_EXTERN requires a GCC version >= 110200 (GCC 11.2.0),
  while Focal comes with 90400 (GCC 9.4.0).
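
  The version numbers above appear to follow the kernel's numeric GCC version encoding (major * 10000 + minor * 100 + patchlevel). A small sketch for decoding and comparing them is shown below; the function and constant names are hypothetical and chosen only for this illustration.

```python
def encode_gcc_version(major, minor, patch):
    """Encode a GCC version as major * 10000 + minor * 100 + patch."""
    return major * 10000 + minor * 100 + patch


def decode_gcc_version(code):
    """Split an encoded GCC version back into (major, minor, patch)."""
    major, rest = divmod(code, 10000)
    minor, patch = divmod(rest, 100)
    return major, minor, patch


# Values quoted in the bug description.
REQUIRED = 110200  # minimum for CONFIG_EXPOLINE_EXTERN
FOCAL = 90400      # compiler shipped with Focal


def can_enable_expoline_extern(gcc_code, required=REQUIRED):
    """Return True if the compiler meets the minimum encoded version."""
    return gcc_code >= required


if __name__ == "__main__":
    print(decode_gcc_version(REQUIRED))
    print(decode_gcc_version(FOCAL))
    print(can_enable_expoline_extern(FOCAL))
```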

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2013209/+subscriptions




[Kernel-packages] [Bug 2004262] Re: Intel E810 NICs driver in causing hangs when booting and bonds configured

2023-04-27 Thread Olivier FAURAX
5.19.0-1010.10 works OK

root@m3-small-x86-01:~# uname -a
Linux m3-small-x86-01 5.19.0-1010-nvidia #10-Ubuntu SMP PREEMPT_DYNAMIC Tue Apr 25 23:39:48 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
root@m3-small-x86-01:~# lspci|grep E810
01:00.0 Ethernet controller: Intel Corporation Ethernet Controller E810-XXV for SFP (rev 02)
01:00.1 Ethernet controller: Intel Corporation Ethernet Controller E810-XXV for SFP (rev 02)
root@m3-small-x86-01:~# ping 8.8.8.8 -c 4
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=119 time=0.706 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=119 time=0.774 ms
64 bytes from 8.8.8.8: icmp_seq=3 ttl=119 time=0.729 ms
64 bytes from 8.8.8.8: icmp_seq=4 ttl=119 time=0.703 ms

--- 8.8.8.8 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3076ms
rtt min/avg/max/mdev = 0.703/0.728/0.774/0.028 ms
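
A check like the one above could be automated by parsing the ping summary line. The following is a minimal sketch, assuming the summary format shown in this output; the helper name `parse_ping_summary` is hypothetical.

```python
import re

# Matches a summary such as:
# "4 packets transmitted, 4 received, 0% packet loss, time 3076ms"
SUMMARY_RE = re.compile(
    r"(\d+) packets transmitted, (\d+) received, (\d+(?:\.\d+)?)% packet loss"
)


def parse_ping_summary(text):
    """Return (transmitted, received, loss_percent) from ping output."""
    match = SUMMARY_RE.search(text)
    if match is None:
        raise ValueError("no ping summary line found")
    return int(match.group(1)), int(match.group(2)), float(match.group(3))


if __name__ == "__main__":
    line = "4 packets transmitted, 4 received, 0% packet loss, time 3076ms"
    sent, received, loss = parse_ping_summary(line)
    print("OK" if sent == received and loss == 0.0 else "FAIL")
```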


** Tags removed: verification-needed-jammy
** Tags added: verification-done-jammy

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2004262

Title:
  Intel E810 NICs driver in causing hangs when booting and bonds
  configured

Status in linux package in Ubuntu:
  Confirmed
Status in linux source package in Jammy:
  Fix Committed
Status in linux source package in Kinetic:
  Fix Committed
Status in linux source package in Lunar:
  Confirmed

Bug description:
  [Impact]
* Intel E810-family NICs cause system hangs when booting with bonding enabled
* This happens due to the driver unplugging auxiliary devices
* The unplug event happens under RTNL lock context, which causes a deadlock where the RDMA driver waits for the RTNL lock to complete removal
  [Test Plan]
* Users have reported that after setting up bonding on switch and server side, the system will hang when starting network services

  [Fix]
* The upstream patch defers unplugging/re-plugging of the auxiliary device, so that it's not performed under the RTNL lock context.
* Fix was introduced by commit:
248401cb2c46 ice: avoid bonding causing auxiliary plug/unplug under RTNL lock
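
The deferral idea described above can be illustrated with a loose, single-threaded analogy in Python. This is not the ice driver code: the lock stands in for RTNL, and `handle_bonding_event` and `unplug_aux_dev` are hypothetical names used only for this sketch. Work that needs the lock is queued while the lock is held and run only after it has been released.

```python
import threading


def unplug_aux_dev(lock, log):
    """Stand-in for removal work that itself needs the lock."""
    with lock:
        log.append("aux device unplugged")


def handle_bonding_event(lock, log):
    """Handle an event under the lock, deferring lock-taking work."""
    deferred = []
    with lock:
        log.append("event handled under lock")
        # Calling unplug_aux_dev(lock, log) right here would block forever:
        # threading.Lock is not reentrant and this thread already holds it.
        deferred.append(lambda: unplug_aux_dev(lock, log))
    # The lock has been released; the deferred work can now take it safely.
    for work in deferred:
        work()


if __name__ == "__main__":
    events = []
    handle_bonding_event(threading.Lock(), events)
    print(events)
```

The real deadlock involves two parties waiting on each other, so this sketch only shows why moving the unplug outside the locked region avoids the wait.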

  [Regression Potential]
* Regressions would manifest in devices that support RDMA functionality and
  have been added to a bond
* We should look out for auxiliary devices that haven't been properly
  unplugged, or that cause further issues with
  ice_plug_aux_dev()/ice_unplug_aux_dev()

  
  [Original Description]
  jammy 22.04.1
  linux-image-generic 5.15.0-58-generic
  Intel E810-XXV Dual Port NICs in Dell PowerEdge R650

  - 5.15 in jammy -> reproducible
  - 5.19 in hwe-edge -> reproducible
  - 6.2.rc6 in the mainline build -> works
  - Intel's ice driver 1.10.1.2.2 -> works

  After bonding is enabled on switch and server side, the system will
  hang while initializing Ubuntu.  The kernel loads, but around starting
  the Network Services the system can hang, sometimes for 5 minutes and
  in other cases indefinitely.

  The message of:

  "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"  systemd-resolve
  blocked for more than 120 seconds

  appears, and eventually the network services just attempt to start
  and never do.  This is with or without DHCP enabled.

  Tried this same setup with the hwe-22.04, hwe-20.04, hwe-22.04-edge and
  linux-oem kernels and all exhibit the same failure.

  To work around this, installing the Intel 'ice' driver version
  1.10.1.2.2 works.  The system doesn't even remotely hang at startup
  and all networking functions remain working (ping, DNS, general
  accessibility).

  The driver can be found at https://downloadmirror.intel.com/763930/ice-1.10.1.2.2.tar.gz
  ---
  ProblemType: Bug
  AlsaDevices:
   total 0
   crw-rw 1 root audio 116,  1 Jan 31 13:08 seq
   crw-rw 1 root audio 116, 33 Jan 31 13:08 timer
  AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
  ApportVersion: 2.20.11-0ubuntu82.3
  Architecture: amd64
  ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
  CRDA: N/A
  CasperMD5json:
   {
     "result": "skip"
   }
  DistroRelease: Ubuntu 22.04
  InstallationDate: Installed on 2023-01-27 (3 days ago)
  InstallationMedia: Ubuntu-Server 22.04.1 LTS "Jammy Jellyfish" - Release amd64 (20220809)
  IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig'
  MachineType: Dell Inc. PowerEdge R650
  Package: linux (not installed)
  PciMultimedia:

  ProcFB: 0 mgag200drmfb
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-5.15.0-58-generic root=UUID=668aab7c-abe9-434b-a810-acc6eab76cbc ro fsck.mode=skip
  ProcVersionSignature: Ubuntu 5.15.0-58.64-generic 5.15.74
  RelatedPackageVersions:
   linux-restricted-modules-5.15.0-58-generic N/A
   linux-backports-modules-5.15.0-58-generic  N/A
   linux-firmware 20220329.git681281e4-0ubuntu3.9
  

[Kernel-packages] [Bug 2004262] Re: Intel E810 NICs driver in causing hangs when booting and bonds configured

2023-04-25 Thread Olivier FAURAX
Works for 5.19.0-42:

root@m3-small-x86-01:~# uname -a
Linux m3-small-x86-01 5.19.0-42-generic #43-Ubuntu SMP PREEMPT_DYNAMIC Tue Apr 18 18:21:28 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
root@m3-small-x86-01:~# lspci|grep E810
01:00.0 Ethernet controller: Intel Corporation Ethernet Controller E810-XXV for SFP (rev 02)
01:00.1 Ethernet controller: Intel Corporation Ethernet Controller E810-XXV for SFP (rev 02)
root@m3-small-x86-01:~# ping 8.8.8.8 -c 4
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=119 time=0.712 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=119 time=0.690 ms
64 bytes from 8.8.8.8: icmp_seq=3 ttl=119 time=0.706 ms
64 bytes from 8.8.8.8: icmp_seq=4 ttl=119 time=0.767 ms

--- 8.8.8.8 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3054ms
rtt min/avg/max/mdev = 0.690/0.718/0.767/0.029 ms


** Tags removed: verification-needed-kinetic
** Tags added: verification-done-kinetic


[Kernel-packages] [Bug 2004262] Re: Intel E810 NICs driver in causing hangs when booting and bonds configured

2023-04-25 Thread Olivier FAURAX
On current 23.04 (lunar):
* 6.2.0-20 doesn't work
* 6.2.0-21 doesn't work

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2004262

Title:
  Intel E810 NICs driver in causing hangs when booting and bonds
  configured

Status in linux package in Ubuntu:
  Confirmed
Status in linux source package in Jammy:
  Fix Committed
Status in linux source package in Kinetic:
  Fix Committed
Status in linux source package in Lunar:
  Confirmed

Bug description:
  [Impact]
* Intel E810-family NICs cause system hangs when booting with bonding enabled
* This happens due to the driver unplugging auxiliary devices
* The unplug event happens under RTNL lock context, which causes a deadlock where the RDMA driver waits for the RTNL lock to complete removal

  [Test Plan]
* Users have reported that after setting up bonding on switch and server side, the system will hang when starting network services

  [Fix]
* The upstream patch defers unplugging/re-plugging of the auxiliary device, so that it's not performed under the RTNL lock context.
* Fix was introduced by commit:
248401cb2c46 ice: avoid bonding causing auxiliary plug/unplug under RTNL lock

  [Regression Potential]
* Regressions would manifest in devices that support RDMA functionality and
  have been added to a bond
* We should look out for auxiliary devices that haven't been properly
  unplugged, or that cause further issues with
  ice_plug_aux_dev()/ice_unplug_aux_dev()

  
  [Original Description]
  jammy 22.04.1
  linux-image-generic 5.15.0-58-generic
  Intel E810-XXV Dual Port NICs in Dell PowerEdge R650

  - 5.15 in jammy -> reproducible
  - 5.19 in hwe-edge -> reproducible
  - 6.2.rc6 in the mainline build -> works
  - Intel's ice driver 1.10.1.2.2 -> works

  After bonding is enabled on switch and server side, the system will
  hang while initializing Ubuntu.  The kernel loads, but around starting
  the Network Services the system can hang, sometimes for 5 minutes and
  in other cases indefinitely.

  The message of:

  "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"  systemd-resolve
  blocked for more than 120 seconds

  appears, and eventually the network services just attempt to start
  and never do.  This is with or without DHCP enabled.

  Tried this same setup with the hwe-22.04, hwe-20.04, hwe-22.04-edge and
  linux-oem kernels and all exhibit the same failure.

  To work around this, installing the Intel 'ice' driver version
  1.10.1.2.2 works.  The system doesn't even remotely hang at startup
  and all networking functions remain working (ping, DNS, general
  accessibility).

  The driver can be found at https://downloadmirror.intel.com/763930/ice-1.10.1.2.2.tar.gz
  ---
  ProblemType: Bug
  AlsaDevices:
   total 0
   crw-rw 1 root audio 116,  1 Jan 31 13:08 seq
   crw-rw 1 root audio 116, 33 Jan 31 13:08 timer
  AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
  ApportVersion: 2.20.11-0ubuntu82.3
  Architecture: amd64
  ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
  CRDA: N/A
  CasperMD5json:
   {
     "result": "skip"
   }
  DistroRelease: Ubuntu 22.04
  InstallationDate: Installed on 2023-01-27 (3 days ago)
  InstallationMedia: Ubuntu-Server 22.04.1 LTS "Jammy Jellyfish" - Release amd64 (20220809)
  IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig'
  MachineType: Dell Inc. PowerEdge R650
  Package: linux (not installed)
  PciMultimedia:

  ProcFB: 0 mgag200drmfb
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-5.15.0-58-generic root=UUID=668aab7c-abe9-434b-a810-acc6eab76cbc ro fsck.mode=skip
  ProcVersionSignature: Ubuntu 5.15.0-58.64-generic 5.15.74
  RelatedPackageVersions:
   linux-restricted-modules-5.15.0-58-generic N/A
   linux-backports-modules-5.15.0-58-generic  N/A
   linux-firmware 20220329.git681281e4-0ubuntu3.9
  RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
  Tags:  jammy uec-images
  Uname: Linux 5.15.0-58-generic x86_64
  UpgradeStatus: No upgrade log present (probably fresh install)
  UserGroups: N/A
  _MarkForUpload: True
  dmi.bios.date: 09/14/2022
  dmi.bios.release: 1.8
  dmi.bios.vendor: Dell Inc.
  dmi.bios.version: 1.8.2
  dmi.board.name: 0PJ7YJ
  dmi.board.vendor: Dell Inc.
  dmi.board.version: A01
  dmi.chassis.type: 23
  dmi.chassis.vendor: Dell Inc.
  dmi.modalias: dmi:bvnDellInc.:bvr1.8.2:bd09/14/2022:br1.8:svnDellInc.:pnPowerEdgeR650:pvr:rvnDellInc.:rn0PJ7YJ:rvrA01:cvnDellInc.:ct23:cvr:skuSKU=0912;ModelName=PowerEdgeR650:
  dmi.product.family: PowerEdge
  dmi.product.name: PowerEdge R650
  dmi.product.sku: SKU=0912;ModelName=PowerEdge R650
  dmi.sys.vendor: Dell Inc.

To manage notifications about this bug go to:

[Kernel-packages] [Bug 2004262] Re: Intel E810 NICs driver in causing hangs when booting and bonds configured

2023-04-24 Thread Olivier FAURAX
** Tags removed: verification-needed-jammy
** Tags added: verification-done-jammy


[Kernel-packages] [Bug 2004262] Re: Intel E810 NICs driver in causing hangs when booting and bonds configured

2023-04-24 Thread Olivier FAURAX
5.15.0-72 boots OK on affected hardware.

root@m3-small-x86-01:~# lspci|grep E810
01:00.0 Ethernet controller: Intel Corporation Ethernet Controller E810-XXV for SFP (rev 02)
01:00.1 Ethernet controller: Intel Corporation Ethernet Controller E810-XXV for SFP (rev 02)
root@m3-small-x86-01:~# uname -a
Linux m3-small-x86-01 5.15.0-72-generic #79-Ubuntu SMP Wed Apr 19 08:22:18 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
root@m3-small-x86-01:~# ping 8.8.8.8 -c 4
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=119 time=0.640 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=119 time=0.708 ms
64 bytes from 8.8.8.8: icmp_seq=3 ttl=119 time=0.736 ms
64 bytes from 8.8.8.8: icmp_seq=4 ttl=119 time=0.734 ms

--- 8.8.8.8 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3062ms
rtt min/avg/max/mdev = 0.640/0.704/0.736/0.038 ms
