This bug is awaiting verification that the linux-azure/5.15.0-1040.47
kernel in -proposed solves the problem. Please test the kernel and
update this bug with the results. If the problem is solved, change the
tag 'verification-needed-jammy' to 'verification-done-jammy'. If the
problem still exists, change the tag 'verification-needed-jammy' to
'verification-failed-jammy'.

If verification is not done by 5 working days from today, this fix will
be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on
how to enable and use -proposed. Thank you!


** Tags added: kernel-spammed-jammy-linux-azure

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2004262

Title:
  Intel E810 NICs driver is causing hangs when booting with bonds
  configured

Status in linux package in Ubuntu:
  Confirmed
Status in linux source package in Jammy:
  Fix Released
Status in linux source package in Kinetic:
  Fix Released
Status in linux source package in Lunar:
  Confirmed

Bug description:
  [Impact]
    * Intel E810-family NICs cause system hangs when booting with bonding
      enabled
    * This happens because the driver unplugs auxiliary devices
    * The unplug event happens under RTNL lock context, which causes a
      deadlock: the RDMA driver waits for the RTNL lock in order to
      complete removal

  [Test Plan]
    * Users have reported that after setting up bonding on switch and
      server side, the system will hang when starting network services

  [Fix]
    * The upstream patch defers unplugging/re-plugging of the auxiliary
      device, so that it's not performed under the RTNL lock context.
    * Fix was introduced by commit:
        248401cb2c46 ice: avoid bonding causing auxiliary plug/unplug under
        RTNL lock
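
  The deferral described above can be illustrated with a small user-space
  sketch. This is not the driver code: it is a Python analogy under stated
  assumptions, where a plain non-reentrant lock stands in for RTNL, and the
  names on_bonding_event(), service_task() and unplug_aux_dev() are
  hypothetical helpers chosen for illustration. The handler that runs with
  the lock held only records the request; the work that itself needs the
  lock runs later, once the lock has been released.

```python
import threading

# Stand-in for the kernel's RTNL lock (assumption: modelled as a plain,
# non-reentrant mutex for the purpose of this illustration).
rtnl = threading.Lock()

pending = []                      # deferred requests, drained by service_task()
pending_guard = threading.Lock()  # protects the pending list
log = []                          # records the order in which things happen


def unplug_aux_dev():
    """Hypothetical removal path: it needs the 'RTNL' lock to finish."""
    with rtnl:
        log.append("unplugged")


def on_bonding_event():
    """Runs with 'RTNL' already held by the caller.

    Calling unplug_aux_dev() directly here would block forever, because
    the removal path waits for a lock the caller is still holding.
    Instead, only queue the request.
    """
    with pending_guard:
        pending.append(unplug_aux_dev)


def service_task():
    """Runs later, without 'RTNL' held, and performs the deferred work."""
    with pending_guard:
        work = list(pending)
        pending.clear()
    for fn in work:
        fn()


# Simulate a bonding event delivered under the lock.
with rtnl:
    on_bonding_event()
    log.append("event handled")

# The lock has been released; the deferred unplug can now complete.
service_task()
```

  In this sketch the event handler returns while the lock is held and the
  unplug only runs from service_task() afterwards, which is the ordering
  the fix relies on to avoid the deadlock.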

  [Regression Potential]
    * Regressions would manifest in devices that support RDMA functionality and
      have been added to a bond
    * We should look out for auxiliary devices that haven't been properly
      unplugged, or that cause further issues with
      ice_plug_aux_dev()/ice_unplug_aux_dev()

  
  [Original Description]
  jammy 22.04.1
  linux-image-generic 5.15.0-58-generic
  Intel E810-XXV Dual Port NICs in Dell PowerEdge R650

  - 5.15 in jammy -> reproducible
  - 5.19 in hwe-edge -> reproducible
  - 6.2-rc6 in the mainline build -> works
  - Intel's ice driver 1.10.1.2.2 -> works

  After bonding is enabled on switch and server side, the system will
  hang while initializing Ubuntu.  The kernel loads, but around the time
  the network services start the system can hang, sometimes for 5
  minutes and in other cases indefinitely.

  The message:

  "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"  systemd-resolve
  blocked for more than 120 seconds

  appears, and eventually the network services just attempt to start
  and never do.  This is with or without DHCP enabled.

  Tried this same setup with the hwe-22.04, hwe-20.04, hwe-22.04-edge and
  linux-oem kernels and all exhibit the same failure.

  To work around this, installing the Intel 'ice' driver version
  1.10.1.2.2 works.  The system doesn't even remotely hang at startup
  and all networking functions remain working (ping, DNS, general
  accessibility).

  The driver can be found at
  https://downloadmirror.intel.com/763930/ice-1.10.1.2.2.tar.gz
  ---
  ProblemType: Bug
  AlsaDevices:
   total 0
   crw-rw---- 1 root audio 116,  1 Jan 31 13:08 seq
   crw-rw---- 1 root audio 116, 33 Jan 31 13:08 timer
  AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
  ApportVersion: 2.20.11-0ubuntu82.3
  Architecture: amd64
  ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
  CRDA: N/A
  CasperMD5json:
   {
     "result": "skip"
   }
  DistroRelease: Ubuntu 22.04
  InstallationDate: Installed on 2023-01-27 (3 days ago)
  InstallationMedia: Ubuntu-Server 22.04.1 LTS "Jammy Jellyfish" - Release amd64 (20220809)
  IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig'
  MachineType: Dell Inc. PowerEdge R650
  Package: linux (not installed)
  PciMultimedia:

  ProcFB: 0 mgag200drmfb
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-5.15.0-58-generic root=UUID=668aab7c-abe9-434b-a810-acc6eab76cbc ro fsck.mode=skip
  ProcVersionSignature: Ubuntu 5.15.0-58.64-generic 5.15.74
  RelatedPackageVersions:
   linux-restricted-modules-5.15.0-58-generic N/A
   linux-backports-modules-5.15.0-58-generic  N/A
   linux-firmware                             20220329.git681281e4-0ubuntu3.9
  RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
  Tags:  jammy uec-images
  Uname: Linux 5.15.0-58-generic x86_64
  UpgradeStatus: No upgrade log present (probably fresh install)
  UserGroups: N/A
  _MarkForUpload: True
  dmi.bios.date: 09/14/2022
  dmi.bios.release: 1.8
  dmi.bios.vendor: Dell Inc.
  dmi.bios.version: 1.8.2
  dmi.board.name: 0PJ7YJ
  dmi.board.vendor: Dell Inc.
  dmi.board.version: A01
  dmi.chassis.type: 23
  dmi.chassis.vendor: Dell Inc.
  dmi.modalias: dmi:bvnDellInc.:bvr1.8.2:bd09/14/2022:br1.8:svnDellInc.:pnPowerEdgeR650:pvr:rvnDellInc.:rn0PJ7YJ:rvrA01:cvnDellInc.:ct23:cvr:skuSKU=0912;ModelName=PowerEdgeR650:
  dmi.product.family: PowerEdge
  dmi.product.name: PowerEdge R650
  dmi.product.sku: SKU=0912;ModelName=PowerEdge R650
  dmi.sys.vendor: Dell Inc.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2004262/+subscriptions


-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
