Public bug reported:

The ice driver in the 5.15.0-69 kernel deadlocks on rtnl_lock() when
adding e810 NICs to a bond interface.  Booting with
`sysctl.hung_task_panic=1` and `sysctl.hung_task_all_cpu_backtrace=1`
added to the kernel command-line shows (among lots of other output):

```
[  244.980100] INFO: task kworker/6:1:182 blocked for more than 120 seconds.
[  244.988431]       Not tainted 5.15.0-69-generic #76-Ubuntu
[  244.995279] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  245.004826] task:kworker/6:1     state:D stack:    0 pid:  182 ppid:     2 flags:0x00004000
[  245.015017] Workqueue: events linkwatch_event
[  245.020734] Call Trace:
[  245.024144]  <TASK>
[  245.027137]  __schedule+0x24e/0x590
[  245.031848]  schedule+0x69/0x110
[  245.036228]  schedule_preempt_disabled+0xe/0x20
[  245.042066]  __mutex_lock.constprop.0+0x267/0x490
[  245.047993]  __mutex_lock_slowpath+0x13/0x20
[  245.053432]  mutex_lock+0x38/0x50
[  245.057714]  rtnl_lock+0x15/0x20
[  245.061901]  linkwatch_event+0xe/0x30
[  245.066571]  process_one_work+0x228/0x3d0
[  245.071607]  worker_thread+0x53/0x420
[  245.076260]  ? process_one_work+0x3d0/0x3d0
[  245.081493]  kthread+0x127/0x150
[  245.085592]  ? set_kthread_struct+0x50/0x50
[  245.090769]  ret_from_fork+0x1f/0x30
[  245.095266]  </TASK>
```

and

```
[  245.530629] INFO: task ifenslave:849 blocked for more than 121 seconds.
[  245.540433]       Not tainted 5.15.0-69-generic #76-Ubuntu
[  245.549050] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  245.558960] task:ifenslave       state:D stack:    0 pid:  849 ppid:   847 flags:0x00004002
[  245.570930] Call Trace:
[  245.576175]  <TASK>
[  245.581018]  __schedule+0x24e/0x590
[  245.587445]  schedule+0x69/0x110
[  245.593631]  schedule_timeout+0x103/0x140
[  245.600573]  __wait_for_common+0xab/0x150
[  245.607526]  ? usleep_range_state+0x90/0x90
[  245.614743]  wait_for_completion+0x24/0x30
[  245.621903]  flush_workqueue+0x133/0x3e0
[  245.628887]  ib_cache_cleanup_one+0x21/0xf0 [ib_core]
[  245.637083]  __ib_unregister_device+0x79/0xc0 [ib_core]
[  245.645398]  ib_unregister_device+0x27/0x40 [ib_core]
[  245.653541]  irdma_ib_unregister_device+0x4b/0x70 [irdma]
[  245.662105]  irdma_remove+0x1f/0x70 [irdma]
[  245.669446]  auxiliary_bus_remove+0x1d/0x40
[  245.676688]  __device_release_driver+0x1a8/0x2a0
[  245.684241]  device_release_driver+0x29/0x40
[  245.691416]  bus_remove_device+0xde/0x150
[  245.698396]  device_del+0x19c/0x400
[  245.712178]  ice_lag_link.isra.0+0xdd/0xf0 [ice]
[  245.720683]  ice_lag_changeupper_event+0xe1/0x130 [ice]
[  245.729739]  ice_lag_event_handler+0x5b/0x150 [ice]
[  245.738525]  raw_notifier_call_chain+0x46/0x60
[  245.746006]  call_netdevice_notifiers_info+0x52/0xa0
[  245.754123]  __netdev_upper_dev_link+0x1b7/0x310
[  245.761658]  netdev_master_upper_dev_link+0x3e/0x60
[  245.769627]  bond_enslave+0xc3a/0x1720 [bonding]
[  245.777398]  ? sscanf+0x4e/0x70
[  245.783375]  bond_option_slaves_set+0xca/0x170 [bonding]
[  245.791738]  __bond_opt_set+0xbd/0x1a0 [bonding]
[  245.799505]  __bond_opt_set_notify+0x30/0xb0 [bonding]
[  245.807860]  bond_opt_tryset_rtnl+0x56/0xa0 [bonding]
[  245.816062]  bonding_sysfs_store_option+0x52/0xa0 [bonding]
[  245.824750]  dev_attr_store+0x14/0x30
[  245.831443]  sysfs_kf_write+0x3b/0x50
[  245.837979]  kernfs_fop_write_iter+0x138/0x1c0
[  245.845469]  new_sync_write+0x111/0x1a0
[  245.852210]  vfs_write+0x1d5/0x270
[  245.858429]  ksys_write+0x67/0xf0
[  245.864624]  __x64_sys_write+0x19/0x20
[  245.871288]  do_syscall_64+0x59/0xc0
[  245.877715]  ? handle_mm_fault+0xd8/0x2c0
[  245.884566]  ? do_user_addr_fault+0x1e7/0x670
[  245.891990]  ? filp_close+0x60/0x70
[  245.898452]  ? exit_to_user_mode_prepare+0x37/0xb0
[  245.906272]  ? irqentry_exit_to_user_mode+0x9/0x20
[  245.914042]  ? irqentry_exit+0x1d/0x30
[  245.920703]  ? exc_page_fault+0x89/0x170
[  245.927555]  entry_SYSCALL_64_after_hwframe+0x61/0xcb
[  245.935763] RIP: 0033:0x7f1e86855a37
[  245.942153] RSP: 002b:00007fff8da477a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[  245.953034] RAX: ffffffffffffffda RBX: 000000000000000a RCX: 00007f1e86855a37
[  245.963554] RDX: 000000000000000a RSI: 0000556eff580510 RDI: 0000000000000001
[  245.972468] RBP: 0000556eff580510 R08: 0000556eff582c5a R09: 0000000000000000
[  245.983048] R10: 0000556eff582c59 R11: 0000000000000246 R12: 0000000000000001
[  245.993402] R13: 000000000000000a R14: 0000000000000000 R15: 0000000000000000
[  246.001700]  </TASK>
```

This appears consistent with the underlying cause being the bug fixed by
mainline commit 248401cb2c4612d83eb0c352ee8103b78b8eb365 (commit
87b9ac7bd301f53b122224fc8eddb1f4045e3f2c in the 5.15.y stable tree).

The 5.15.0-67 kernel does not exhibit the problem.  The 5.15.0-68 kernel
apparently included the "RDMA/irdma: Report the correct link speed"
patch, which is listed in one of the "Fixes" tags of the above commit,
so I suspect that patch is the culprit and that importing the above
commit should resolve the problem.
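
As a rough userspace analogy of the pattern the two traces seem to show
(this is not kernel code, and the names `rtnl`, `queued_work`,
`flush_while_holding_lock` and `flush_outside_lock` are made up for
illustration): one task holds a lock and waits for queued work to
finish, while that work cannot run to completion because it needs the
same lock.  The kernel wait has no timeout; the sketch uses one only so
that it terminates.

```python
import threading
from concurrent.futures import ThreadPoolExecutor, TimeoutError

# Hypothetical stand-in for the RTNL mutex; purely illustrative.
rtnl = threading.Lock()


def queued_work():
    """Work item that must take the lock before it can complete."""
    with rtnl:
        return "work done"


def flush_while_holding_lock(timeout=0.5):
    """Queue work that needs the lock, then wait for it with the lock held."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        with rtnl:
            future = pool.submit(queued_work)
            try:
                # The worker is stuck on the lock we hold, so this times out.
                return future.result(timeout=timeout)
            except TimeoutError:
                outcome = "stalled: queued work needs the lock held by the waiter"
        # The lock is released here, so the worker can finish before shutdown.
        return outcome


def flush_outside_lock(timeout=0.5):
    """Do the lock-protected part first, then wait for the work unlocked."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        with rtnl:
            pass  # lock-protected bookkeeping would go here
        return pool.submit(queued_work).result(timeout=timeout)


if __name__ == "__main__":
    print(flush_while_holding_lock())
    print(flush_outside_lock())
```

The second function only illustrates the general remedy of not waiting
for such work while the lock is held; I have not verified that the
referenced commit takes exactly this approach.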

ProblemType: Bug
DistroRelease: Ubuntu 22.04
Package: linux-image-5.15.0-67-generic 5.15.0-67.74
ProcVersionSignature: Ubuntu 5.15.0-67.74-generic 5.15.85
Uname: Linux 5.15.0-67-generic x86_64
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116,  1 Apr  5 22:47 seq
 crw-rw---- 1 root audio 116, 33 Apr  5 22:47 timer
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
ApportVersion: 2.20.11-0ubuntu82.3
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
CRDA: N/A
CasperMD5CheckResult: unknown
Date: Wed Apr  5 22:48:03 2023
IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig'
Lsusb:
 Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
 Bus 001 Device 004: ID 0b1f:03ee Insyde Software Corp. RNDIS/Ethernet Gadget
 Bus 001 Device 003: ID 0557:9241 ATEN International Co., Ltd SMCI HID KM
 Bus 001 Device 002: ID 1d6b:0107 Linux Foundation USB Virtual Hub
 Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
MachineType: Supermicro SYS-510T-MR-EI018
PciMultimedia:
 
ProcEnviron:
 TERM=vt220
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=C.UTF-8
 SHELL=/bin/bash
ProcFB: 0 astdrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-5.15.0-67-generic root=UUID=0b21ae48-6315-4193-8c24-fc224a18170f ro console=tty0 console=ttyS1,115200n8 modprobe.blacklist=igb modprobe.blacklist=rndis_host
RelatedPackageVersions:
 linux-restricted-modules-5.15.0-67-generic N/A
 linux-backports-modules-5.15.0-67-generic  N/A
 linux-firmware                             20220329.git681281e4-0ubuntu3.9
RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 06/23/2022
dmi.bios.release: 5.22
dmi.bios.vendor: American Megatrends International, LLC.
dmi.bios.version: 1.2
dmi.board.asset.tag: To be filled by O.E.M.
dmi.board.name: X12STH-SYS
dmi.board.vendor: Supermicro
dmi.board.version: 1.01
dmi.chassis.asset.tag: To be filled by O.E.M.
dmi.chassis.type: 1
dmi.chassis.vendor: Supermicro
dmi.chassis.version: 0123456789
dmi.modalias: dmi:bvnAmericanMegatrendsInternational,LLC.:bvr1.2:bd06/23/2022:br5.22:svnSupermicro:pnSYS-510T-MR-EI018:pvr0123456789:rvnSupermicro:rnX12STH-SYS:rvr1.01:cvnSupermicro:ct1:cvr0123456789:skuTobefilledbyO.E.M.:
dmi.product.family: To be filled by O.E.M.
dmi.product.name: SYS-510T-MR-EI018
dmi.product.sku: To be filled by O.E.M.
dmi.product.version: 0123456789
dmi.sys.vendor: Supermicro

** Affects: linux (Ubuntu)
     Importance: Undecided
         Status: New


** Tags: amd64 apport-bug jammy

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2015414

Title:
  5.15.0-69 ice driver deadlocks with bonded e810 NICs

Status in linux package in Ubuntu:
  New

