[Kernel-packages] [Bug 2042455] Re: Devlink backport: Fix mlx5 driver hangs due to mlx5_sf_hw_table_init

2023-11-06 Thread Feysel Mohammed
** Tags removed: verification-needed-jammy-linux-bluefield
** Tags added: verification-done-jammy-linux-bluefield

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-bluefield in Ubuntu.
https://bugs.launchpad.net/bugs/2042455

Title:
  Devlink backport: Fix mlx5 driver hangs due to mlx5_sf_hw_table_init

Status in linux-bluefield package in Ubuntu:
  Invalid
Status in linux-bluefield source package in Jammy:
  Fix Committed

Bug description:
  Summary:
  The machine hangs when loading the OFED 2310 mlx5 driver on BlueField

  How to reproduce:
  # load the OFED driver

  Reason:
  The BlueField got stuck, and the call trace shows
  "mlx5_sf_hw_table_init+0xf4/0x2d0 [mlx5_core]"

  dmesg from minicom:
  [  726.569928] INFO: task systemd-udevd:297 blocked for more than 604 seconds.
  [  726.576895]   Tainted: G   OE 5.15.0-1029-bluefield #31-Ubuntu
  [  726.584101] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  [  726.591913] task:systemd-udevd   state:D stack:0 pid:  297 ppid:   280 flags:0x000d
  [  726.600248] Call trace:
  [  726.602680]  __switch_to+0xf8/0x150
  [  726.606159]  __schedule+0x2b8/0x790
  [  726.609634]  schedule+0x64/0x140
  [  726.612850]  schedule_preempt_disabled+0x18/0x24
  [  726.617453]  __mutex_lock.constprop.0+0x1a0/0x680
  [  726.622141]  __mutex_lock_slowpath+0x40/0x90
  [  726.626396]  mutex_lock+0x64/0x70
  [  726.629695]  devlink_resource_register+0x50/0x1a0
  [  726.634386]  mlx5_sf_hw_table_init+0xf4/0x2d0 [mlx5_core]
  [  726.639882]  mlx5_init_one_devl_locked+0x1c8/0x784 [mlx5_core]
  [  726.645791]  probe_one+0x300/0x5f0 [mlx5_core]
  [  726.650307]  local_pci_probe+0x48/0xb4
  [  726.654043]  pci_device_probe+0x18c/0x200
  [  726.658039]  really_probe+0xd0/0x490
  [  726.661600]  __driver_probe_device+0x148/0x190
  [  726.666029]  driver_probe_device+0x48/0x180
  [  726.670198]  __driver_attach+0x104/0x240
  [  726.674106]  bus_for_each_dev+0x78/0xdc
  [  726.677927]  driver_attach+0x2c/0x40
  [  726.681486]  bus_add_driver+0x154/0x270
  [  726.685307]  driver_register+0x80/0x13c
  [  726.689129]  __pci_register_driver+0x4c/0x60
  [  726.693386]  __init_backport+0xf0/0x1000 [mlx5_core]
  [  726.698425]  do_one_initcall+0x4c/0x250
  [  726.702248]  do_init_module+0x50/0x260
  [  726.705983]  load_module+0x9fc/0xbe0
  [  726.709543]  __do_sys_finit_module+0xa8/0x114
  [  726.713885]  __arm64_sys_finit_module+0x28/0x3c
  [  726.718401]  invoke_syscall+0x78/0x100
  [  726.722137]  el0_svc_common.constprop.0+0x54/0x184
  [  726.726913]  do_el0_svc+0x30/0xac
  [  726.730215]  el0_svc+0x48/0x160
  [  726.733341]  el0t_64_sync_handler+0xa4/0x130
  [  726.737597]  el0t_64_sync+0x1a4/0x1a8
  [  847.401924] INFO: task systemd-udevd:297 blocked for more than 724 seconds.
  [  847.408891]   Tainted: G   OE 5.15.0-1029-bluefield #31-Ubuntu

  How to fix:
  This is related to
  https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/2039869
  and we need to backport/cherry-pick more patches from the series

  Patches are below
  Backport: f655dacb59ac net: devlink: remove unused locked functions
  Backport: 012ec02ae441 netdevsim: convert driver to use unlocked devlink API during init/fini
  Cherry-pick: eb0e9fa2c635 net: devlink: add unlocked variants of devlink_region_create/destroy() functions
  SKIP: 72a4c8c94efa mlxsw: convert driver to use unlocked devlink API during init/fini
  Backport: 70a2ff89369d net: devlink: add unlocked variants of devlink_dpipe*() functions
  Cherry-pick: 755cfa69c4ec net: devlink: add unlocked variants of devlink_sb*() functions
  Cherry-pick: c223d6a4bf6d net: devlink: add unlocked variants of devlink_resource*() functions
  Cherry-pick: 852e85a704c2 net: devlink: add unlocked variants of devling_trap*() functions
  Cherry-pick: e26fde2f5bef net: devlink: avoid false DEADLOCK warning reported by lock
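
  Note on the locking pattern: the trace appears consistent with a
  self-deadlock, where an init path that already holds the devlink
  instance lock (mlx5_init_one_devl_locked) calls a locked helper
  (devlink_resource_register) that tries to take the same lock again.
  The Python sketch below is only a toy model of that pattern, not
  kernel code. The Devlink class, init_one_devl_locked(), the devl_
  prefix for the unlocked variant, and the non-blocking acquire are
  all assumptions made for illustration.

```python
import threading


class Devlink:
    """Toy stand-in for a devlink instance guarded by one non-recursive lock."""

    def __init__(self):
        self.lock = threading.Lock()
        self.resources = []


def devl_resource_register(devlink, name):
    """Unlocked variant (hypothetical name): caller must already hold the lock."""
    assert devlink.lock.locked(), "caller must hold devlink.lock"
    devlink.resources.append(name)
    return True


def devlink_resource_register(devlink, name):
    """Locked variant: takes devlink.lock itself before registering.

    A kernel mutex_lock() would block here if the lock is already held.
    This sketch uses a non-blocking acquire so the nested case returns
    False instead of hanging.
    """
    if not devlink.lock.acquire(blocking=False):
        return False  # models the hung task seen in the trace
    try:
        return devl_resource_register(devlink, name)
    finally:
        devlink.lock.release()


def init_one_devl_locked(devlink, register):
    """Hypothetical init path that runs with devlink.lock already held."""
    with devlink.lock:
        return register(devlink, "sf_hw_table")
```

  In this model, passing devlink_resource_register to
  init_one_devl_locked() fails because the lock is already held, while
  passing devl_resource_register succeeds. That is the reason a driver
  whose init runs under the devlink lock would need the unlocked
  variants that the patches above add.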

  Thanks!

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/2042455/+subscriptions


-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 2042455] Re: Devlink backport: Fix mlx5 driver hangs due to mlx5_sf_hw_table_init

2023-11-03 Thread Ubuntu Kernel Bot
This bug is awaiting verification that the
linux-bluefield/5.15.0-1031.33 kernel in -proposed solves the problem.
Please test the kernel and update this bug with the results. If the
problem is solved, change the tag
'verification-needed-jammy-linux-bluefield' to
'verification-done-jammy-linux-bluefield'. If the problem still exists,
change the tag 'verification-needed-jammy-linux-bluefield' to
'verification-failed-jammy-linux-bluefield'.


If verification is not done within 5 working days from today, this fix
will be dropped from the source code, and this bug will be closed.


See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on
how to enable and use -proposed. Thank you!


** Tags added: kernel-spammed-jammy-linux-bluefield-v2 verification-needed-jammy-linux-bluefield


[Kernel-packages] [Bug 2042455] Re: Devlink backport: Fix mlx5 driver hangs due to mlx5_sf_hw_table_init

2023-11-02 Thread Bartlomiej Zolnierkiewicz
** Changed in: linux-bluefield (Ubuntu Jammy)
   Status: New => Fix Committed


[Kernel-packages] [Bug 2042455] Re: Devlink backport: Fix mlx5 driver hangs due to mlx5_sf_hw_table_init

2023-11-02 Thread Bartlomiej Zolnierkiewicz
** Also affects: linux-bluefield (Ubuntu Jammy)
   Importance: Undecided
   Status: New

** Changed in: linux-bluefield (Ubuntu)
   Status: New => Invalid
