Hi Jeff,
upstream commit
50b2412b7e78 net/mlx5: Avoid possible free of command entry while timeout comp handler
was picked into the Ubuntu-5.4.0-56.62 kernel
(hash bcd6e98bef76cc8a49a1b736b0fefffbffb75c30)
(v5.4.71 upstream stable release,
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1902110 )
Since then a new issue has arisen:
reloading the mlx5 modules causes an error message in the kernel buffer:
"cmd_work_handler:887:(pid 292): failed to allocate command entry"
reproduction:
# modprobe -r mlx5_ib mlx5_core
# modprobe mlx5_core mlx5_ib
# dmesg
[ 142.638490] mlx5_core 0000:08:00.1: E-Switch: cleanup
[ 143.734339] mlx5_core 0000:08:00.0: E-Switch: cleanup
[ 164.171511] mlx5_core: unknown parameter 'mlx5_ib' ignored
[ 164.173501] mlx5_core 0000:08:00.0: firmware version: 16.28.1002
[ 164.173576] mlx5_core 0000:08:00.0: 126.016 Gb/s available PCIe bandwidth (8 GT/s x16 link)
[ 164.457342] mlx5_core 0000:08:00.0: Rate limit: 127 rates are supported, range: 0Mbps to 97656Mbps
[ 164.457365] mlx5_core 0000:08:00.0: E-Switch: Total vports 2, per vport: max uc(1024) max mc(16384)
[ 164.484659] port_module: 5 callbacks suppressed
[ 164.484665] mlx5_core 0000:08:00.0: Port module event: module 0, Cable plugged
[ 164.485112] mlx5_core 0000:08:00.0: mlx5_pcie_event:294:(pid 8): PCIe slot advertised sufficient power (75W).
[ 164.494771] mlx5_core 0000:08:00.1: firmware version: 16.28.1002
[ 164.494844] mlx5_core 0000:08:00.1: 126.016 Gb/s available PCIe bandwidth (8 GT/s x16 link)
[ 164.779534] mlx5_core 0000:08:00.1: Rate limit: 127 rates are supported, range: 0Mbps to 97656Mbps
[ 164.779552] mlx5_core 0000:08:00.1: E-Switch: Total vports 2, per vport: max uc(1024) max mc(16384)
[ 164.808886] mlx5_core 0000:08:00.1: Port module event: module 1, Cable plugged
[ 164.809228] mlx5_core 0000:08:00.1: mlx5_pcie_event:294:(pid 292): PCIe slot advertised sufficient power (75W).
[ 164.840667] mlx5_core 0000:08:00.0: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0)
[ 165.081342] mlx5_core 0000:08:00.1: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0)
[ 165.282793] mlx5_ib: Mellanox Connect-IB Infiniband driver v5.0-0
[ 165.438226] mlx5_core 0000:08:00.0: cmd_work_handler:887:(pid 292): failed to allocate command entry
[ 165.442506] infiniband rocep8s0f0: reg_mr_callback:104:(pid 292): async reg mr failed. status -11
#
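To check for this failure after each module reload without reading the whole log, the two error signatures above can be searched for mechanically. The following is only a minimal sketch: the function name find_cmd_failures and the inline sample are illustrative, and the patterns are taken from the messages in this report, so they may need adjusting for other kernel versions.

```python
import re

# Signatures copied from the dmesg output in this report.
# The function name and the sample text below are illustrative only.
FAILURE_PATTERNS = (
    re.compile(r"cmd_work_handler:\d+:\(pid \d+\): failed to allocate command entry"),
    re.compile(r"reg_mr_callback:\d+:\(pid \d+\): async reg mr failed"),
)


def find_cmd_failures(dmesg_text):
    """Return the log lines that match either failure signature."""
    return [
        line.strip()
        for line in dmesg_text.splitlines()
        if any(pattern.search(line) for pattern in FAILURE_PATTERNS)
    ]


# Three lines from the log above, unwrapped to one message per line.
sample = """\
[ 165.282793] mlx5_ib: Mellanox Connect-IB Infiniband driver v5.0-0
[ 165.438226] mlx5_core 0000:08:00.0: cmd_work_handler:887:(pid 292): failed to allocate command entry
[ 165.442506] infiniband rocep8s0f0: reg_mr_callback:104:(pid 292): async reg mr failed. status -11
"""

hits = find_cmd_failures(sample)
for hit in hits:
    print(hit)
```

In practice the text passed to the function would be the captured output of dmesg taken right after the reload; an empty result would suggest the issue did not reproduce on that run.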
The following commits fix this issue:
410bd754cd73 net/mlx5: Add retry mechanism to the command entry index allocation (upstream 5.9)
1d5558b1f0de net/mlx5: poll cmd EQ in case of command timeout (upstream 5.9)
d43b7007dbd1 net/mlx5: Fix a race when moving command interface to events mode (upstream 5.7-rc7)
3ed879965cc4 net/mlx5: Use async EQ setup cleanup helpers for multiple EQs (upstream 5.6-rc1)
These are already on the master-next branch of the focal tree, also synced from linux stable
(v5.4.79 upstream stable release,
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1907151 )
# git log --oneline Ubuntu-5.4.0-59.65..master-next
....
400ec5bb2816 net/mlx5: Add retry mechanism to the command entry index allocation
2bd608898edd net/mlx5: Fix a race when moving command interface to events mode
bec07c488db0 net/mlx5: poll cmd EQ in case of command timeout
0c9bfdf598e1 net/mlx5: Use async EQ setup cleanup helpers for multiple EQs
.....
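Because the backported commits carry different hashes than their upstream counterparts (for example 410bd754cd73 upstream versus 400ec5bb2816 on master-next), checking a branch for the four fixes has to go by subject line and not by hash. A minimal sketch of that check, assuming the text comes from git log --oneline; the function name missing_fixes is illustrative:

```python
# Subjects of the four fixes named in this report.
REQUIRED_SUBJECTS = [
    "net/mlx5: Add retry mechanism to the command entry index allocation",
    "net/mlx5: poll cmd EQ in case of command timeout",
    "net/mlx5: Fix a race when moving command interface to events mode",
    "net/mlx5: Use async EQ setup cleanup helpers for multiple EQs",
]


def missing_fixes(oneline_log):
    """Return the required subjects that do not appear in the --oneline log text."""
    subjects = set()
    for line in oneline_log.splitlines():
        # Each line is "<abbreviated hash> <subject>"; skip anything else.
        parts = line.strip().split(maxsplit=1)
        if len(parts) == 2:
            subjects.add(parts[1])
    return [s for s in REQUIRED_SUBJECTS if s not in subjects]


# The four lines quoted from the git log output above.
log = """\
400ec5bb2816 net/mlx5: Add retry mechanism to the command entry index allocation
2bd608898edd net/mlx5: Fix a race when moving command interface to events mode
bec07c488db0 net/mlx5: poll cmd EQ in case of command timeout
0c9bfdf598e1 net/mlx5: Use async EQ setup cleanup helpers for multiple EQs
"""

print(missing_fixes(log))  # prints []
```

An empty list means all four subjects were found; any subject that is returned is still missing from the range that was inspected.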
I compiled master-next, booted the system with it and the issue is
resolved.
--
https://bugs.launchpad.net/bugs/1905574
Title:
Ubuntu 20.10 four needed fixes to 'Add driver for Mellanox Connect-IB
adapters'