Public bug reported:
After reassigning the PCHID of a RoCE CX5 Mojave card to another LPAR, dmesg
shows the following error message under Ubuntu.
Ubuntu 20.04.1 with updates
root@t35lp02:~# uname -a
Linux t35lp02.lnxne.boe 5.4.0-80-generic #90-Ubuntu SMP Fri Jul 9 17:41:33 UTC
2021 s390x s390x s390x GNU/Linux
root@t35lp02:~#
DMESG Output
[ 761.778422] mlx5_core 0018:00:00.1: poll_health:715:(pid 0): Fatal error 1
detected
[ 761.778432] mlx5_core 0018:00:00.1: print_health_info:381:(pid 0):
assert_var[0] 0xffffffff
[ 761.778435] mlx5_core 0018:00:00.1: print_health_info:381:(pid 0):
assert_var[1] 0xffffffff
[ 761.778437] mlx5_core 0018:00:00.1: print_health_info:381:(pid 0):
assert_var[2] 0xffffffff
[ 761.778439] mlx5_core 0018:00:00.1: print_health_info:381:(pid 0):
assert_var[3] 0xffffffff
[ 761.778442] mlx5_core 0018:00:00.1: print_health_info:381:(pid 0):
assert_var[4] 0xffffffff
[ 761.778444] mlx5_core 0018:00:00.1: print_health_info:384:(pid 0):
assert_exit_ptr 0xffffffff
[ 761.778447] mlx5_core 0018:00:00.1: print_health_info:386:(pid 0):
assert_callra 0xffffffff
[ 761.778451] mlx5_core 0018:00:00.1: print_health_info:389:(pid 0): fw_ver
65535.65535.65535
[ 761.778454] mlx5_core 0018:00:00.1: print_health_info:390:(pid 0): hw_id
0xffffffff
[ 761.778456] mlx5_core 0018:00:00.1: print_health_info:391:(pid 0):
irisc_index 255
[ 761.778460] mlx5_core 0018:00:00.1: print_health_info:392:(pid 0): synd
0xff: unrecognized error
[ 761.778462] mlx5_core 0018:00:00.1: print_health_info:394:(pid 0): ext_synd
0xffff
[ 761.778465] mlx5_core 0018:00:00.1: print_health_info:396:(pid 0): raw
fw_ver 0xffffffff
[ 761.778467] mlx5_core 0018:00:00.1: mlx5_trigger_health_work:696:(pid 0):
new health works are not permitted at this stage
[ 763.179016] mlx5_core 0018:00:00.1: E-Switch: cleanup
[ 768.348431] mlx5_core 0018:00:00.1: mlx5_reclaim_startup_pages:562:(pid
123): FW did not return all pages. giving up...
[ 768.348433] ------------[ cut here ]------------
[ 768.348434] FW pages counter is 43318 after reclaiming all pages
[ 768.348562] WARNING: CPU: 0 PID: 123 at
drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c:567
mlx5_reclaim_startup_pages+0x12c/0x1c0 [mlx5_core]
[ 768.348563] Modules linked in: s390_trng chsc_sch eadm_sch vfio_ccw
vfio_mdev mdev vfio_iommu_type1 vfio sch_fq_codel drm
drm_panel_orientation_quirks i2c_core ip_tables x_tables btrfs zstd_compress
zlib_deflate raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor
async_tx xor raid6_pq libcrc32c raid1 raid0 linear dm_service_time mlx5_ib
ib_uverbs ib_core pkey qeth_l2 zcrypt crc32_vx_s390 ghash_s390 prng aes_s390
des_s390 libdes sha3_512_s390 sha3_256_s390 sha512_s390 mlx5_core sha256_s390
sha1_s390 sha_common tls mlxfw ptp pps_core zfcp scsi_transport_fc
dasd_eckd_mod dasd_mod qeth qdio ccwgroup scsi_dh_emc scsi_dh_rdac scsi_dh_alua
dm_multipath
[ 768.348586] CPU: 0 PID: 123 Comm: kmcheck Tainted: G W
5.4.0-80-generic #90-Ubuntu
[ 768.348586] Hardware name: IBM 8561 T01 703 (LPAR)
[ 768.348587] Krnl PSW : 0704c00180000000 000003ff808d33ac
(mlx5_reclaim_startup_pages+0x12c/0x1c0 [mlx5_core])
[ 768.348607] R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0
RI:0 EA:3
[ 768.348608] Krnl GPRS: 0000000000000004 0000000000000006 0000000000000034
0000000000000007
[ 768.348608] 0000000000000007 00000000fcb4fa00 000000000000007b
000003e00458fafc
[ 768.348609] 00000000b7d406f0 000000004d849c00 00000000b7d00120
000000010000b6c0
[ 768.348610] 00000000f4cb1100 000003e00458fe70 000003ff808d33a8
000003e00458fa50
[ 768.348615] Krnl Code: 000003ff808d339c: c02000041043 larl
%r2,000003ff80955422
000003ff808d33a2: c0e5ffff70c7 brasl
%r14,000003ff808c1530
#000003ff808d33a8: a7f40001 brc
15,000003ff808d33aa
>000003ff808d33ac: e330a5f04012 lt
%r3,263664(%r10)
000003ff808d33b2: a784ffd7 brc
8,000003ff808d3360
000003ff808d33b6: b9140033 lgfr %r3,%r3
000003ff808d33ba: c0200004104e larl
%r2,000003ff80955456
000003ff808d33c0: c0e5ffff70b8 brasl
%r14,000003ff808c1530
[ 768.348622] Call Trace:
[ 768.348641] ([<000003ff808d33a8>] mlx5_reclaim_startup_pages+0x128/0x1c0
[mlx5_core])
[ 768.348661] [<000003ff808c8e14>] mlx5_function_teardown+0x44/0xa0
[mlx5_core]
[ 768.348680] [<000003ff808c95b0>] mlx5_unload_one+0x80/0x160 [mlx5_core]
[ 768.348699] [<000003ff808c9720>] remove_one+0x50/0xd0 [mlx5_core]
[ 768.348702] [<000000004d2704c0>] pci_device_remove+0x40/0xa0
[ 768.348706] [<000000004d2f724e>] device_release_driver_internal+0xee/0x1c0
[ 768.348707] [<000000004d267054>] pci_stop_bus_device+0x94/0xc0
[ 768.348708] [<000000004d267210>]
pci_stop_and_remove_bus_device_locked+0x30/0x50
[ 768.348710] [<000000004cd36cbe>] __zpci_event_availability+0x26e/0x340
[ 768.348713] [<000000004d382794>] chsc_process_crw+0x2e4/0x300
[ 768.348714] [<000000004d389fd6>] crw_collect_info+0x276/0x340
[ 768.348716] [<000000004cd681e6>] kthread+0x126/0x160
[ 768.348719] [<000000004d5a568c>] ret_from_fork+0x28/0x30
[ 768.348720] [<000000004d5a5694>] kernel_thread_starter+0x0/0x10
[ 768.348720] Last Breaking-Event-Address:
[ 768.348739] [<000003ff808d33a8>] mlx5_reclaim_startup_pages+0x128/0x1c0
[mlx5_core]
[ 768.348740] ---[ end trace 1056779ff3084977 ]---
[ 768.354255] pci 0018:00:00.1: Removing from iommu group 2
[ 768.359097] pci_bus 0018:00: busn_res: [bus 00] is released
[ 768.359122] crw_info : CRW reports slct=0, oflw=0, chn=0, rsc=B, anc=0,
erc=0, rsid=0
root@t35lp02:~#
== Comment: #2 - [email protected]> - 2021-07-27 08:43:37 ==
Updated to Ubuntu 21.04 as discussed with Niklas:
root@t35lp02:~# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 21.04
Release: 21.04
Codename: hirsute
root@t35lp02:~# ls -la
total 36
drwx------ 5 root root 4096 Jul 27 13:06 .
drwxr-xr-x 20 root root 4096 Jul 27 12:49 ..
-rw------- 1 root root 174 Jul 27 13:15 .bash_history
-rw-r--r-- 1 root root 3106 Dec 5 2019 .bashrc
drwx------ 2 root root 4096 Jul 27 12:58 .cache
-rw-r--r-- 1 root root 161 Dec 5 2019 .profile
drwxr-xr-x 3 root root 4096 Jul 27 12:58 snap
drwx------ 2 root root 4096 Jul 27 12:58 .ssh
-rw------- 1 root root 979 Jul 27 13:06 .viminfo
root@t35lp02:~# uname -a
Linux t35lp02.lnxne.boe 5.11.0-25-generic #27-Ubuntu SMP Fri Jul 9 18:40:37 UTC
2021 s390x s390x s390x GNU/Linux
root@t35lp02:~#
dmesg shows the following call trace after reassigning the Mojave ports to
another LPAR
[ 232.218778] mlx5_core 0008:00:00.1: mlx5_wait_for_pages:735:(pid 140):
Skipping wait for vf pages stage
[ 234.108700] mlx5_core 0008:00:00.1: E-Switch: cleanup
[ 234.281483] pci 0008:00:00.1: Removing from iommu group 1
[ 234.281510] ------------[ cut here ]------------
[ 234.281511] WARNING: CPU: 6 PID: 140 at arch/s390/pci/pci.c:374
pcibios_release_device+0xfe/0x110
[ 234.281522] Modules linked in: s390_trng chsc_sch eadm_sch vfio_ccw
vfio_mdev mdev vfio_iommu_type1 vfio sch_fq_codel drm i2c_core
drm_panel_orientation_quirks ip_tables x_tables btrfs blake2b_generic
zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor
async_tx xor raid6_pq libcrc32c raid1 raid0 linear dm_service_time mlx5_ib
ib_uverbs ib_core qeth_l2 pkey zcrypt crc32_vx_s390 ghash_s390 mlx5_core prng
aes_s390 des_s390 libdes sha3_512_s390 sha3_256_s390 sha512_s390 sha256_s390
sha1_s390 sha_common tls mlxfw ptp pps_core zfcp scsi_transport_fc
dasd_eckd_mod dasd_mod qeth qdio ccwgroup scsi_dh_emc scsi_dh_rdac scsi_dh_alua
dm_multipath
[ 234.281573] CPU: 6 PID: 140 Comm: kmcheck Not tainted 5.11.0-25-generic
#27-Ubuntu
[ 234.281575] Hardware name: IBM 8561 T01 703 (LPAR)
[ 234.281576] Krnl PSW : 0704c00180000000 00000000d2af2e92
(pcibios_release_device+0x102/0x110)
[ 234.281581] R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0
RI:0 EA:3
[ 234.281582] Krnl GPRS: 000000000000001f 00000000ffffffff 00000000839dcc00
0000000000000000
[ 234.281584] 0000000000000000 0038008000000000 0000000000000000
0000038000000000
[ 234.281585] 00000000d43efef0 0000000000000006 0000000000000c00
00000000839f5298
[ 234.281586] 0000000083563300 0000000000000000 000003800475fad8
000003800475fa88
[ 234.281596] Krnl Code: 00000000d2af2e84: c0e5003c400e brasl
%r14,00000000d327aea0
00000000d2af2e8a: a7f4ffc3 brc
15,00000000d2af2e10
#00000000d2af2e8e: af000000 mc 0,0
>00000000d2af2e92: e5442006ffff mvhhi
6(%r2),-1
00000000d2af2e98: a7f4ffe2 brc
15,00000000d2af2e5c
00000000d2af2e9c: 0707 bcr 0,%r7
00000000d2af2e9e: 0707 bcr 0,%r7
00000000d2af2ea0: c00400000000 brcl
0,00000000d2af2ea0
[ 234.281605] Call Trace:
[ 234.281607] [<00000000d2af2e92>] pcibios_release_device+0x102/0x110
[ 234.281612] [<00000000d328fa42>] pci_release_dev+0x62/0xa0
[ 234.281620] [<00000000d335f918>] device_release+0x48/0xb0
[ 234.281626] [<00000000d32604d4>] kobject_put+0x174/0x2e0
[ 234.281632] [<00000000d32cfeee>] pci_iov_release+0x4e/0x70
[ 234.281637] [<00000000d328fa2e>] pci_release_dev+0x4e/0xa0
[ 234.281638] [<00000000d335f918>] device_release+0x48/0xb0
[ 234.281640] [<00000000d32604d4>] kobject_put+0x174/0x2e0
[ 234.281642] [<00000000d2af870a>] __zpci_event_availability+0x20a/0x350
[ 234.281644] [<00000000d36c5d60>] chsc_process_crw+0x2d0/0x2f0
[ 234.281650] [<00000000d36cff36>] crw_collect_info+0x276/0x340
[ 234.281653] [<00000000d2b3c4fa>] kthread+0x14a/0x170
[ 234.281657] [<00000000d3728028>] ret_from_fork+0x24/0x2c
[ 234.281662] Last Breaking-Event-Address:
[ 234.281662] [<00000000d2af2e2e>] pcibios_release_device+0x9e/0x110
[ 234.281664] ---[ end trace c37123f53d0bbb72 ]---
[ 236.519017] crw_info : CRW reports slct=0, oflw=0, chn=0, rsc=B, anc=0,
erc=0, rsid=0
root@t35lp02:~#
== Comment: #3 - [email protected]> - 2021-08-04 05:31:00 ==
The first dmesg on Ubuntu 20.04 looks like a Mellanox internal driver problem
which, if my memory serves me correctly, has since been fixed in the Mellanox
driver. As far as I can tell, in the worst case this leaks a few pages.
The output on Ubuntu 21.04 turns out to be a NULL pointer dereference in
zPCI code. However, that was previously hidden by us leaking the struct
pci_dev.
I have analyzed the issue and can reproduce it on current development
kernels, here is what I believe happens:
The backtrace shows a warning in
pcibios_release_device()
zpci_unmap_resources()
pci_iounmap_fh()
which is
WARN_ON(!zpci_iomap_start[idx].count);
That, however, is a red herring: on this z15 machine with a Mojave we have
MIO support, so we should never even enter pci_iounmap_fh().
After adding a debug print in pcibios_release_device(), it turns out that the
struct zpci_dev * we get from to_zpci(pdev) is NULL.
Digging a bit, the problem is that during the detach PCI availability event
we call zpci_zdev_put() as the zdev went away. We already performed the
pci_stop_and_remove_bus_device_locked(pdev) before that point and assumed that
after that the struct pci_dev refcount reaches 0 and the struct will not be
accessed anymore.
This is usually true. Here, however, the problem is that we first removed
the PF for Port 1 while keeping the PF for Port 2.
Now the "struct pci_sriov" in pdev->sriov, where pdev is the PF of Port 2,
has a field struct pci_sriov::dev with the comment "/* Lowest numbered PF */".
This field holds a reference to the struct pci_dev of the PF for Port 1, thus
preventing the refcount of that from reaching 0 until the PF for Port 2 is
released.
When the PF for Port 2 is released, the refcount for the PF of Port 1 reaches 0
and only then do we get the call pci_release_dev() -> pcibios_release_device(),
but at this point the struct zpci_dev was already released and the
zbus->functions[devfn] pointer set to NULL when it was unregistered from the
zbus.
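The ordering problem described above can be sketched with a small userspace
model. This is only an illustration under stated assumptions, not kernel code:
the model_* names, the two-entry functions[] array and the plain integer
refcounts are all hypothetical stand-ins for struct pci_dev, pdev->sriov->dev
and zbus->functions[devfn].

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct zpci_dev. */
struct model_zdev {
    int fid;
};

/* Hypothetical stand-in for struct pci_dev. */
struct model_pdev {
    int refcount;
    int devfn;
    struct model_zdev **functions; /* models zbus->functions[] */
    struct model_pdev *sriov_dev;  /* models pdev->sriov->dev, may be NULL */
    int saw_null_zdev;             /* set if release found no zdev */
};

static void model_pdev_put(struct model_pdev *pdev);

/* Models pci_release_dev(): SR-IOV release first, then the arch hook. */
static void model_pdev_release(struct model_pdev *pdev)
{
    /* Drop the reference held on the lowest numbered PF, if any. */
    if (pdev->sriov_dev && pdev->sriov_dev != pdev)
        model_pdev_put(pdev->sriov_dev);

    /* Models to_zpci(pdev) in pcibios_release_device(). */
    if (pdev->functions[pdev->devfn] == NULL)
        pdev->saw_null_zdev = 1;
}

static void model_pdev_put(struct model_pdev *pdev)
{
    assert(pdev->refcount > 0);
    if (--pdev->refcount == 0)
        model_pdev_release(pdev);
}

/*
 * Unplug both ports in the given order and return how many release
 * callbacks found their zdev pointer already set to NULL.
 */
static int model_unplug(int port1_first)
{
    struct model_zdev z1 = { 0x280 }, z2 = { 0x2c0 };
    struct model_zdev *functions[2] = { &z1, &z2 };
    struct model_pdev pf[2] = {
        { .refcount = 1, .devfn = 0, .functions = functions },
        { .refcount = 1, .devfn = 1, .functions = functions },
    };
    int order[2] = { port1_first ? 0 : 1, port1_first ? 1 : 0 };
    int i;

    /* The PF of Port 2 holds a reference to the PF of Port 1. */
    pf[1].sriov_dev = &pf[0];
    pf[0].refcount++;

    for (i = 0; i < 2; i++) {
        int fn = order[i];

        model_pdev_put(&pf[fn]); /* stop and remove the PCI device */
        functions[fn] = NULL;    /* zdev unregistered from the bus */
    }
    return pf[0].saw_null_zdev + pf[1].saw_null_zdev;
}
```

In this model, model_unplug(1) (Port 1 first) makes the release of the Port 1
PF run only after its functions[] slot was cleared, while model_unplug(0)
(Port 2 first) never observes a NULL pointer.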
Here is /sys/kernel/debug/s390dbf/pci_msg/sprintf output with the added debug
print for my reproduction:
root@t35lp47 ~ # cat /sys/kernel/debug/s390dbf/pci_msg/sprintf
00 01627041292:161310 3 - 0007 0000000b3ffa6560 wb bit: 1
...
00 01627041292:165772 3 - 0007 0000000b3ffa2952 add fid:280, fh:2f80, c:1
00 01627041292:166083 3 - 0007 0000000b3ffa2952 add fid:2c0, fh:3002, c:1
...
00 01627041292:181187 3 - 0007 0000000b3ffa66aa ena fid:280, fh:a3002f80, rc:0
00 01627041292:181194 3 - 0007 0000000b3ffa6728 ena mio fid:280, fh:a3002f80,
rc:0 <-- MIO enabled for PF of Port 1
00 01627041292:182176 3 - 0007 0000000b3ffa66aa ena fid:2c0, fh:a7003002, rc:0
00 01627041292:182181 3 - 0007 0000000b3ffa6728 ena mio fid:2c0, fh:a7003002,
rc:0 <-- MIO enabled for PF of Port 2
....
00 01627041423:815382 3 - 0014 0000000b3ffa2c74 rem fid:280 <-- struct
zpci_dev for Port 1 released but no zpci_unmap_resources() called for it!
00 01627041566:352720 3 - 0012 0000000b3ffa2362 zunmap: zdev:0000000000000000
FID:0, mio:0 <-- zdev is NULL in pcibios_release_device() for Port 1 and we
read the FH/MIO from somewhere in lowcore BAD!
00 01627041566:353274 3 - 0012 0000000b3ffa2362 zunmap: zdev:000000008803c000
FID:2c0, mio:1
00 01627041567:548896 3 - 0012 0000000b3ffa2c74 rem fid:2c0
Thus we have a definite bug in the coordination between the lifetimes of
struct zpci_dev and struct pci_dev where the latter can outlive the former.
I think the problem is that for the struct zpci_dev we only keep exactly one
reference owned by the zPCI core that gets released once the underlying zPCI
device goes away from the view of the zPCI core.
At the same time the struct pci_dev has its own reference counting and
holds its own reference to the struct zpci_dev (indirectly via
pdev->sysdata which is a struct zpci_bus which holds a struct zpci_dev * for
all functions on the bus via zbus->functions[devfn]).
This reference is unaccounted for and can outlive the zPCI core's reference
as seen in the above scenario.
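One way to picture accounting for that reference is the userspace model
below. It is a sketch under assumptions and not the upstream patch: all
model_* names are hypothetical, and it simply lets each modelled pci_dev take
a counted reference on its zdev when it is added and drop it in its release
callback, so the functions[] slot is cleared only when the last reference
goes away.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct zpci_dev, now reference counted. */
struct model_zdev {
    int refcount;
    int devfn;
    struct model_zdev **functions; /* models zbus->functions[] */
    int freed;
};

/* Hypothetical stand-in for struct pci_dev. */
struct model_pdev {
    int refcount;
    int devfn;
    struct model_zdev **functions;
    struct model_pdev *sriov_dev;  /* models pdev->sriov->dev, may be NULL */
    int saw_null_zdev;
};

struct model_result {
    int null_seen;   /* release callbacks that found a NULL zdev */
    int zdevs_freed; /* zdevs freed by the end of the scenario */
};

static void model_zdev_get(struct model_zdev *zdev)
{
    zdev->refcount++;
}

static void model_zdev_put(struct model_zdev *zdev)
{
    assert(zdev->refcount > 0);
    if (--zdev->refcount == 0) {
        /* Only the last put unregisters the function from the bus. */
        zdev->functions[zdev->devfn] = NULL;
        zdev->freed = 1;
    }
}

static void model_pdev_put(struct model_pdev *pdev);

/* The pdev accounts for its long-lived pointer to the zdev. */
static void model_pdev_add(struct model_pdev *pdev)
{
    model_zdev_get(pdev->functions[pdev->devfn]);
}

static void model_pdev_release(struct model_pdev *pdev)
{
    struct model_zdev *zdev;

    if (pdev->sriov_dev && pdev->sriov_dev != pdev)
        model_pdev_put(pdev->sriov_dev);

    zdev = pdev->functions[pdev->devfn];
    if (zdev == NULL) {
        pdev->saw_null_zdev = 1;
        return;
    }
    model_zdev_put(zdev); /* drop the reference taken in model_pdev_add() */
}

static void model_pdev_put(struct model_pdev *pdev)
{
    assert(pdev->refcount > 0);
    if (--pdev->refcount == 0)
        model_pdev_release(pdev);
}

/* Unplug Port 1 before Port 2, the order that triggered the bug. */
static struct model_result model_unplug_port1_first(void)
{
    struct model_zdev *functions[2];
    struct model_zdev z[2] = {
        { .refcount = 1, .devfn = 0, .functions = functions },
        { .refcount = 1, .devfn = 1, .functions = functions },
    };
    struct model_pdev pf[2] = {
        { .refcount = 1, .devfn = 0, .functions = functions },
        { .refcount = 1, .devfn = 1, .functions = functions },
    };
    struct model_result res;
    int fn;

    functions[0] = &z[0];
    functions[1] = &z[1];
    model_pdev_add(&pf[0]);
    model_pdev_add(&pf[1]);

    /* The PF of Port 2 holds a reference to the PF of Port 1. */
    pf[1].sriov_dev = &pf[0];
    pf[0].refcount++;

    for (fn = 0; fn < 2; fn++) {
        model_pdev_put(&pf[fn]); /* stop and remove the PCI device */
        model_zdev_put(&z[fn]);  /* core drops its own reference */
    }

    res.null_seen = pf[0].saw_null_zdev + pf[1].saw_null_zdev;
    res.zdevs_freed = z[0].freed + z[1].freed;
    return res;
}
```

With the extra counted reference, the zdev for Port 1 in this model stays
registered until the delayed release of its pci_dev has run, and both zdevs
are still freed once every holder has dropped its reference.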
== Comment: #4 - [email protected]> - 2021-08-24 09:00:03 ==
A fix for this has now landed upstream with the following commit:
2a671f77ee49 ("s390/pci: fix use after free of zpci_dev")
Note that this references an earlier commit which fixed the leak that
previously hid the issue. That earlier commit has not yet been merged into
Ubuntu 20.04's kernel but is included in 21.04. Only with both fixes does the
freeing of the struct pci_dev work correctly in the tested case:
0b13525c20fe ("s390/pci: fix leak of PCI device structure")
** Affects: linux (Ubuntu)
Importance: Undecided
Assignee: Skipper Bug Screeners (skipper-screen-team)
Status: New
** Tags: architecture-s39064 bugnameltc-193748 severity-low
targetmilestone-inin---
** Tags added: architecture-s39064 bugnameltc-193748 severity-low
targetmilestone-inin---
** Changed in: ubuntu
Assignee: (unassigned) => Skipper Bug Screeners (skipper-screen-team)
** Package changed: ubuntu => linux (Ubuntu)
https://bugs.launchpad.net/bugs/1943464
Title:
Reassign I/O Path of Mojave Port 1 before Port 2 causes NULL
dereference