Public bug reported:
On a Dell PowerEdge system running Ubuntu 22.04.5 with a Mellanox
network card, we injected a MalfTLP (Malformed TLP) error by using the
setpci tool to set the MPS (Max Payload Size) value on the EP (endpoint)
smaller than the MPS on the RC/Switch. After the injection, the first
3 bytes of TLP header DW0, DW1, and DW2 are zeroed out in the OS kernel
logs.
However, the TLP headers in the SEL and lspci logs are identical to
each other.
Steps to Reproduce: -
1. Install Ubuntu 22.04.5 OS.
2. Inject a MalfTLP error by setting the MPS value on the EP smaller than
the MPS on the RC/Switch, using the following commands.
Check the original MPS value:
sh# setpci -s 1b:00.0 68.b
If the value is 3f, change it to 1f to lower the MPS from 256 to 128
bytes, using either of the following commands:
sh# setpci -s 1b:. 68.w=591f
sh# setpci -s 1b:. 68.b=1f
3. Check the SEL log, the lspci -vvv output, and the OS kernel log.
4. The dmesg log shows the following error.
------------------------------------------------------------------------------------------
sh# dmesg | grep -i "Hardware Error"
[526427.538510] {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 5
[526427.546890] {1}[Hardware Error]: event severity: recoverable
[526427.552654] {1}[Hardware Error]: Error 0, type: fatal
[526427.557898] {1}[Hardware Error]: section_type: PCIe error
[526427.563572] {1}[Hardware Error]: port_type: 0, PCIe end point
[526427.569597] {1}[Hardware Error]: version: 3.0
[526427.574232] {1}[Hardware Error]: command: 0x0406, status: 0x0010
[526427.580517] {1}[Hardware Error]: device_id: 0000:1b:00.0
[526427.586106] {1}[Hardware Error]: slot: 40
[526427.590396] {1}[Hardware Error]: secondary_bus: 0x00
[526427.595641] {1}[Hardware Error]: vendor_id: 0x15b3, device_id: 0x101d
[526427.602358] {1}[Hardware Error]: class_code: 020000
[526427.607516] {1}[Hardware Error]: aer_uncor_status: 0x00040000, aer_uncor_mask: 0x00010000
[526427.615968] {1}[Hardware Error]: aer_uncor_severity: 0x004ef010
[526427.622172] {1}[Hardware Error]: TLP Header: 00000040 00000000 00000000 00000000
-----------------------------------------------------------------------------------------------------
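The setpci values used in step 2 can be sanity-checked with a small
helper. This is a minimal sketch, assuming that offset 68 is the Device
Control register of the PCI Express capability on this device (as the
steps imply) and that Max_Payload_Size sits in bits 7:5, encoded as
128 << n bytes. The function name mps_bytes is hypothetical.

```shell
#!/usr/bin/env bash
# Decode Max_Payload_Size from a Device Control value given in hex.
# Assumption: the value comes from "setpci -s <dev> 68.b" or "68.w",
# where offset 68 is Device Control on this particular device.
mps_bytes() {
    local hex=${1#0x}                   # accept "3f" or "0x3f"
    local val=$(( 16#$hex ))
    local enc=$(( (val >> 5) & 0x7 ))   # bits 7:5 = Max_Payload_Size
    if [ "$enc" -gt 5 ]; then           # encodings 6 and 7 are reserved
        echo "reserved"
        return 1
    fi
    echo $(( 128 << enc ))
}

mps_bytes 3f     # prints 256
mps_bytes 1f     # prints 128
mps_bytes 591f   # prints 128 (word-sized write from step 2)

# On the affected system (not run here):
#   mps_bytes "$(setpci -s 1b:00.0 68.b)"
```

Only bits 7:5 of the low byte select the payload size, so 3f decodes
to 256 bytes and 1f to 128 bytes.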
Expected Results: - The TLP header in the OS kernel log should be
identical to the one listed in the SEL and shown by
lspci -s 1b:00.0 -vvv
Actual Results: - The first three bytes of TLP header DW0, DW1, and DW2
are zeroed out in the OS kernel logs.
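The comparison between the kernel log and lspci can be scripted. This
is a rough sketch, assuming the kernel prints the header words after
"TLP Header:" and lspci -vvv prints them after "HeaderLog:" in the AER
capability. The function names header_dwords and compare_headers are
hypothetical.

```shell
#!/usr/bin/env bash
# Print the four header DWORDs found after a "TLP Header:" (dmesg) or
# "HeaderLog:" (lspci -vvv) label, lowercased and space-separated.
header_dwords() {
    printf '%s\n' "$1" \
        | sed -E 's/.*(TLP Header|HeaderLog):[[:space:]]*//' \
        | tr 'A-F' 'a-f' \
        | grep -owE '[0-9a-f]{8}' \
        | head -n 4 \
        | tr '\n' ' ' \
        | sed 's/ $//'
}

# Compare the header words of a kernel log line and an lspci line.
compare_headers() {
    local k l
    k=$(header_dwords "$1")
    l=$(header_dwords "$2")
    if [ "$k" = "$l" ]; then
        echo "match"
    else
        echo "mismatch: kernel=[$k] lspci=[$l]"
        return 1
    fi
}

# Kernel line taken from the dmesg output in this report.
kernel_line='[526427.622172] {1}[Hardware Error]: TLP Header: 00000040 00000000 00000000 00000000'
header_dwords "$kernel_line"

# On the affected system (not run here):
#   compare_headers "$(dmesg | grep 'TLP Header')" \
#                   "$(lspci -s 1b:00.0 -vvv | grep HeaderLog)"
```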
** Affects: linux (Ubuntu)
Importance: Undecided
Status: New
https://bugs.launchpad.net/bugs/2121858
Title:
[Ubuntu 22.04.5 BUG] Ubuntu 22.04.5 OS kernel logging not showing the
TLP header properly.