Public bug reported:
My problem is similar to the one described in this older thread:
https://unix.stackexchange.com/questions/742360/
journalctl output (one of many similar occurrences):
Apr 09 15:37:40.096850 ****** kernel: Linux version 6.5.0-26-lowlatency
(buildd@lcy02-amd64-109) (x86_64-linux-gnu-gcc-12 (Ubuntu
12.3.0-1ubuntu1~22.04) 12.3.0, GNU ld (GNU Binutils for Ubuntu) 2.38)
#26.1~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Wed Mar 13 10:41:42 UTC (Ubuntu
6.5.0-26.26.1~22.04.1-lowlatency 6.5.13)
....................
Apr 09 15:43:46.238697 ****** kernel: nvme nvme0: controller is down; will
reset: CSTS=0xffffffff, PCI_STATUS=0x10
Apr 09 15:43:46.239162 ****** kernel: nvme nvme0: Does your device have a
faulty power saving mode enabled?
Apr 09 15:43:46.239266 ****** kernel: nvme nvme0: Try
"nvme_core.default_ps_max_latency_us=0 pcie_aspm=off" and report a bug
Apr 09 15:43:46.690200 ****** kernel: nvme 0000:06:00.0: enabling device (0000
-> 0002)
Apr 09 15:43:46.690409 ****** kernel: nvme nvme0: Disabling device after reset
failure: -19
Apr 09 15:43:46.698188 ****** kernel: I/O error, dev nvme0n1, sector 1216896 op
0x1:(WRITE) flags 0xc800 phys_seg 1 prio clas>
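
To pull the reset events out of a longer journal, a minimal sketch along these lines may help. It assumes the message wording is exactly as quoted above, and the names `RESET_RE` and `parse_reset_events` are made up for illustration. A CSTS value of 0xffffffff usually means the register read returned all ones, which typically indicates the device stopped responding on the PCIe bus; the sketch only flags that case and does not diagnose the cause.

```python
import re

# Pattern for the "controller is down" message as quoted in this report.
# The exact wording is an assumption and may differ between kernel versions.
RESET_RE = re.compile(
    r"nvme (?P<ctrl>nvme\d+): controller is down; will reset: "
    r"CSTS=(?P<csts>0x[0-9a-fA-F]+), PCI_STATUS=(?P<pci>0x[0-9a-fA-F]+)"
)


def parse_reset_events(log_text):
    """Return one dict per controller-reset message found in log_text."""
    # Mail wrapping splits a log entry across lines, so collapse whitespace first.
    flat = " ".join(log_text.split())
    events = []
    for match in RESET_RE.finditer(flat):
        csts = int(match.group("csts"), 16)
        events.append({
            "controller": match.group("ctrl"),
            "csts": csts,
            "pci_status": int(match.group("pci"), 16),
            # All ones in CSTS: the register read most likely failed outright.
            "all_ones": csts == 0xFFFFFFFF,
        })
    return events
```

Feeding it the text of `journalctl -k` for the affected boot would give a list of events that can be counted or compared between kernel versions.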
I was using 22.04.4 with the HWE kernel (6.5), as shown above. I upgraded
to the 24.04 development release hoping the problem would be resolved, but
it still exists (kernel 6.8).
The problem started after one of the kernel upgrades I installed after
2024-03-01, but I cannot pinpoint which one. The nvme_core kernel parameter
suggested in the message above does not help.
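
Before concluding that the suggested parameters have no effect, it may be worth confirming they actually reached the running kernel. The sketch below checks a kernel command line string for the two settings named in the log message. The function names are hypothetical, and the parsing is simplified: it splits on whitespace and does not handle quoted values that contain spaces.

```python
def parse_cmdline(text):
    """Split a kernel command line into a {name: value} dict.

    Flags without '=' (for example 'quiet') map to None.
    """
    params = {}
    for token in text.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def missing_workaround_params(cmdline_text):
    """Return the suggested settings that are absent or set differently."""
    # Values taken from the kernel's own hint in the log above.
    wanted = {
        "nvme_core.default_ps_max_latency_us": "0",
        "pcie_aspm": "off",
    }
    active = parse_cmdline(cmdline_text)
    return sorted(k for k, v in wanted.items() if active.get(k) != v)
```

On the affected host, the input would be the contents of /proc/cmdline; an empty list means both settings are present. The value in effect for the NVMe setting can also be read from /sys/module/nvme_core/parameters/default_ps_max_latency_us.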
The problem does NOT occur with the regular 22.04 kernel:
As a workaround, I have created a VM that runs my heavy write workload with
PCI passthrough of the NVMe drive, and it works fine. I cannot downgrade the
host to an older kernel because the ZFS pool has already been upgraded.
VM info (where the NVMe drive works fine):
uname -r
5.15.0-78-lowlatency
lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 22.04.4 LTS
Release: 22.04
Codename: jammy
Possibly relevant hardware:
CPU: AMD Ryzen 5750G (x8x4x4)
Chipset: AMD B450
NVMe: Samsung MZ1LB960HBJR-000FB (PM983a, f/w EDW73F2Q)
** Affects: linux (Ubuntu)
Importance: Undecided
Status: New
https://bugs.launchpad.net/bugs/2060770
Title:
NVMe drive fails at high write workload after kernel upgrades