[Kernel-packages] [Bug 1798127] Re: CPU Soft Lockups when stress-ng stack stressor runs with M.2 NVMe as root FS
Marked all tasks invalid. It's been over 2 years and no further update from the tester who reported this initially. One can only presume it's been resolved by some update to the kernel, or was not kernel related and resolved itself by other means. If this appears again, we'll open a fresh bug. ** Changed in: linux (Ubuntu Cosmic) Status: Triaged => Invalid ** Changed in: linux (Ubuntu Cosmic) Assignee: Joseph Salisbury (jsalisbury) => (unassigned) ** Changed in: linux (Ubuntu Bionic) Status: Triaged => Invalid ** Changed in: linux (Ubuntu) Status: Triaged => Invalid ** Changed in: linux (Ubuntu) Assignee: Joseph Salisbury (jsalisbury) => (unassigned) ** Changed in: linux (Ubuntu Bionic) Assignee: Joseph Salisbury (jsalisbury) => (unassigned) -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1798127 Title: CPU Soft Lockups when stress-ng stack stressor runs with M.2 NVMe as root FS Status in linux package in Ubuntu: Invalid Status in linux source package in Xenial: Invalid Status in linux source package in Bionic: Invalid Status in linux source package in Cosmic: Invalid Bug description: This was reported by a hardware partner. The system set up is a server with 512GB RAM and an M.2 NVMe drive as the root filesystem/boot device. Per the customer, when running the certification Memory Stress test (utilizing several stress-ng stressors run in sequence) the system freezes with CPU Soft Lockup errors appearing on console whe the "stack" stressor is run. Tester has tried with 2.5” SATA (1TB), 2.5” NVMe (800GB), and M.2 NVMe (1.9TB). So far, this only seems to affect the 4.15 kernel. The tester has tried using the 2.5" SATA SSD as the RootFS/Boot device and the tests pass on all attempts. It is ONLY when using the M.2 NVMe as the root / boot device that the tests cause a lockup. The tester is re-trying now with the 2.5" NVMe device to see if this only occurs with the M.2 NVMe. The tester has tried this on the following while using the M.2 NVMe as the rootFS/Boot device: Test run #1 – 16.04.5 at kernel 4.15; Result: Failed stress-ng memory on stack stressor Test run #2 – 18.04.1 at kernel 4.15; Result: Failed stress-ng memory on stack stressor Test run #3 – 16.04.5 at kernel 4.4; Result: Passed stress-ng memory test The stress-ng command invoked at the time the soft lockups occur is this: 'stress-ng -k --aggressive --verify --timeout 300 --stack 0' This can be reproduced by running the memory_stress_ng test script from the cert suite: sudo /usr/lib/plainbox-provider-certification- server/bin/memory_stress_ng It may be more easily reproducible running the stack stressor alone, or the whole memory stress script without dealing with Checkbox. UPDATE: The tester also confirms that the 2.5" NVMe drives also fail with the 4.15 kernel and pass with the 4.4 kernel. The SSD works on all kernels. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1798127/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 1798127] Re: CPU Soft Lockups when stress-ng stack stressor runs with M.2 NVMe as root FS
** Tags added: cscc -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1798127 Title: CPU Soft Lockups when stress-ng stack stressor runs with M.2 NVMe as root FS Status in linux package in Ubuntu: Triaged Status in linux source package in Xenial: Invalid Status in linux source package in Bionic: Triaged Status in linux source package in Cosmic: Triaged Bug description: This was reported by a hardware partner. The system set up is a server with 512GB RAM and an M.2 NVMe drive as the root filesystem/boot device. Per the customer, when running the certification Memory Stress test (utilizing several stress-ng stressors run in sequence) the system freezes with CPU Soft Lockup errors appearing on console whe the "stack" stressor is run. Tester has tried with 2.5” SATA (1TB), 2.5” NVMe (800GB), and M.2 NVMe (1.9TB). So far, this only seems to affect the 4.15 kernel. The tester has tried using the 2.5" SATA SSD as the RootFS/Boot device and the tests pass on all attempts. It is ONLY when using the M.2 NVMe as the root / boot device that the tests cause a lockup. The tester is re-trying now with the 2.5" NVMe device to see if this only occurs with the M.2 NVMe. The tester has tried this on the following while using the M.2 NVMe as the rootFS/Boot device: Test run #1 – 16.04.5 at kernel 4.15; Result: Failed stress-ng memory on stack stressor Test run #2 – 18.04.1 at kernel 4.15; Result: Failed stress-ng memory on stack stressor Test run #3 – 16.04.5 at kernel 4.4; Result: Passed stress-ng memory test The stress-ng command invoked at the time the soft lockups occur is this: 'stress-ng -k --aggressive --verify --timeout 300 --stack 0' This can be reproduced by running the memory_stress_ng test script from the cert suite: sudo /usr/lib/plainbox-provider-certification- server/bin/memory_stress_ng It may be more easily reproducible running the stack stressor alone, or the whole memory stress script without dealing with Checkbox. UPDATE: The tester also confirms that the 2.5" NVMe drives also fail with the 4.15 kernel and pass with the 4.4 kernel. The SSD works on all kernels. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1798127/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
Re: [Kernel-packages] [Bug 1798127] Re: CPU Soft Lockups when stress-ng stack stressor runs with M.2 NVMe as root FS
Ahhh, ok. Thanks. That makes sense, then. On Fri, Nov 16, 2018 at 12:01 AM Joseph Salisbury wrote: > > The Xenial bug task is for the base Xenial kernel version 4.4. Any > commits/fixes applied to Bionic flow down into Xenial HWE because Bionic > is 4.15 based, which is the source for Xenial HWE. > > -- > You received this bug notification because you are subscribed to the bug > report. > https://bugs.launchpad.net/bugs/1798127 > > Title: > CPU Soft Lockups when stress-ng stack stressor runs with M.2 NVMe as > root FS > > Status in linux package in Ubuntu: > Triaged > Status in linux source package in Xenial: > Invalid > Status in linux source package in Bionic: > Triaged > Status in linux source package in Cosmic: > Triaged > > Bug description: > This was reported by a hardware partner. The system set up is a > server with 512GB RAM and an M.2 NVMe drive as the root > filesystem/boot device. > > Per the customer, when running the certification Memory Stress test > (utilizing several stress-ng stressors run in sequence) the system > freezes with CPU Soft Lockup errors appearing on console whe the > "stack" stressor is run. > > Tester has tried with 2.5” SATA (1TB), 2.5” NVMe (800GB), and M.2 NVMe > (1.9TB). > > So far, this only seems to affect the 4.15 kernel. The tester has > tried using the 2.5" SATA SSD as the RootFS/Boot device and the tests > pass on all attempts. It is ONLY when using the M.2 NVMe as the root > / boot device that the tests cause a lockup. The tester is re-trying > now with the 2.5" NVMe device to see if this only occurs with the M.2 > NVMe. > > The tester has tried this on the following while using the M.2 NVMe as the > rootFS/Boot device: > Test run #1 – 16.04.5 at kernel 4.15; Result: Failed stress-ng memory on > stack stressor > Test run #2 – 18.04.1 at kernel 4.15; Result: Failed stress-ng memory on > stack stressor > Test run #3 – 16.04.5 at kernel 4.4; Result: Passed stress-ng memory test > > The stress-ng command invoked at the time the soft lockups occur is > this: > > 'stress-ng -k --aggressive --verify --timeout 300 --stack 0' > > This can be reproduced by running the memory_stress_ng test script > from the cert suite: > > sudo /usr/lib/plainbox-provider-certification- > server/bin/memory_stress_ng > > It may be more easily reproducible running the stack stressor alone, > or the whole memory stress script without dealing with Checkbox. > > UPDATE: The tester also confirms that the 2.5" NVMe drives also fail > with the 4.15 kernel and pass with the 4.4 kernel. The SSD works on > all kernels. > > To manage notifications about this bug go to: > https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1798127/+subscriptions > > Launchpad-Notification-Type: bug > Launchpad-Bug: distribution=ubuntu; sourcepackage=linux; component=main; > status=Triaged; importance=High; assignee=joseph.salisb...@canonical.com; > Launchpad-Bug: distribution=ubuntu; distroseries=xenial; sourcepackage=linux; > component=main; status=Invalid; importance=Undecided; assignee=None; > Launchpad-Bug: distribution=ubuntu; distroseries=bionic; sourcepackage=linux; > component=main; status=Triaged; importance=High; > assignee=joseph.salisb...@canonical.com; > Launchpad-Bug: distribution=ubuntu; distroseries=cosmic; sourcepackage=linux; > component=main; status=Triaged; importance=High; > assignee=joseph.salisb...@canonical.com; > Launchpad-Bug-Tags: blocks-hwcert-server kernel-da-key kernel-fixed-upstream > Launchpad-Bug-Information-Type: Public > Launchpad-Bug-Private: no > Launchpad-Bug-Security-Vulnerability: no > Launchpad-Bug-Commenters: acduroy bladernr jsalisbury ubuntu-kernel-bot > Launchpad-Bug-Reporter: Jeff Lane (bladernr) > Launchpad-Bug-Modifier: Joseph Salisbury (jsalisbury) > Launchpad-Message-Rationale: Subscriber > Launchpad-Message-For: bladernr -- Jeff Lane Technical Partnership and Server Certification Programmes "Entropy isn't what it used to be." -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1798127 Title: CPU Soft Lockups when stress-ng stack stressor runs with M.2 NVMe as root FS Status in linux package in Ubuntu: Triaged Status in linux source package in Xenial: Invalid Status in linux source package in Bionic: Triaged Status in linux source package in Cosmic: Triaged Bug description: This was reported by a hardware partner. The system set up is a server with 512GB RAM and an M.2 NVMe drive as the root filesystem/boot device. Per the customer, when running the certification Memory Stress test (utilizing several stress-ng stressors run in sequence) the system freezes with CPU Soft Lockup errors appearing on console whe the "stack" stressor is run. Tester has tried with 2.5” SATA (1TB), 2.5” NVMe (800GB), and M.2 NVMe (1.9TB). So far, this only seem
[Kernel-packages] [Bug 1798127] Re: CPU Soft Lockups when stress-ng stack stressor runs with M.2 NVMe as root FS
The Xenial bug task is for the base Xenial kernel version 4.4. Any commits/fixes applied to Bionic flow down into Xenial HWE because Bionic is 4.15 based, which is the source for Xenial HWE. -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1798127 Title: CPU Soft Lockups when stress-ng stack stressor runs with M.2 NVMe as root FS Status in linux package in Ubuntu: Triaged Status in linux source package in Xenial: Invalid Status in linux source package in Bionic: Triaged Status in linux source package in Cosmic: Triaged Bug description: This was reported by a hardware partner. The system set up is a server with 512GB RAM and an M.2 NVMe drive as the root filesystem/boot device. Per the customer, when running the certification Memory Stress test (utilizing several stress-ng stressors run in sequence) the system freezes with CPU Soft Lockup errors appearing on console whe the "stack" stressor is run. Tester has tried with 2.5” SATA (1TB), 2.5” NVMe (800GB), and M.2 NVMe (1.9TB). So far, this only seems to affect the 4.15 kernel. The tester has tried using the 2.5" SATA SSD as the RootFS/Boot device and the tests pass on all attempts. It is ONLY when using the M.2 NVMe as the root / boot device that the tests cause a lockup. The tester is re-trying now with the 2.5" NVMe device to see if this only occurs with the M.2 NVMe. The tester has tried this on the following while using the M.2 NVMe as the rootFS/Boot device: Test run #1 – 16.04.5 at kernel 4.15; Result: Failed stress-ng memory on stack stressor Test run #2 – 18.04.1 at kernel 4.15; Result: Failed stress-ng memory on stack stressor Test run #3 – 16.04.5 at kernel 4.4; Result: Passed stress-ng memory test The stress-ng command invoked at the time the soft lockups occur is this: 'stress-ng -k --aggressive --verify --timeout 300 --stack 0' This can be reproduced by running the memory_stress_ng test script from the cert suite: sudo /usr/lib/plainbox-provider-certification- server/bin/memory_stress_ng It may be more easily reproducible running the stack stressor alone, or the whole memory stress script without dealing with Checkbox. UPDATE: The tester also confirms that the 2.5" NVMe drives also fail with the 4.15 kernel and pass with the 4.4 kernel. The SSD works on all kernels. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1798127/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 1798127] Re: CPU Soft Lockups when stress-ng stack stressor runs with M.2 NVMe as root FS
Why is Xenial marked invalid? AFAIK this affects 4.15, but not 4.4, which means Xenial HWE also has this regression. -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1798127 Title: CPU Soft Lockups when stress-ng stack stressor runs with M.2 NVMe as root FS Status in linux package in Ubuntu: Triaged Status in linux source package in Xenial: Invalid Status in linux source package in Bionic: Triaged Status in linux source package in Cosmic: Triaged Bug description: This was reported by a hardware partner. The system set up is a server with 512GB RAM and an M.2 NVMe drive as the root filesystem/boot device. Per the customer, when running the certification Memory Stress test (utilizing several stress-ng stressors run in sequence) the system freezes with CPU Soft Lockup errors appearing on console whe the "stack" stressor is run. Tester has tried with 2.5” SATA (1TB), 2.5” NVMe (800GB), and M.2 NVMe (1.9TB). So far, this only seems to affect the 4.15 kernel. The tester has tried using the 2.5" SATA SSD as the RootFS/Boot device and the tests pass on all attempts. It is ONLY when using the M.2 NVMe as the root / boot device that the tests cause a lockup. The tester is re-trying now with the 2.5" NVMe device to see if this only occurs with the M.2 NVMe. The tester has tried this on the following while using the M.2 NVMe as the rootFS/Boot device: Test run #1 – 16.04.5 at kernel 4.15; Result: Failed stress-ng memory on stack stressor Test run #2 – 18.04.1 at kernel 4.15; Result: Failed stress-ng memory on stack stressor Test run #3 – 16.04.5 at kernel 4.4; Result: Passed stress-ng memory test The stress-ng command invoked at the time the soft lockups occur is this: 'stress-ng -k --aggressive --verify --timeout 300 --stack 0' This can be reproduced by running the memory_stress_ng test script from the cert suite: sudo /usr/lib/plainbox-provider-certification- server/bin/memory_stress_ng It may be more easily reproducible running the stack stressor alone, or the whole memory stress script without dealing with Checkbox. UPDATE: The tester also confirms that the 2.5" NVMe drives also fail with the 4.15 kernel and pass with the 4.4 kernel. The SSD works on all kernels. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1798127/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 1798127] Re: CPU Soft Lockups when stress-ng stack stressor runs with M.2 NVMe as root FS
Would it be possible for you to test the latest upstream stable 4.18? It can be downloaded from: http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.18.19/ ** Tags removed: kernel-key ** Tags added: kernel-da-key ** Changed in: linux (Ubuntu Bionic) Assignee: (unassigned) => Joseph Salisbury (jsalisbury) ** Changed in: linux (Ubuntu Bionic) Status: Confirmed => Triaged ** Also affects: linux (Ubuntu Cosmic) Importance: Undecided Status: New ** Changed in: linux (Ubuntu Cosmic) Status: New => Triaged ** Changed in: linux (Ubuntu Cosmic) Importance: Undecided => High ** Changed in: linux (Ubuntu Cosmic) Assignee: (unassigned) => Joseph Salisbury (jsalisbury) ** Also affects: linux (Ubuntu Xenial) Importance: Undecided Status: New ** Changed in: linux (Ubuntu Xenial) Status: New => Invalid ** Changed in: linux (Ubuntu) Status: Confirmed => Triaged ** Changed in: linux (Ubuntu) Assignee: (unassigned) => Joseph Salisbury (jsalisbury) -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1798127 Title: CPU Soft Lockups when stress-ng stack stressor runs with M.2 NVMe as root FS Status in linux package in Ubuntu: Triaged Status in linux source package in Xenial: Invalid Status in linux source package in Bionic: Triaged Status in linux source package in Cosmic: Triaged Bug description: This was reported by a hardware partner. The system set up is a server with 512GB RAM and an M.2 NVMe drive as the root filesystem/boot device. Per the customer, when running the certification Memory Stress test (utilizing several stress-ng stressors run in sequence) the system freezes with CPU Soft Lockup errors appearing on console whe the "stack" stressor is run. Tester has tried with 2.5” SATA (1TB), 2.5” NVMe (800GB), and M.2 NVMe (1.9TB). So far, this only seems to affect the 4.15 kernel. The tester has tried using the 2.5" SATA SSD as the RootFS/Boot device and the tests pass on all attempts. It is ONLY when using the M.2 NVMe as the root / boot device that the tests cause a lockup. The tester is re-trying now with the 2.5" NVMe device to see if this only occurs with the M.2 NVMe. The tester has tried this on the following while using the M.2 NVMe as the rootFS/Boot device: Test run #1 – 16.04.5 at kernel 4.15; Result: Failed stress-ng memory on stack stressor Test run #2 – 18.04.1 at kernel 4.15; Result: Failed stress-ng memory on stack stressor Test run #3 – 16.04.5 at kernel 4.4; Result: Passed stress-ng memory test The stress-ng command invoked at the time the soft lockups occur is this: 'stress-ng -k --aggressive --verify --timeout 300 --stack 0' This can be reproduced by running the memory_stress_ng test script from the cert suite: sudo /usr/lib/plainbox-provider-certification- server/bin/memory_stress_ng It may be more easily reproducible running the stack stressor alone, or the whole memory stress script without dealing with Checkbox. UPDATE: The tester also confirms that the 2.5" NVMe drives also fail with the 4.15 kernel and pass with the 4.4 kernel. The SSD works on all kernels. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1798127/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 1798127] Re: CPU Soft Lockups when stress-ng stack stressor runs with M.2 NVMe as root FS
** Tags added: kernel-fixed-upstream ** Changed in: linux (Ubuntu Bionic) Status: Triaged => Confirmed -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1798127 Title: CPU Soft Lockups when stress-ng stack stressor runs with M.2 NVMe as root FS Status in linux package in Ubuntu: Confirmed Status in linux source package in Bionic: Confirmed Bug description: This was reported by a hardware partner. The system set up is a server with 512GB RAM and an M.2 NVMe drive as the root filesystem/boot device. Per the customer, when running the certification Memory Stress test (utilizing several stress-ng stressors run in sequence) the system freezes with CPU Soft Lockup errors appearing on console whe the "stack" stressor is run. Tester has tried with 2.5” SATA (1TB), 2.5” NVMe (800GB), and M.2 NVMe (1.9TB). So far, this only seems to affect the 4.15 kernel. The tester has tried using the 2.5" SATA SSD as the RootFS/Boot device and the tests pass on all attempts. It is ONLY when using the M.2 NVMe as the root / boot device that the tests cause a lockup. The tester is re-trying now with the 2.5" NVMe device to see if this only occurs with the M.2 NVMe. The tester has tried this on the following while using the M.2 NVMe as the rootFS/Boot device: Test run #1 – 16.04.5 at kernel 4.15; Result: Failed stress-ng memory on stack stressor Test run #2 – 18.04.1 at kernel 4.15; Result: Failed stress-ng memory on stack stressor Test run #3 – 16.04.5 at kernel 4.4; Result: Passed stress-ng memory test The stress-ng command invoked at the time the soft lockups occur is this: 'stress-ng -k --aggressive --verify --timeout 300 --stack 0' This can be reproduced by running the memory_stress_ng test script from the cert suite: sudo /usr/lib/plainbox-provider-certification- server/bin/memory_stress_ng It may be more easily reproducible running the stack stressor alone, or the whole memory stress script without dealing with Checkbox. UPDATE: The tester also confirms that the 2.5" NVMe drives also fail with the 4.15 kernel and pass with the 4.4 kernel. The SSD works on all kernels. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1798127/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 1798127] Re: CPU Soft Lockups when stress-ng stack stressor runs with M.2 NVMe as root FS
I added a nomination for Xenial, but that may not be the right way (I'm not sure how to add a Xenial task like the Bionic task already added. But this is a regression in as much as on Xenial this does not occur on the 4.4 kernel but shows up at least in the 4.15 HWE kernel. For Bionic it seems to exist in both 4.15 and 4.18 (in cosmic). So any fix that is pulled into 4.18 will need to be pulled into 4.15 for both Bionic and Xenial. Thanks! -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1798127 Title: CPU Soft Lockups when stress-ng stack stressor runs with M.2 NVMe as root FS Status in linux package in Ubuntu: Confirmed Status in linux source package in Bionic: Triaged Bug description: This was reported by a hardware partner. The system set up is a server with 512GB RAM and an M.2 NVMe drive as the root filesystem/boot device. Per the customer, when running the certification Memory Stress test (utilizing several stress-ng stressors run in sequence) the system freezes with CPU Soft Lockup errors appearing on console whe the "stack" stressor is run. Tester has tried with 2.5” SATA (1TB), 2.5” NVMe (800GB), and M.2 NVMe (1.9TB). So far, this only seems to affect the 4.15 kernel. The tester has tried using the 2.5" SATA SSD as the RootFS/Boot device and the tests pass on all attempts. It is ONLY when using the M.2 NVMe as the root / boot device that the tests cause a lockup. The tester is re-trying now with the 2.5" NVMe device to see if this only occurs with the M.2 NVMe. The tester has tried this on the following while using the M.2 NVMe as the rootFS/Boot device: Test run #1 – 16.04.5 at kernel 4.15; Result: Failed stress-ng memory on stack stressor Test run #2 – 18.04.1 at kernel 4.15; Result: Failed stress-ng memory on stack stressor Test run #3 – 16.04.5 at kernel 4.4; Result: Passed stress-ng memory test The stress-ng command invoked at the time the soft lockups occur is this: 'stress-ng -k --aggressive --verify --timeout 300 --stack 0' This can be reproduced by running the memory_stress_ng test script from the cert suite: sudo /usr/lib/plainbox-provider-certification- server/bin/memory_stress_ng It may be more easily reproducible running the stack stressor alone, or the whole memory stress script without dealing with Checkbox. UPDATE: The tester also confirms that the 2.5" NVMe drives also fail with the 4.15 kernel and pass with the 4.4 kernel. The SSD works on all kernels. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1798127/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 1798127] Re: CPU Soft Lockups when stress-ng stack stressor runs with M.2 NVMe as root FS
Successfully upgraded to 4.19. Stress-ng memory test passed without lockup issue. New rebuild kernel 4.19 has fixed the issue. Refer to text below: --- ubuntu@fluent-orca:~$ uname -r 4.19.0-041900rc8-generic ubuntu@fluent-orca:~$ sudo stress-ng -k --aggressive --verify --timeout 300 --stack 0 stress-ng: info: [3516] dispatching hogs: 112 stack stress-ng: info: [3516] successful run completed in 311.37s (5 mins, 11.37 secs) ubuntu@fluent-orca:~$ -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1798127 Title: CPU Soft Lockups when stress-ng stack stressor runs with M.2 NVMe as root FS Status in linux package in Ubuntu: Confirmed Status in linux source package in Bionic: Triaged Bug description: This was reported by a hardware partner. The system set up is a server with 512GB RAM and an M.2 NVMe drive as the root filesystem/boot device. Per the customer, when running the certification Memory Stress test (utilizing several stress-ng stressors run in sequence) the system freezes with CPU Soft Lockup errors appearing on console whe the "stack" stressor is run. Tester has tried with 2.5” SATA (1TB), 2.5” NVMe (800GB), and M.2 NVMe (1.9TB). So far, this only seems to affect the 4.15 kernel. The tester has tried using the 2.5" SATA SSD as the RootFS/Boot device and the tests pass on all attempts. It is ONLY when using the M.2 NVMe as the root / boot device that the tests cause a lockup. The tester is re-trying now with the 2.5" NVMe device to see if this only occurs with the M.2 NVMe. The tester has tried this on the following while using the M.2 NVMe as the rootFS/Boot device: Test run #1 – 16.04.5 at kernel 4.15; Result: Failed stress-ng memory on stack stressor Test run #2 – 18.04.1 at kernel 4.15; Result: Failed stress-ng memory on stack stressor Test run #3 – 16.04.5 at kernel 4.4; Result: Passed stress-ng memory test The stress-ng command invoked at the time the soft lockups occur is this: 'stress-ng -k --aggressive --verify --timeout 300 --stack 0' This can be reproduced by running the memory_stress_ng test script from the cert suite: sudo /usr/lib/plainbox-provider-certification- server/bin/memory_stress_ng It may be more easily reproducible running the stack stressor alone, or the whole memory stress script without dealing with Checkbox. UPDATE: The tester also confirms that the 2.5" NVMe drives also fail with the 4.15 kernel and pass with the 4.4 kernel. The SSD works on all kernels. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1798127/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 1798127] Re: CPU Soft Lockups when stress-ng stack stressor runs with M.2 NVMe as root FS
'kernel-fixed-upstream' -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1798127 Title: CPU Soft Lockups when stress-ng stack stressor runs with M.2 NVMe as root FS Status in linux package in Ubuntu: Confirmed Status in linux source package in Bionic: Triaged Bug description: This was reported by a hardware partner. The system set up is a server with 512GB RAM and an M.2 NVMe drive as the root filesystem/boot device. Per the customer, when running the certification Memory Stress test (utilizing several stress-ng stressors run in sequence) the system freezes with CPU Soft Lockup errors appearing on console whe the "stack" stressor is run. Tester has tried with 2.5” SATA (1TB), 2.5” NVMe (800GB), and M.2 NVMe (1.9TB). So far, this only seems to affect the 4.15 kernel. The tester has tried using the 2.5" SATA SSD as the RootFS/Boot device and the tests pass on all attempts. It is ONLY when using the M.2 NVMe as the root / boot device that the tests cause a lockup. The tester is re-trying now with the 2.5" NVMe device to see if this only occurs with the M.2 NVMe. The tester has tried this on the following while using the M.2 NVMe as the rootFS/Boot device: Test run #1 – 16.04.5 at kernel 4.15; Result: Failed stress-ng memory on stack stressor Test run #2 – 18.04.1 at kernel 4.15; Result: Failed stress-ng memory on stack stressor Test run #3 – 16.04.5 at kernel 4.4; Result: Passed stress-ng memory test The stress-ng command invoked at the time the soft lockups occur is this: 'stress-ng -k --aggressive --verify --timeout 300 --stack 0' This can be reproduced by running the memory_stress_ng test script from the cert suite: sudo /usr/lib/plainbox-provider-certification- server/bin/memory_stress_ng It may be more easily reproducible running the stack stressor alone, or the whole memory stress script without dealing with Checkbox. UPDATE: The tester also confirms that the 2.5" NVMe drives also fail with the 4.15 kernel and pass with the 4.4 kernel. The SSD works on all kernels. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1798127/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 1798127] Re: CPU Soft Lockups when stress-ng stack stressor runs with M.2 NVMe as root FS
** Changed in: linux (Ubuntu) Status: Triaged => Confirmed -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1798127 Title: CPU Soft Lockups when stress-ng stack stressor runs with M.2 NVMe as root FS Status in linux package in Ubuntu: Confirmed Status in linux source package in Bionic: Triaged Bug description: This was reported by a hardware partner. The system set up is a server with 512GB RAM and an M.2 NVMe drive as the root filesystem/boot device. Per the customer, when running the certification Memory Stress test (utilizing several stress-ng stressors run in sequence) the system freezes with CPU Soft Lockup errors appearing on console whe the "stack" stressor is run. Tester has tried with 2.5” SATA (1TB), 2.5” NVMe (800GB), and M.2 NVMe (1.9TB). So far, this only seems to affect the 4.15 kernel. The tester has tried using the 2.5" SATA SSD as the RootFS/Boot device and the tests pass on all attempts. It is ONLY when using the M.2 NVMe as the root / boot device that the tests cause a lockup. The tester is re-trying now with the 2.5" NVMe device to see if this only occurs with the M.2 NVMe. The tester has tried this on the following while using the M.2 NVMe as the rootFS/Boot device: Test run #1 – 16.04.5 at kernel 4.15; Result: Failed stress-ng memory on stack stressor Test run #2 – 18.04.1 at kernel 4.15; Result: Failed stress-ng memory on stack stressor Test run #3 – 16.04.5 at kernel 4.4; Result: Passed stress-ng memory test The stress-ng command invoked at the time the soft lockups occur is this: 'stress-ng -k --aggressive --verify --timeout 300 --stack 0' This can be reproduced by running the memory_stress_ng test script from the cert suite: sudo /usr/lib/plainbox-provider-certification- server/bin/memory_stress_ng It may be more easily reproducible running the stack stressor alone, or the whole memory stress script without dealing with Checkbox. UPDATE: The tester also confirms that the 2.5" NVMe drives also fail with the 4.15 kernel and pass with the 4.4 kernel. The SSD works on all kernels. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1798127/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 1798127] Re: CPU Soft Lockups when stress-ng stack stressor runs with M.2 NVMe as root FS
** Description changed: This was reported by a hardware partner. The system set up is a server with 512GB RAM and an M.2 NVMe drive as the root filesystem/boot device. Per the customer, when running the certification Memory Stress test (utilizing several stress-ng stressors run in sequence) the system freezes with CPU Soft Lockup errors appearing on console whe the "stack" stressor is run. Tester has tried with 2.5” SATA (1TB), 2.5” NVMe (800GB), and M.2 NVMe (1.9TB). So far, this only seems to affect the 4.15 kernel. The tester has tried using the 2.5" SATA SSD as the RootFS/Boot device and the tests pass on all attempts. It is ONLY when using the M.2 NVMe as the root / boot device that the tests cause a lockup. The tester is re-trying now with the 2.5" NVMe device to see if this only occurs with the M.2 NVMe. The tester has tried this on the following while using the M.2 NVMe as the rootFS/Boot device: Test run #1 – 16.04.5 at kernel 4.15; Result: Failed stress-ng memory on stack stressor Test run #2 – 18.04.1 at kernel 4.15; Result: Failed stress-ng memory on stack stressor Test run #3 – 16.04.5 at kernel 4.4; Result: Passed stress-ng memory test - - The stress-ng command invoked at the time the soft lockups occur is this: + The stress-ng command invoked at the time the soft lockups occur is + this: 'stress-ng -k --aggressive --verify --timeout 300 --stack 0' This can be reproduced by running the memory_stress_ng test script from the cert suite: sudo /usr/lib/plainbox-provider-certification- server/bin/memory_stress_ng It may be more easily reproducible running the stack stressor alone, or the whole memory stress script without dealing with Checkbox. + + UPDATE: The tester also confirms that the 2.5" NVMe drives also fail + with the 4.15 kernel and pass with the 4.4 kernel. The SSD works on all + kernels. -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1798127 Title: CPU Soft Lockups when stress-ng stack stressor runs with M.2 NVMe as root FS Status in linux package in Ubuntu: Triaged Status in linux source package in Bionic: Triaged Bug description: This was reported by a hardware partner. The system set up is a server with 512GB RAM and an M.2 NVMe drive as the root filesystem/boot device. Per the customer, when running the certification Memory Stress test (utilizing several stress-ng stressors run in sequence) the system freezes with CPU Soft Lockup errors appearing on console whe the "stack" stressor is run. Tester has tried with 2.5” SATA (1TB), 2.5” NVMe (800GB), and M.2 NVMe (1.9TB). So far, this only seems to affect the 4.15 kernel. The tester has tried using the 2.5" SATA SSD as the RootFS/Boot device and the tests pass on all attempts. It is ONLY when using the M.2 NVMe as the root / boot device that the tests cause a lockup. The tester is re-trying now with the 2.5" NVMe device to see if this only occurs with the M.2 NVMe. The tester has tried this on the following while using the M.2 NVMe as the rootFS/Boot device: Test run #1 – 16.04.5 at kernel 4.15; Result: Failed stress-ng memory on stack stressor Test run #2 – 18.04.1 at kernel 4.15; Result: Failed stress-ng memory on stack stressor Test run #3 – 16.04.5 at kernel 4.4; Result: Passed stress-ng memory test The stress-ng command invoked at the time the soft lockups occur is this: 'stress-ng -k --aggressive --verify --timeout 300 --stack 0' This can be reproduced by running the memory_stress_ng test script from the cert suite: sudo /usr/lib/plainbox-provider-certification- server/bin/memory_stress_ng It may be more easily reproducible running the stack stressor alone, or the whole memory stress script without dealing with Checkbox. UPDATE: The tester also confirms that the 2.5" NVMe drives also fail with the 4.15 kernel and pass with the 4.4 kernel. The SSD works on all kernels. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1798127/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 1798127] Re: CPU Soft Lockups when stress-ng stack stressor runs with M.2 NVMe as root FS
Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v4.19 kernel[0]. If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'. If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'. Once testing of the upstream kernel is complete, please mark this bug as "Confirmed". Thanks in advance. [0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.19-rc8 ** Changed in: linux (Ubuntu) Importance: Undecided => Medium ** Changed in: linux (Ubuntu) Status: Incomplete => Triaged ** Also affects: linux (Ubuntu Bionic) Importance: Undecided Status: New ** Changed in: linux (Ubuntu Bionic) Status: New => Triaged ** Changed in: linux (Ubuntu Bionic) Importance: Undecided => Medium ** Tags added: kernel-da-key ** Changed in: linux (Ubuntu Bionic) Importance: Medium => High ** Changed in: linux (Ubuntu) Importance: Medium => High ** Tags removed: kernel-da-key ** Tags added: kernel-key -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1798127 Title: CPU Soft Lockups when stress-ng stack stressor runs with M.2 NVMe as root FS Status in linux package in Ubuntu: Triaged Status in linux source package in Bionic: Triaged Bug description: This was reported by a hardware partner. The system set up is a server with 512GB RAM and an M.2 NVMe drive as the root filesystem/boot device. Per the customer, when running the certification Memory Stress test (utilizing several stress-ng stressors run in sequence) the system freezes with CPU Soft Lockup errors appearing on console whe the "stack" stressor is run. Tester has tried with 2.5” SATA (1TB), 2.5” NVMe (800GB), and M.2 NVMe (1.9TB). So far, this only seems to affect the 4.15 kernel. The tester has tried using the 2.5" SATA SSD as the RootFS/Boot device and the tests pass on all attempts. It is ONLY when using the M.2 NVMe as the root / boot device that the tests cause a lockup. The tester is re-trying now with the 2.5" NVMe device to see if this only occurs with the M.2 NVMe. The tester has tried this on the following while using the M.2 NVMe as the rootFS/Boot device: Test run #1 – 16.04.5 at kernel 4.15; Result: Failed stress-ng memory on stack stressor Test run #2 – 18.04.1 at kernel 4.15; Result: Failed stress-ng memory on stack stressor Test run #3 – 16.04.5 at kernel 4.4; Result: Passed stress-ng memory test The stress-ng command invoked at the time the soft lockups occur is this: 'stress-ng -k --aggressive --verify --timeout 300 --stack 0' This can be reproduced by running the memory_stress_ng test script from the cert suite: sudo /usr/lib/plainbox-provider-certification- server/bin/memory_stress_ng It may be more easily reproducible running the stack stressor alone, or the whole memory stress script without dealing with Checkbox. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1798127/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp