** Description changed:

  This was reported by a hardware partner.  The system setup is a server
  with 512GB RAM and an M.2 NVMe drive as the root filesystem/boot device.
  
  Per the customer, when running the certification Memory Stress test
  (utilizing several stress-ng stressors run in sequence) the system
  freezes with CPU Soft Lockup errors appearing on the console when the
  "stack" stressor is run.
  
  The tester has tried with 2.5" SATA (1TB), 2.5" NVMe (800GB), and M.2
  NVMe (1.9TB) drives.
  
  So far, this only seems to affect the 4.15 kernel.  The tester has tried
  using the 2.5" SATA SSD as the RootFS/Boot device and the tests pass on
  all attempts.  It is ONLY when using the M.2 NVMe as the root / boot
  device that the tests cause a lockup.  The tester is re-trying now with
  the 2.5" NVMe device to see if this only occurs with the M.2 NVMe.
  
  The tester has tried this on the following while using the M.2 NVMe as
  the rootFS/Boot device:
  Test run #1 – 16.04.5 at kernel 4.15; Result: Failed stress-ng memory on stack stressor
  Test run #2 – 18.04.1 at kernel 4.15; Result: Failed stress-ng memory on stack stressor
  Test run #3 – 16.04.5 at kernel 4.4; Result: Passed stress-ng memory test
  
- 
- The stress-ng command invoked at the time the soft lockups occur is this:
+ The stress-ng command invoked at the time the soft lockups occur is
+ this:
  
  'stress-ng -k --aggressive --verify --timeout 300 --stack 0'
  
  This can be reproduced by running the memory_stress_ng test script from
  the cert suite:
  
  sudo /usr/lib/plainbox-provider-certification-server/bin/memory_stress_ng
  
  It may be more easily reproducible running the stack stressor alone, or
  the whole memory stress script without dealing with Checkbox.
+ 
+ UPDATE: The tester also confirms that the 2.5" NVMe drives fail with
+ the 4.15 kernel and pass with the 4.4 kernel.  The SATA SSD works on
+ all kernels.
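
For anyone trying to reproduce this without Checkbox, running the stack
stressor alone might look roughly like the sketch below.  It is not part
of the cert suite: the function names count_soft_lockups and
run_stack_stressor are hypothetical, and the match on "soft lockup"
assumes the kernel watchdog's usual "BUG: soft lockup - CPU#N stuck"
message wording.  Only the stress-ng command line is taken from this
report.

```shell
#!/bin/sh
# Hypothetical helper: count log lines that report a soft lockup.
# Reads log text on stdin, so it works on dmesg output or on a saved
# serial/console log. Assumes the watchdog message contains the
# phrase "soft lockup".
count_soft_lockups() {
    # grep -c prints 0 and exits non-zero when nothing matches;
    # "|| true" keeps the helper's exit status at 0 in that case.
    grep -c 'soft lockup' || true
}

# Hypothetical wrapper: run only the stack stressor with the options
# quoted in this report, then count soft lockup messages in the kernel
# log. Defined but not invoked here; call it by hand on the test system.
run_stack_stressor() {
    stress-ng -k --aggressive --verify --timeout 300 --stack 0
    dmesg | count_soft_lockups
}
```

If the system freezes hard, the dmesg step will never run; in that case
a console log captured from another machine can be checked instead with
"count_soft_lockups < console.log" (file name hypothetical).  Reading
dmesg may need sudo on systems that restrict kernel log access.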

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1798127

Title:
  CPU Soft Lockups when stress-ng stack stressor runs with M.2 NVMe as
  root FS

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1798127/+subscriptions

-- 
ubuntu-bugs mailing list
[email protected]
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
