Public bug reported:
I am experiencing long test runs on bionic and later series on NUMA
machines (that is, on machines that have more than one CPU on the
motherboard).
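For anyone trying to reproduce this, a minimal sketch of how to check whether a machine exposes more than one NUMA node. It assumes the kernel lists nodes as nodeN directories under /sys/devices/system/node; the function name and its optional argument are hypothetical, added only for illustration.

```shell
#!/bin/sh
# Count NUMA nodes by counting nodeN directories under sysfs.
# The optional first argument (an alternate root directory) is a
# hypothetical convenience so the function can be tried on a fake tree;
# on a real system, call it with no arguments.
count_numa_nodes() {
    root="${1:-/sys/devices/system/node}"
    count=0
    for dir in "$root"/node[0-9]*; do
        # If the glob matched nothing, $dir is the literal pattern and
        # the -d check fails, so the count stays at 0.
        if [ -d "$dir" ]; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}
```

Running `count_numa_nodes` with no arguments prints the node count. Note that node count and socket count are not always the same thing: some single-socket CPUs expose several NUMA nodes, so this is only a rough proxy for "more than one CPU on the motherboard".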
I discovered this while running the MAAS stress-ng-cpu tests, which
started timing out on NUMA machines when I changed the commissioning
series to bionic (related bug: LP: #1826789).
I was running stress-ng the same way MAAS runs it:
stress-ng --aggressive -a 0 --class cpu,cpu-cache --ignite-cpu --log-brief --metrics-brief --times --tz --verify --timeout 1h
MAAS actually runs the test for 12 hours, but the issue is visible even
with shorter test runs.
The above command on xenial, regardless of NUMA/non-NUMA machine,
completes within 60 minutes.
But on bionic and later series, on NUMA machines, the test takes up to 70
minutes, and on disco (with kernel 5) it takes about 88 minutes to
complete.
On non-NUMA machines (that is, a single CPU on the motherboard), the tests
complete within 60 minutes plus 20-30 seconds, regardless of the series.
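To put a number on the overrun, a run can be wrapped in a small timing helper. This is only a sketch: the function name `run_with_overrun` is hypothetical, and it assumes `date +%s` (seconds since the epoch) is available, as it is with GNU and BSD date.

```shell
#!/bin/sh
# run_with_overrun LIMIT_SECONDS COMMAND [ARGS...]
# Runs COMMAND, then prints how many seconds of wall-clock time it took
# beyond LIMIT_SECONDS (negative if it finished early).
# The command's own stdout is sent to stderr so that the only thing on
# stdout is the overrun figure.
run_with_overrun() {
    limit="$1"
    shift
    start=$(date +%s)
    "$@" >&2
    end=$(date +%s)
    echo $(( end - start - limit ))
}
```

For the case described here it could be called as `run_with_overrun 3600 stress-ng --aggressive -a 0 --class cpu,cpu-cache --ignite-cpu --log-brief --metrics-brief --times --tz --verify --timeout 1h`. Going by the times reported above, that would print roughly 20-30 on a non-NUMA machine, up to about 600 on bionic with NUMA, and about 1680 on disco with NUMA.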
I thought this might be normal behavior on bionic and later
kernels (4.15 and above), but then I installed CentOS 7 on the test NUMA
machine, upgraded the kernel to v5.0, and compiled the latest version of
stress-ng (the same as on disco). The test completed within 60 minutes,
the same as on xenial.
** Affects: stress-ng (Ubuntu)
Importance: Undecided
Status: New
** Description changed:
- series to bionic (this is related bug: LP #1826789).
+ series to bionic (this is related bug: LP: #1826789).
--
https://bugs.launchpad.net/bugs/1826791
Title:
stress-ng cpu tests run longer than configured timeout