** Description changed:
+ [Impact]
+
+ * The numad code never considered that node IDs might not be sequential,
+ which leads to an out-of-bounds array access.
+
+ * Fix the array index usage so that sparse node IDs no longer cause that
+ out-of-bounds access (see the sketch below).
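+
+ A minimal sketch of the failure mode and one way to avoid it. Names and
sizes are purely illustrative; this is not the actual numad source or patch:

  /* sparse_nodes.c - out-of-bounds pattern and a fixed variant (illustrative). */
  #include <stdio.h>

  #define MAX_NODE_ID 256                      /* assumed upper bound for node IDs */

  int main(void)
  {
      /* Node IDs as they can appear on ppc64 after cpu offlining: sparse. */
      int online_ids[] = { 0, 8, 252 };
      int num_nodes = sizeof(online_ids) / sizeof(online_ids[0]);
      long mem_free[3] = { 1024, 2048, 512 };  /* per-node data, sized by node COUNT */

      /* Buggy pattern: index per-node arrays by the raw node ID;
       * mem_free[252] is far outside the 3-element array -> corruption/crash. */

      /* Fixed pattern: translate node IDs to dense array indexes first. */
      int id_to_index[MAX_NODE_ID];
      for (int i = 0; i < MAX_NODE_ID; i++)
          id_to_index[i] = -1;                 /* -1 marks "node ID not online" */
      for (int i = 0; i < num_nodes; i++)
          id_to_index[online_ids[i]] = i;

      for (int i = 0; i < num_nodes; i++) {
          int id = online_ids[i];
          printf("node %d -> index %d, mem_free %ld\n",
                 id, id_to_index[id], mem_free[id_to_index[id]]);
      }
      return 0;
  }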
+
+ [Test Case]
+
+ 0. The most important and least available ingredient for this issue is sparse
NUMA nodes. Usually on your laptop you just have one, and on a typical x86 server
you might have more, but they are numbered 0,1,2,...
+ On powerpc people commonly disable SMT (as that was a KVM requirement up to
POWER8). This (or other cpu offlining) can leave only sparse node IDs online, e.g.
+ 1,16,30. Only with a setup like that can you follow the steps below and
trigger the case (a small check of the node layout is sketched after the steps).
+ 1. installed numad
+ 2. started the numad service and verified it runs fine
+ 3. I spawned two guests with 20 cores and 50G each (since there was no
particular guest config mentioned, I didn't configure anything special)
+ I used uvtool to get the latest cloud image
+ 4. cloned stressapptest from git [1] in the guests
+ and installed build-essential
+ (my guests are Bionic, which didn't have stressapptest packaged yet)
+ Built and installed the tool
+ 5. ran the stress in both guests as mentioned
+ $ stressapptest -s 200
+ => This will trigger the crash
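+
+ To confirm the sparse layout required by step 0, running 'numactl --hardware'
shows the online node IDs; the small libnuma sketch below (a hypothetical
stand-alone helper, not part of numad) prints the same information. Build with:
gcc node_layout.c -lnuma

  /* node_layout.c - print which NUMA node IDs are online. */
  #include <stdio.h>
  #include <numa.h>

  int main(void)
  {
      if (numa_available() < 0) {
          fprintf(stderr, "NUMA not available on this system\n");
          return 1;
      }
      printf("online node IDs:");
      for (int id = 0; id <= numa_max_node(); id++)
          if (numa_bitmask_isbitset(numa_all_nodes_ptr, id))
              printf(" %d", id);
      printf("\n");
      return 0;
  }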
+
+
+ [Regression Potential]
+
+ * Without the fix numad is severely broken on systems with sparse NUMA
+ nodes. I imagine you can (with some effort or bad luck) also create
+ such a case on x86; it is not ppc64-specific in general.
+ The code before the fix only works by accident when cpu ID ~= node ID.
+
+ * The most likely potential regression would be triggering issues when
+ parsing these arrays on systems that formerly ran fine because they are
+ not affected by the sparse node issue. But for non-sparse systems not a
+ lot should change: the new code will, for example, find cpu=1 mapped to
+ node=1 instead of just assuming cpu=1 IS node=1 (see the sketch below).
+ Therefore I hope for no regression, but that is the one I'd expect if
+ any.
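+
+ A minimal sketch of the cpu-to-node lookup described above, using libnuma
(a hypothetical stand-alone example, not the numad patch itself). Build with:
gcc cpu_to_node.c -lnuma

  /* cpu_to_node.c - look up which node a cpu actually belongs to instead of
   * assuming cpu N lives on node N. */
  #include <stdio.h>
  #include <numa.h>

  int main(void)
  {
      if (numa_available() < 0)
          return 1;
      for (int cpu = 0; cpu < numa_num_configured_cpus(); cpu++) {
          int node = numa_node_of_cpu(cpu);
          if (node >= 0)                       /* -1 means offline/unknown cpu */
              printf("cpu %d -> node %d\n", cpu, node);
      }
      return 0;
  }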
+
+
+ [Other Info]
+
+ * I have submitted this upstream, but upstream seems somewhat dead :-/
+ * don't go crazy when reading the code: nodeid and cpuid are used
+ somewhat interchangeably, which might make you go nuts at first (it
+ did for me); but I kept the upstream names as-is to keep the patch small.
+
+ ----
+
+
== Comment: #0 - SRIKANTH AITHAL <[email protected]> - 2019-02-20
23:42:23 ==
---Problem Description---
While running KVM guests, we are observing numad crashes on the host.
-
- Contact Information = srikanth/[email protected]
-
+
+ Contact Information = srikanth/[email protected]
+
---uname output---
Linux ltcgen6 4.15.0-1016-ibm-gt #18-Ubuntu SMP Thu Feb 7 16:58:31 UTC 2019
ppc64le ppc64le ppc64le GNU/Linux
-
- Machine Type = witherspoon
-
+
+ Machine Type = witherspoon
+
---Debugger---
A debugger is not configured
-
+
---Steps to Reproduce---
- 1. check status of numad, if stopped start it
+ 1. check status of numad, if stopped start it
2. start a kvm guest
3. Run some memory tests inside guest
On the host after a few minutes we see numad crashing. I had enabled debug
logging for numad; the messages below appear in numad.log before it crashes:
8870669: PID 88781: (qemu-system-ppc), Threads 6, MBs_size 15871, MBs_used
11262, CPUs_used 400, Magnitude 4504800, Nodes: 0,8
Thu Feb 21 00:12:10 2019: PICK NODES FOR: PID: 88781, CPUs 470, MBs 18671
Thu Feb 21 00:12:10 2019: PROCESS_MBs[0]: 9201
Thu Feb 21 00:12:10 2019: Node[0]: mem: 0 cpu: 6
Thu Feb 21 00:12:10 2019: Node[1]: mem: 0 cpu: 6
Thu Feb 21 00:12:10 2019: Node[2]: mem: 1878026 cpu: 4666
Thu Feb 21 00:12:10 2019: Node[3]: mem: 0 cpu: 6
Thu Feb 21 00:12:10 2019: Node[4]: mem: 0 cpu: 6
Thu Feb 21 00:12:10 2019: Node[5]: mem: 2194058 cpu: 4728
Thu Feb 21 00:12:10 2019: Totmag[0]: 94112134
Thu Feb 21 00:12:10 2019: Totmag[1]: 109211855
Thu Feb 21 00:12:10 2019: Totmag[2]: 2990058
Thu Feb 21 00:12:10 2019: Totmag[3]: 2990058
Thu Feb 21 00:12:10 2019: Totmag[4]: 2990058
Thu Feb 21 00:12:10 2019: Totmag[5]: 2990058
Thu Feb 21 00:12:10 2019: best_node_ix: 1
Thu Feb 21 00:12:10 2019: Node: 8 Dist: 10 Magnitude: 10373506224
Thu Feb 21 00:12:10 2019: Node: 0 Dist: 40 Magnitude: 8762869316
Thu Feb 21 00:12:10 2019: Node: 253 Dist: 80 Magnitude: 0
Thu Feb 21 00:12:10 2019: Node: 254 Dist: 80 Magnitude: 0
Thu Feb 21 00:12:10 2019: Node: 252 Dist: 80 Magnitude: 0
Thu Feb 21 00:12:10 2019: Node: 255 Dist: 80 Magnitude: 0
Thu Feb 21 00:12:10 2019: MBs: 18671, CPUs: 470
Thu Feb 21 00:12:10 2019: Assigning resources from node 5
Thu Feb 21 00:12:10 2019: Node[0]: mem: 2007348 cpu: 1908
Thu Feb 21 00:12:10 2019: MBs: 0, CPUs: 0
Thu Feb 21 00:12:10 2019: Assigning resources from node 2
Thu Feb 21 00:12:10 2019: Process 88781 already 100 percent localized to
target nodes.
-
On syslog we see sig 11:
[88726.086144] numad[88879]: unhandled signal 11 at 000000e38fe72688 nip
0000782ce4dcac20 lr 0000782ce4dcf85c code 1
+ Stack trace output:
+ no
-
- Stack trace output:
- no
-
Oops output:
- no
-
+ no
+
System Dump Info:
- The system was configured to capture a dump, however a dump was not
produced.
-
- *Additional Instructions for srikanth/[email protected]:
+ The system was configured to capture a dump, however a dump was not
produced.
+
+ *Additional Instructions for srikanth/[email protected]:
-Attach sysctl -a output to the bug.
== Comment: #2 - SRIKANTH AITHAL <[email protected]> - 2019-02-20
23:44:38 ==
-
== Comment: #3 - SRIKANTH AITHAL <[email protected]> - 2019-02-20
23:48:20 ==
I was using stressapptest to run a memory workload inside the guest:
`stressapptest -s 200`
== Comment: #5 - Brian J. King <[email protected]> - 2019-03-08 09:17:29 ==
Any update on this?
== Comment: #6 - Leonardo Bras Soares Passos <[email protected]> - 2019-03-08
11:59:16 ==
Yes, I have been working on this for a while.
- After a suggestion of @lagarcia, I tested the bug on the same machine, booted
on default kernel (4.15.0-45-generic) and also booted the vm with the same
generic kernel.
+ After a suggestion of @lagarcia, I tested the bug on the same machine, booted
on default kernel (4.15.0-45-generic) and also booted the vm with the same
generic kernel.
Results are that the bug also happens with 4.15.0-45-generic. So, it may not
be a problem of the changes included on kernel 4.15.0-1016.18-fix1-ibm-gt.
A few things I noticed, that may be interesting to solve this bug:
- I had a very hard time reproducing the bug with the numad instance that
started on boot. If I restart or stop/start numad, the bug reproduces much
more easily.
- I debugged numad using gdb and found out it is getting a segfault in
_int_malloc(), from glibc.
Attached is an occurrence of the bug, while numad was on gdb.
(systemctl start numad ; gdb /usr/bin/numad $NUMAD_PID)
== Comment: #7 - Leonardo Bras Soares Passos <[email protected]> -
2019-03-08 12:00:00 ==
-
== Comment: #8 - Leonardo Bras Soares Passos <[email protected]> - 2019-03-11
17:04:25 ==
I reverted the whole system to vanilla Ubuntu Bionic, and booted on
4.15.0-45-generic kernel.
Linux ltcgen6 4.15.0-45-generic #48-Ubuntu SMP Tue Jan 29 16:27:02 UTC 2019
ppc64le ppc64le ppc64le GNU/Linux
Then I booted the guest, also on 4.15.0-45-generic.
Linux ubuntu 4.15.0-45-generic #48-Ubuntu SMP Tue Jan 29 16:27:02 UTC 2019
ppc64le ppc64le ppc64le GNU/Linux
I tried to reproduce the error, and I was able to.
It probably means this bug was not introduced by the changes to qemu/the kernel,
and that it is present in the current Ubuntu archive.
The next step should be a deeper debug of numad, in order to identify
why it is getting a segfault.
** Changed in: numad (Ubuntu Bionic)
Status: New => In Progress
** Changed in: numad (Ubuntu Disco)
Status: New => In Progress
https://bugs.launchpad.net/bugs/1832915
Title:
numad crashes while running kvm guest
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-power-systems/+bug/1832915/+subscriptions