** Description changed:
SRU Justification:
[ Impact ]
- * While running a (nested) KVM guest on Power 10 (with PowerVM)
- and performing a CPU hotplug, trying to set to 68 vCPUs,
- the KVM guest crashes.
+ * While running a (nested) KVM guest on Power 10 (with PowerVM)
+ and performing a CPU hotplug, trying to set to 68 vCPUs,
+ the KVM guest crashes.
- * In the failure case the KVM guest has maxvcpus 128,
- and it starts fine with an initial value of 4 vCPUs,
- but fails after a larger increase (here to 68 vCPUs).
+ * In the failure case the KVM guest has maxvcpus 128,
+ and it starts fine with an initial value of 4 vCPUs,
+ but fails after a larger increase (here to 68 vCPUs).
- * The error reported is:
- [ 662.102542] KVM: Create Guest vcpu hcall failed, rc=-44
- error: Unable to read from monitor: Connection reset by peer
+ * The error reported is:
+ [ 662.102542] KVM: Create Guest vcpu hcall failed, rc=-44
+ error: Unable to read from monitor: Connection reset by peer
- * This seems to happen especially on memory-constrained systems.
+ * This seems to happen especially on memory-constrained systems.
- * This can be avoided by pre-creating and parking the vCPUs and
- returning an error if that fails, which then leads to a graceful error
- in case of a vCPU hotplug failure, while the guest keeps running.
+ * This can be avoided by pre-creating and parking the vCPUs and
+ returning an error if that fails, which then leads to a graceful error
+ in case of a vCPU hotplug failure, while the guest keeps running.
[ Fix ]
- * 08c3286822 ("accel/kvm: Extract common KVM vCPU {creation,parking}
+ * 08c3286822 ("accel/kvm: Extract common KVM vCPU {creation,parking}
code") [pre-req]
- * c6a3d7bc9e ("accel/kvm: Introduce kvm_create_and_park_vcpu() helper")
+ * c6a3d7bc9e ("accel/kvm: Introduce kvm_create_and_park_vcpu() helper")
- * 18530e7c57 ("cpu-common.c: export cpu_get_free_index to be reused
+ * 18530e7c57 ("cpu-common.c: export cpu_get_free_index to be reused
later")
- * cfb52d07f5 ("target/ppc: handle vcpu hotplug failure gracefully")
+ * cfb52d07f5 ("target/ppc: handle vcpu hotplug failure gracefully")
[ Test Plan ]
- * Set up an IBM Power10 system (with firmware FW1060 or newer,
- that comes with nested KVM support), running Ubuntu Server 24.04.
+ * Set up an IBM Power10 system (with firmware FW1060 or newer,
+ that comes with nested KVM support), running Ubuntu Server 24.04.
- * Install and configure KVM on this system with a (higher)
- maxvcpus value of 128, but have a (smaller) initial value of 4 vCPUs.
- $ virsh define ubu2404.xml
+ * Install and configure KVM on this system with a (higher)
+ maxvcpus value of 128, but have a (smaller) initial value of 4 vCPUs.
+ $ virsh define ubu2404.xml
+ (https://launchpadlibrarian.net/748483993/check.xml)
- * Now after successful definition, start the VM:
- $ virsh start ubu2404 --console
+ * Now after successful definition, start the VM:
+ $ virsh start ubu2404 --console
- * Once the VM is up and running, increase the vCPUs to a larger value,
- here 68:
- $ virsh setvcpus ubu2404 68
+ * Once the VM is up and running, increase the vCPUs to a larger value,
+ here 68:
+ $ virsh setvcpus ubu2404 68
- * A system with an unpatched qemu will crash, showing:
- [ 662.102542] KVM: Create Guest vcpu hcall failed, rc=-44
- error: Unable to read from monitor: Connection reset by peer
+ * A system with an unpatched qemu will crash, showing:
+ [ 662.102542] KVM: Create Guest vcpu hcall failed, rc=-44
+ error: Unable to read from monitor: Connection reset by peer
- * A patched environment will:
- - either just successfully hotplug the new amount (68) of vCPUs
- without further messages
- - or (if memory is very constrained) print a (graceful) error
- message that hotplug couldn't be performed,
- but stay up and running:
- error: internal error: unable to execute QEMU command 'device_add': \
- kvmppc_cpu_realize: vcpu hotplug failed with -12
+ * A patched environment will:
+ - either just successfully hotplug the new amount (68) of vCPUs
+ without further messages
+ - or (if memory is very constrained) print a (graceful) error
+ message that hotplug couldn't be performed,
+ but stay up and running:
+ error: internal error: unable to execute QEMU command 'device_add': \
+ kvmppc_cpu_realize: vcpu hotplug failed with -12
- * Since certain firmware is required, IBM is doing the test and validation
- (and already successfully verified based on the PPA test builds).
+ * Since certain firmware is required, IBM is doing the test and validation
+ (and already successfully verified based on the PPA test builds).
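
As a rough aid for the test plan above, the three expected outcomes (hard crash, graceful failure, success) can be told apart by the messages quoted there. The following is only a minimal sketch: the function name classify_hotplug_result and its result labels are made up for illustration, and it assumes the output of the setvcpus call has been captured as text.

```shell
#!/bin/sh
# Hypothetical helper (not part of qemu or libvirt): classify the captured
# output of a vCPU hotplug attempt using the messages quoted in the test plan.
classify_hotplug_result() {
    case "$1" in
        *"Connection reset by peer"*)
            # Unpatched qemu: the monitor connection is lost, the guest is gone.
            echo "crashed" ;;
        *"vcpu hotplug failed with"*)
            # Patched qemu under memory pressure: error reported, guest keeps running.
            echo "graceful-failure" ;;
        *)
            # No known error message: assume the hotplug went through.
            echo "ok" ;;
    esac
}

# Possible usage on the host (domain name taken from the test plan):
#   out=$(virsh setvcpus ubu2404 68 2>&1)
#   classify_hotplug_result "$out"
#   virsh domstate ubu2404   # should still report "running" on a patched system
```

The helper only matches strings, so the "ok" result should still be confirmed by checking the guest state and the vCPU count.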
[ Where problems could occur ]
- * All modifications were made in target/ppc/kvm.c
- and are therefore limited to the IBM Power platform,
- and will not affect other architectures.
+ * All modifications were made in target/ppc/kvm.c
+ and are therefore limited to the IBM Power platform,
+ and will not affect other architectures.
- * The implementation of the pre-creation of vCPUs (init cpu_target_realize)
- may lead to early failures when a user doesn't expect to have such an
- amount of vCPUs yet.
+ * The implementation of the pre-creation of vCPUs (init cpu_target_realize)
+ may lead to early failures when a user doesn't expect to have such an
+ amount of vCPUs yet.
- * And the pre-creation and especially parking (kvm_create_and_park_vcpu)
- will probably consume more resources than before.
+ * And the pre-creation and especially parking (kvm_create_and_park_vcpu)
+ will probably consume more resources than before.
- * Hence a patched system might run with a reduced maximum number of vCPUs,
- but it will no longer crash hard and instead fails gracefully on lack of resources.
+ * Hence a patched system might run with a reduced maximum number of vCPUs,
+ but it will no longer crash hard and instead fails gracefully on lack of resources.
- * This case and the patch(es) are also discussed in more detail here:
-
https://lore.kernel.org/qemu-devel/[email protected]/T/#t
- and here:
- https://bugzilla.redhat.com/show_bug.cgi?id=2304078
+ * This case and the patch(es) are also discussed in more detail here:
+
https://lore.kernel.org/qemu-devel/[email protected]/T/#t
+ and here:
+ https://bugzilla.redhat.com/show_bug.cgi?id=2304078
[ Other Info ]
- * The code was accepted upstream with qemu v9.1.0(-rc0)
- and has been uploaded to oracular,
- so now only noble is affected.
+ * The code was accepted upstream with qemu v9.1.0(-rc0)
+ and has been uploaded to oracular,
+ so now only noble is affected.
- * Ubuntu releases older than noble are not affected,
- since (nested) KVM virtualization on P10
- was introduced starting with noble.
+ * Ubuntu releases older than noble are not affected,
+ since (nested) KVM virtualization on P10
+ was introduced starting with noble.
__________
== Comment: #0 - SEETEENA THOUFEEK <[email protected]> - 2024-08-12
03:47:06 ==
+++ This bug was initially created as a clone of Bug #205620 +++
---Problem Description---
cpu hotplug crashes the guest!
---Steps to Reproduce---
I have been trying CPU hotplug on a guest with maxvcpus set to 128
and a current value of 4, but when I try to hotplug 68 vcpus to the
guest, it crashes and we get the error message:
[ 303.808494] KVM: Create Guest vcpu hcall failed, rc=-44
error: Unable to read from monitor: Connection reset by peer
Steps to reproduce:
1) virsh define bug.xml
2) virsh start Fedora39 --console
3) virsh setvcpus Fedora39 68
Output :
[ 662.102542] KVM: Create Guest vcpu hcall failed, rc=-44
error: Unable to read from monitor: Connection reset by peer
If resources are insufficient, I think it should fail gracefully!
Attaching the XML file that I used. I will also post the observations from the
MDC system, where I saw this same failure at a higher number.
Fixed with upstream commit
https://github.com/qemu/qemu/commit/cfb52d07f53aa916003d43f69c945c2b42bc6374
Machine Type = na
---Debugger---
A debugger is not configured
Contact Information = [email protected]
---uname output---
NA
--
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2076587
Title:
cpu hotplug crashes the guest!