On 20/08/2025 17.07, Fabiano Rosas wrote:
Thomas Huth <th...@redhat.com> writes:

On 20/08/2025 00.39, Fabiano Rosas wrote:
The commit referenced below (from QEMU 10.0) has changed the way the pseries
machine marks a cpu as quiesced. Previously, the cpu->halted value
from QEMU common cpu code was (incorrectly) used. With the fix, the
env->quiesced variable starts being used, which improves on the
original situation, but also causes a side effect after migration:

env->quiesced is set at reset and never migrated, which causes the
destination QEMU to stop delivering interrupts and hang the machine.

To fix the issue from this point on, start migrating the env->quiesced
value.

For QEMU versions < 10.0, sending the new element on the stream would
cause migration to be aborted, so add the appropriate compatibility
property to omit the new subsection.

Independently of this patch, all migrations from QEMU versions < 10.0
will result in a hang since the older QEMU never migrates
env->quiesced. This is bad because it leaves machines already running
on the old QEMU without a migration path into newer versions.

As a workaround, clear env->quiesced in the new QEMU whenever
cpu->halted is also clear. This assumes rtas_stop_self() always sets
both flags at the same time. Migrations during secondary CPU bring-up
(i.e. before rtas-start-cpu) will still cause a hang, but those are
early enough that requiring a reboot would not be unreasonable.

Note that this was tested with -cpu power9 and -machine ic-mode=xive
due to another bug affecting migration of XICS guests. Tested both
forward and backward migration and savevm/loadvm from 9.2 and 10.0.

Reported-by: Fabian Vogt <fv...@suse.de>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/3079
Fixes: fb802acdc8b ("ppc/spapr: Fix RTAS stopped state")
Signed-off-by: Fabiano Rosas <faro...@suse.de>
---
The choice of PowerPCCPU to hold the compat property is dubious. This
only affects pseries, but it seems like a layering violation to access
SpaprMachine from target/ppc/, suggestions welcome.
---
   hw/core/machine.c     |  1 +
   target/ppc/cpu.h      |  1 +
   target/ppc/cpu_init.c |  7 +++++++
   target/ppc/machine.c  | 40 ++++++++++++++++++++++++++++++++++++++++
   4 files changed, 49 insertions(+)

diff --git a/hw/core/machine.c b/hw/core/machine.c
index bd47527479..ea83c0876b 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -42,6 +42,7 @@ GlobalProperty hw_compat_10_0[] = {
       { "vfio-pci", "x-migration-load-config-after-iter", "off" },
       { "ramfb", "use-legacy-x86-rom", "true"},
       { "vfio-pci-nohotplug", "use-legacy-x86-rom", "true" },
+    { "powerpc64-cpu", "rtas-stopped-state", "false" },

This is specific to ppc, so it should not go into the generic hw_compat_* array.


So arm-cpu in hw_compat_9_2 should not be there?

Right, this should get moved to the code in hw/arm/virt.c.

Same for arm-cpu in hw_compat_9_0 and for arm-gicv3-common in hw_compat_7_0.

 Thomas

