From: Juraj Marcin <jmar...@redhat.com>

Commit 48814111366b ("migration: Always set DEVICE state") introduced
DEVICE state to postcopy, which moved the actual state transition that
leads to POSTCOPY_ACTIVE.

However, the error handling part of the postcopy_start() function still
expects the state POSTCOPY_ACTIVE, but depending on where an error
happens, now the state can be either ACTIVE, DEVICE or CANCELLING, but
never POSTCOPY_ACTIVE, as this transition now happens just before a
successful return from the function.

Instead, accept any state except CANCELLING when transitioning to FAILED
state.

Cc: qemu-sta...@nongnu.org
Fixes: 48814111366b ("migration: Always set DEVICE state")
Signed-off-by: Juraj Marcin <jmar...@redhat.com>

---
In the RFC[1] where this patch was discussed, there was also a
suggestion for a helper function migrate_set_failure() that would check
if the state is not CANCELLING and then set migration error and FAILED
state. I discussed the implementation with Peter, and we came to a
conclusion that instead of patching such clean-up on top of the current
error handling code, it might be more useful to do a larger refactor and
clean-up of all error handling in the migration code.

Such clean-up should reduce the number of places where we need to
explicitly transition to a FAILED state (ideally to one, or only a
couple of places), and instead only set an appropriate migration error
using migrate_set_error(). Additionally, it would also refactor
inappropriate uses of QEMUFile errors where the error is not really an
error of the underlying channel and migrate_set_error() should be used
instead.

[1]: https://lore.kernel.org/all/20250807114922.1013286-3-jmar...@redhat.com/
---
 migration/migration.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 10c216d25d..32b8ce5613 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -2872,8 +2872,9 @@ static int postcopy_start(MigrationState *ms, Error 
**errp)
 fail_closefb:
     qemu_fclose(fb);
 fail:
-    migrate_set_state(&ms->state, MIGRATION_STATUS_POSTCOPY_ACTIVE,
-                          MIGRATION_STATUS_FAILED);
+    if (ms->state != MIGRATION_STATUS_CANCELLING) {
+        migrate_set_state(&ms->state, ms->state, MIGRATION_STATUS_FAILED);
+    }
     migration_block_activate(NULL);
     migration_call_notifiers(ms, MIG_EVENT_PRECOPY_FAILED, NULL);
     bql_unlock();
-- 
2.50.1


Reply via email to