On Tue, Aug 26, 2025 at 01:51:40PM +0200, Juraj Marcin wrote:
> From: Juraj Marcin <jmar...@redhat.com>
> 
> Commit 48814111366b ("migration: Always set DEVICE state") introduced
> DEVICE state to postcopy, which moved the actual state transition that
> leads to POSTCOPY_ACTIVE.
> 
> However, the error handling part of the postcopy_start() function still
> expects the state POSTCOPY_ACTIVE, but depending on where an error
> happens, now the state can be either ACTIVE, DEVICE or CANCELLING, but
> never POSTCOPY_ACTIVE, as this transition now happens just before a
> successful return from the function.
> 
> Instead, accept any state except CANCELLING when transitioning to FAILED
> state.
> 
> Cc: qemu-sta...@nongnu.org
> Fixes: 48814111366b ("migration: Always set DEVICE state")
> Signed-off-by: Juraj Marcin <jmar...@redhat.com>

Thanks, Juraj!

Reviewed-by: Peter Xu <pet...@redhat.com>

> 
> ---
> In the RFC[1] where this patch was discussed, there was also a
> suggestion for a helper function migrate_set_failure() that would check
> if the state is not CANCELLING and then set migration error and FAILED
> state. I discussed the implementation with Peter, and we came to a
> conclusion that instead of patching such clean-up on top of the current
> error handling code, it might be more useful to do a larger refactor and
> clean-up of all error handling in the migration code.
> 
> Such clean-up should reduce the number of places where we need to
> explicitly transition to a FAILED state (ideally to one, or only a
> couple of places), and instead only set an appropriate migration error
> using migrate_set_error(). Additionally, it would also refactor
> inappropriate uses of QEMUFile errors where the error is not really an
> error of the underlying channel and migrate_set_error() should be used
> instead.

Fabiano: we discussed something around the FAILED status before as well.
If you started working on something in this area, please shoot!

> 
> [1]: https://lore.kernel.org/all/20250807114922.1013286-3-jmar...@redhat.com/
> ---
>  migration/migration.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/migration/migration.c b/migration/migration.c
> index 10c216d25d..32b8ce5613 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -2872,8 +2872,9 @@ static int postcopy_start(MigrationState *ms, Error 
> **errp)
>  fail_closefb:
>      qemu_fclose(fb);
>  fail:
> -    migrate_set_state(&ms->state, MIGRATION_STATUS_POSTCOPY_ACTIVE,
> -                          MIGRATION_STATUS_FAILED);
> +    if (ms->state != MIGRATION_STATUS_CANCELLING) {
> +        migrate_set_state(&ms->state, ms->state, MIGRATION_STATUS_FAILED);
> +    }
>      migration_block_activate(NULL);
>      migration_call_notifiers(ms, MIG_EVENT_PRECOPY_FAILED, NULL);
>      bql_unlock();
> -- 
> 2.50.1
> 

-- 
Peter Xu


Reply via email to