On 3/15/24 15:52, Peter Xu wrote:
On Fri, Mar 15, 2024 at 03:21:27PM +0100, Cédric Le Goater wrote:
On 3/15/24 13:20, Cédric Le Goater wrote:
On 3/15/24 12:01, Peter Xu wrote:
On Fri, Mar 15, 2024 at 11:17:45AM +0100, Cédric Le Goater wrote:
migrate_set_state is also unintuitive because it ignores invalid state
transitions and we've been using that property to deal with special
states such as POSTCOPY_PAUSED and FAILED:

- After the migration goes into POSTCOPY_PAUSED, the resumed migration's
     migrate_init() will try to set the state NONE->SETUP, which is not
     valid.

- After save_setup fails, the migration goes into FAILED, but wait_unplug
     will try to transition SETUP->ACTIVE, which is also not valid.
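
To make the two cases above concrete, here is a minimal standalone model. It assumes the state setter behaves like a compare-and-swap on the current state, so a transition whose expected old state does not match is silently dropped. The enum, model_set_state() and the two scenario functions are hypothetical names for illustration, not QEMU code, and the bool return value is added only so the outcome can be observed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Simplified stand-ins for the migration states named in the thread. */
typedef enum {
    STATE_NONE,
    STATE_SETUP,
    STATE_ACTIVE,
    STATE_POSTCOPY_PAUSED,
    STATE_FAILED,
} MigState;

/*
 * Hypothetical model of the setter: the state changes only if it currently
 * equals old_state. Returns true if the transition was applied.
 */
static bool model_set_state(_Atomic int *state, MigState old_state,
                            MigState new_state)
{
    int expected = old_state;
    return atomic_compare_exchange_strong(state, &expected, new_state);
}

/* Case 1: a paused postcopy migration is resumed and init tries NONE->SETUP. */
static MigState model_paused_then_init(void)
{
    _Atomic int state = STATE_POSTCOPY_PAUSED;
    bool changed = model_set_state(&state, STATE_NONE, STATE_SETUP);
    assert(!changed); /* ignored: the current state is not NONE */
    return atomic_load(&state);
}

/* Case 2: setup failed (SETUP->FAILED), then wait_unplug tries SETUP->ACTIVE. */
static MigState model_failed_then_wait_unplug(void)
{
    _Atomic int state = STATE_SETUP;
    model_set_state(&state, STATE_SETUP, STATE_FAILED);
    bool changed = model_set_state(&state, STATE_SETUP, STATE_ACTIVE);
    assert(!changed); /* ignored: the current state is already FAILED */
    return atomic_load(&state);
}
```

In both scenarios the special state (POSTCOPY_PAUSED or FAILED) survives the later call, which is the property being relied on.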


I am not sure I understand what the plan is. Both solutions are problematic
regarding the state transitions.

Should we consider that waiting for failover devices to unplug is an internal
step of the SETUP phase, one that does not transition to ACTIVE?

To unblock this series, IIUC the simplest solution is to do what Fabiano
suggested: move qemu_savevm_wait_unplug() to before the check of the
setup() ret.

The simplest is IMHO moving qemu_savevm_wait_unplug() before
qemu_savevm_state_setup() and leaving patch 10 unchanged. See
below the extra patch. It looks much cleaner than what we have
today.

In that case, the state change in qemu_savevm_wait_unplug()
should be benign: we should see a very small window in which the state is
ACTIVE before it becomes FAILED (and IIUC the patch itself will need to use
ACTIVE as "old_state", not SETUP anymore).
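
A small standalone model of that ordering, as a sketch only. It assumes the setter is a compare-and-swap and that the unplug wait moves SETUP->ACTIVE as described earlier in the thread. All names here (model_set_state(), model_unplug_then_check(), the enum) are hypothetical and none of this is the actual patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Simplified stand-ins for the states discussed here. */
typedef enum {
    STATE_SETUP,
    STATE_ACTIVE,
    STATE_FAILED,
} MigState;

/* Hypothetical compare-and-swap setter: applies only if state == old_state. */
static bool model_set_state(_Atomic int *state, MigState old_state,
                            MigState new_state)
{
    int expected = old_state;
    return atomic_compare_exchange_strong(state, &expected, new_state);
}

/*
 * Models "wait for unplug first, check the setup result afterwards".
 * setup_ret is the (pretend) return value of the setup step, and
 * old_state_on_failure is the old state passed when marking the failure.
 */
static MigState model_unplug_then_check(int setup_ret,
                                        MigState old_state_on_failure)
{
    _Atomic int state = STATE_SETUP;

    /* The unplug wait completes and moves SETUP -> ACTIVE. */
    model_set_state(&state, STATE_SETUP, STATE_ACTIVE);

    /* Only now is the setup result examined. */
    if (setup_ret < 0) {
        model_set_state(&state, old_state_on_failure, STATE_FAILED);
    }
    return atomic_load(&state);
}
```

With a failing setup, passing ACTIVE as the old state ends in FAILED after a brief ACTIVE window, while passing SETUP leaves the model stuck in ACTIVE because the transition is ignored.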

OK. I will give it a try to compare.

Here's the alternative solution. SETUP state failures are handled after
transitioning to ACTIVE state, which is unfortunate but probably harmless.
I guess it's OK.

This also looks good to me, thanks.

One trivial early comment is in this case we can introduce a helper to
cover both setup() calls and UNPLUG waits and dedup the two paths.

There is one little difference: qemu_savevm_state_header() is called
earlier in the migration thread, before return-path, postcopy and colo
are advertised on the target. I don't think it can be moved.

Thanks,

C.

