On Fri, Mar 15, 2024 at 01:20:49PM +0100, Cédric Le Goater wrote:
> On 3/15/24 12:01, Peter Xu wrote:
> > On Fri, Mar 15, 2024 at 11:17:45AM +0100, Cédric Le Goater wrote:
> > > > migrate_set_state is also unintuitive because it ignores invalid state
> > > > transitions and we've been using that property to deal with special
> > > > states such as POSTCOPY_PAUSED and FAILED:
> > > > 
> > > > - After the migration goes into POSTCOPY_PAUSED, the resumed migration's
> > > >     migrate_init() will try to set the state NONE->SETUP, which is not
> > > >     valid.
> > > > 
> > > > - After save_setup fails, the migration goes into FAILED, but 
> > > > wait_unplug
> > > >     will try to transition SETUP->ACTIVE, which is also not valid.
> > > > 
> > > 
> > > I am not sure I understand what the plan is. Both solutions are 
> > > problematic
> > > regarding the state transitions.
> > > 
> > > Should we consider that waiting for failover devices to unplug is an 
> > > internal
> > > step of the SETUP phase not transitioning to ACTIVE ?
> > 
> > If to unblock this series, IIUC the simplest solution is to do what Fabiano
> > suggested, that we move qemu_savevm_wait_unplug() to be before the check of
> > setup() ret.
> 
> The simplest is IMHO moving qemu_savevm_wait_unplug() before
> qemu_savevm_state_setup() and leave patch 10 is unchanged. See
> below the extra patch. It looks much cleaner than what we have
> today.

Yes it looks cleaner indeed, it's just that then we'll have one more
possible state conversions like SETUP->UNPLUG->SETUP.  I'd say it's fine,
but let's also copy Laruent and Laine if it's going to be posted formally.

Thanks,

> 
> > In that case, the state change in qemu_savevm_wait_unplug()
> > should be benign and we should see a super small window it became ACTIVE
> > but then it should be FAILED (and IIUC the patch itself will need to use
> > ACTIVE as "old_state", not SETUP anymore).
> 
> OK. I will give it a try to compare.
> 
> > For the long term, maybe we can remove the WAIT_UNPLUG state?
> 
> I hope so, it's an internal SETUP state for me.
> 
> > The only Libvirt support seems to be here:
> > 
> > commit 8a226ddb3602586a2ba2359afc4448c02f566a0e
> > Author: Laine Stump <la...@redhat.com>
> > Date:   Wed Jan 15 16:38:57 2020 -0500
> > 
> >      qemu: add wait-unplug to qemu migration status enum
> > 
> > Considering that qemu_savevm_wait_unplug() can be a noop if the device is
> > already unplugged, I think it means no upper layer app should rely on this
> > state to present.
> 
> Thanks,
> 
> C.
> 
> 
> > 
> @@ -3383,11 +3383,10 @@ bool migration_rate_limit(void)
>   * unplugged
>   */
> -static void qemu_savevm_wait_unplug(MigrationState *s, int old_state,
> -                                    int new_state)
> +static void qemu_savevm_wait_unplug(MigrationState *s, int state)
>  {
>      if (qemu_savevm_state_guest_unplug_pending()) {
> -        migrate_set_state(&s->state, old_state, 
> MIGRATION_STATUS_WAIT_UNPLUG);
> +        migrate_set_state(&s->state, state, MIGRATION_STATUS_WAIT_UNPLUG);
>          while (s->state == MIGRATION_STATUS_WAIT_UNPLUG &&
>                 qemu_savevm_state_guest_unplug_pending()) {
> @@ -3410,9 +3409,7 @@ static void qemu_savevm_wait_unplug(Migr
>              }
>          }
> -        migrate_set_state(&s->state, MIGRATION_STATUS_WAIT_UNPLUG, 
> new_state);
> -    } else {
> -        migrate_set_state(&s->state, old_state, new_state);
> +        migrate_set_state(&s->state, MIGRATION_STATUS_WAIT_UNPLUG, state);
>      }
>  }
> @@ -3469,17 +3466,19 @@ static void *migration_thread(void *opaq
>          qemu_savevm_send_colo_enable(s->to_dst_file);
>      }
> +    qemu_savevm_wait_unplug(s, MIGRATION_STATUS_SETUP);
> +
>      bql_lock();
>      qemu_savevm_state_setup(s->to_dst_file);
>      bql_unlock();
> -    qemu_savevm_wait_unplug(s, MIGRATION_STATUS_SETUP,
> -                               MIGRATION_STATUS_ACTIVE);
> -
>      s->setup_time = qemu_clock_get_ms(QEMU_CLOCK_HOST) - setup_start;
>      trace_migration_thread_setup_complete();
> +    migrate_set_state(&s->state, MIGRATION_STATUS_SETUP,
> +                      MIGRATION_STATUS_ACTIVE);
> +
>      while (migration_is_active()) {
>          if (urgent || !migration_rate_exceeded(s->to_dst_file)) {
>              MigIterateState iter_state = migration_iteration_run(s);
> @@ -3580,18 +3579,20 @@ static void *bg_migration_thread(void *o
>      ram_write_tracking_prepare();
>  #endif
> +    qemu_savevm_wait_unplug(s, MIGRATION_STATUS_SETUP);
> +
>      bql_lock();
>      qemu_savevm_state_header(s->to_dst_file);
>      qemu_savevm_state_setup(s->to_dst_file);
>      bql_unlock();
> -    qemu_savevm_wait_unplug(s, MIGRATION_STATUS_SETUP,
> -                               MIGRATION_STATUS_ACTIVE);
> -
>      s->setup_time = qemu_clock_get_ms(QEMU_CLOCK_HOST) - setup_start;
>      trace_migration_thread_setup_complete();
> +    migrate_set_state(&s->state, MIGRATION_STATUS_SETUP,
> +                      MIGRATION_STATUS_ACTIVE);
> +
>      bql_lock();
>      if (migration_stop_vm(s, RUN_STATE_PAUSED)) {
> 

-- 
Peter Xu


Reply via email to