On 11/30/2023 6:06 PM, Peter Xu wrote:
> On Thu, Nov 30, 2023 at 01:37:18PM -0800, Steve Sistare wrote:
>> If the outgoing machine was previously suspended, propagate that to the
>> incoming side via global_state, so a subsequent vm_start restores the
>> suspended state.  To maintain backward and forward compatibility, define
>> the new field in a zero'd hole in the GlobalState struct.
>>
>> Signed-off-by: Steve Sistare <steven.sist...@oracle.com>
>> ---
>>  migration/global_state.c | 10 ++++++++++
>>  1 file changed, 10 insertions(+)
>>
>> diff --git a/migration/global_state.c b/migration/global_state.c
>> index 4e2a9d8..de2532c 100644
>> --- a/migration/global_state.c
>> +++ b/migration/global_state.c
>> @@ -25,6 +25,7 @@ typedef struct {
>>      uint8_t runstate[100];
>>      RunState state;
>>      bool received;
>> +    bool vm_was_suspended;
>>  } GlobalState;
>>  
>>  static GlobalState global_state;
>> @@ -35,6 +36,7 @@ static void global_state_do_store(RunState state)
>>      assert(strlen(state_str) < sizeof(global_state.runstate));
>>      strpadcpy((char *)global_state.runstate, sizeof(global_state.runstate),
>>                state_str, '\0');
>> +    global_state.vm_was_suspended = vm_get_suspended();
>>  }
>>  
>>  void global_state_store(void)
>> @@ -68,6 +70,12 @@ static bool global_state_needed(void *opaque)
>>          return true;
>>      }
>>  
>> +    /* If the suspended state must be remembered, it is needed */
>> +
>> +    if (vm_get_suspended()) {
>> +        return true;
>> +    }
>> +
>>      /* If state is running or paused, it is not needed */
>>  
>>      if (strcmp(runstate, "running") == 0 ||
>> @@ -109,6 +117,7 @@ static int global_state_post_load(void *opaque, int 
>> version_id)
>>          return -EINVAL;
>>      }
>>      s->state = r;
>> +    vm_set_suspended(s->vm_was_suspended || r == RUN_STATE_SUSPENDED);
> 
> IIUC current vm_was_suspended (based on my read of your patch) was not the
> same as a boolean representing "whether VM is suspended", but only a
> temporary field to remember that for a VM stop request.  To be explicit, I
> didn't see this flag set in qemu_system_suspend() in your previous patch.
> 
> If so, we can already do:
> 
>   vm_set_suspended(s->vm_was_suspended);
> 
> Irrelevant of RUN_STATE_SUSPENDED?

We need both terms of the expression.

If the vm *is* suspended (RUN_STATE_SUSPENDED), then vm_was_suspended = false.
We call global_state_store prior to vm_stop_force_state, so the incoming
side sees s->state = RUN_STATE_SUSPENDED and s->vm_was_suspended = false.
However, the runstate is RUN_STATE_INMIGRATE.  When incoming finishes by
calling vm_start, we need to restore the suspended state.  Thus in 
global_state_post_load, we must set vm_was_suspended = true.

If the vm *was* suspended, but is currently stopped (eg RUN_STATE_PAUSED),
then vm_was_suspended = true.  Migration from that state sets
vm_was_suspended = s->vm_was_suspended = true in global_state_post_load and 
ends with runstate_set(RUN_STATE_PAUSED).

I will add a comment here in the code.
 
>>      return 0;
>>  }
>> @@ -134,6 +143,7 @@ static const VMStateDescription vmstate_globalstate = {
>>      .fields = (VMStateField[]) {
>>          VMSTATE_UINT32(size, GlobalState),
>>          VMSTATE_BUFFER(runstate, GlobalState),
>> +        VMSTATE_BOOL(vm_was_suspended, GlobalState),
>>          VMSTATE_END_OF_LIST()
>>      },
>>  };
> 
> I think this will break migration between old/new, unfortunately.  And
> since the global state exist mostly for every VM, all VM setup should be
> affected, and over all archs.

Thanks, I keep forgetting that my binary tricks are no good here.  However,
I have one other trick up my sleeve, which is to store vm_was_running in
global_state.runstate[strlen(runstate) + 2].  It is forwards and backwards
compatible, since that byte is always 0 in older qemu.  It can be implemented
with a few lines of code change confined to global_state.c, versus many lines 
spread across files to do it the conventional way using a compat property and
a subsection.  Sound OK?  

> We used to have the version_id field right above for adding fields, but I
> _think_ that will still break backward migration fron new->old binary, so
> not wanted.  Juan can keep me honest.
> 
> The best thing is still machine compat properties, afaict, to fix.  It's
> slightly involved, but let me attach a sample diff for you (at the end,
> possibly working with your current patch kind-of squashed, but not ever
> tested), hopefully make it slightly easier.
> 
> I'm wondering how bad it is to just ignore it, it's not as bad as if we
> don't fix stop-during-suspend, in this case the worst case of forgetting
> this field over migration is: if VM stopped (and used to be suspended) then
> after migration it'll keep being stopped, however after "cont" it'll forget
> the suspended state.  

Cont changes state to running, but the VM is broken because wake was never 
called.

> Not that bad!  IIUC SPR should always migrate with
> suspended (rather than any fully stopped state), right?  Then shouldn't be
> affected.  If risk is low, maybe we can leave this one for later?
> 
> Thanks,

Thanks for the patch.  I am going to save this as a template for adding 
compatible
subsections in future work.

- Steve

> ===8<===
> 
> diff --git a/migration/migration.h b/migration/migration.h
> index cf2c9c88e0..c3fd1f8347 100644
> --- a/migration/migration.h
> +++ b/migration/migration.h
> @@ -470,6 +470,8 @@ struct MigrationState {
>      bool switchover_acked;
>      /* Is this a rdma migration */
>      bool rdma_migration;
> +    /* Whether remember global vm_was_suspended field? */
> +    bool store_vm_was_suspended;
>  };
>  
>  void migrate_set_state(int *state, int old_state, int new_state);
> diff --git a/hw/core/machine.c b/hw/core/machine.c
> index 0c17398141..365e01c1c9 100644
> --- a/hw/core/machine.c
> +++ b/hw/core/machine.c
> @@ -37,6 +37,7 @@ GlobalProperty hw_compat_8_1[] = {
>      { "ramfb", "x-migrate", "off" },
>      { "vfio-pci-nohotplug", "x-ramfb-migrate", "off" },
>      { "igb", "x-pcie-flr-init", "off" },
> +    { "migration", "store-vm-was-suspended", false },
>  };
>  const size_t hw_compat_8_1_len = G_N_ELEMENTS(hw_compat_8_1);
>  
> diff --git a/migration/global_state.c b/migration/global_state.c
> index 4e2a9d8ec0..ffa7bf82ca 100644
> --- a/migration/global_state.c
> +++ b/migration/global_state.c
> @@ -25,6 +25,7 @@ typedef struct {
>      uint8_t runstate[100];
>      RunState state;
>      bool received;
> +    bool vm_was_suspended;
>  } GlobalState;
>  
>  static GlobalState global_state;
> @@ -124,6 +125,25 @@ static int global_state_pre_save(void *opaque)
>      return 0;
>  }
>  
> +static bool global_state_has_vm_was_suspended(void *opaque)
> +{
> +    return migrate_get_current()->store_vm_was_suspended;
> +}
> +
> +static const VMStateDescription vmstate_vm_was_suspended = {
> +    .name = "globalstate/vm_was_suspended",
> +    .version_id = 1,
> +    .minimum_version_id = 1,
> +    /* TODO: Fill these in */
> +    .pre_save = NULL,
> +    .post_load = NULL,
> +    .needed = global_state_has_vm_was_suspended,
> +    .fields = (VMStateField[]) {
> +        VMSTATE_BOOL(vm_was_suspended, GlobalState),
> +        VMSTATE_END_OF_LIST()
> +    },
> +};
> +
>  static const VMStateDescription vmstate_globalstate = {
>      .name = "globalstate",
>      .version_id = 1,
> @@ -136,6 +156,10 @@ static const VMStateDescription vmstate_globalstate = {
>          VMSTATE_BUFFER(runstate, GlobalState),
>          VMSTATE_END_OF_LIST()
>      },
> +    .subsections = (const VMStateDescription*[]) {
> +        &vmstate_vm_was_suspended,
> +        NULL
> +    },
>  };
>  
>  void register_global_state(void)
> diff --git a/migration/options.c b/migration/options.c
> index 8d8ec73ad9..5f9998d3e8 100644
> --- a/migration/options.c
> +++ b/migration/options.c
> @@ -88,6 +88,8 @@
>  Property migration_properties[] = {
>      DEFINE_PROP_BOOL("store-global-state", MigrationState,
>                       store_global_state, true),
> +    DEFINE_PROP_BOOL("store-vm-was-suspended", MigrationState,
> +                     store_vm_was_suspended, true),
>      DEFINE_PROP_BOOL("send-configuration", MigrationState,
>                       send_configuration, true),
>      DEFINE_PROP_BOOL("send-section-footer", MigrationState,
> 

Reply via email to