On 11/30/2023 6:06 PM, Peter Xu wrote: > On Thu, Nov 30, 2023 at 01:37:18PM -0800, Steve Sistare wrote: >> If the outgoing machine was previously suspended, propagate that to the >> incoming side via global_state, so a subsequent vm_start restores the >> suspended state. To maintain backward and forward compatibility, define >> the new field in a zero'd hole in the GlobalState struct. >> >> Signed-off-by: Steve Sistare <steven.sist...@oracle.com> >> --- >> migration/global_state.c | 10 ++++++++++ >> 1 file changed, 10 insertions(+) >> >> diff --git a/migration/global_state.c b/migration/global_state.c >> index 4e2a9d8..de2532c 100644 >> --- a/migration/global_state.c >> +++ b/migration/global_state.c >> @@ -25,6 +25,7 @@ typedef struct { >> uint8_t runstate[100]; >> RunState state; >> bool received; >> + bool vm_was_suspended; >> } GlobalState; >> >> static GlobalState global_state; >> @@ -35,6 +36,7 @@ static void global_state_do_store(RunState state) >> assert(strlen(state_str) < sizeof(global_state.runstate)); >> strpadcpy((char *)global_state.runstate, sizeof(global_state.runstate), >> state_str, '\0'); >> + global_state.vm_was_suspended = vm_get_suspended(); >> } >> >> void global_state_store(void) >> @@ -68,6 +70,12 @@ static bool global_state_needed(void *opaque) >> return true; >> } >> >> + /* If the suspended state must be remembered, it is needed */ >> + >> + if (vm_get_suspended()) { >> + return true; >> + } >> + >> /* If state is running or paused, it is not needed */ >> >> if (strcmp(runstate, "running") == 0 || >> @@ -109,6 +117,7 @@ static int global_state_post_load(void *opaque, int >> version_id) >> return -EINVAL; >> } >> s->state = r; >> + vm_set_suspended(s->vm_was_suspended || r == RUN_STATE_SUSPENDED); > > IIUC current vm_was_suspended (based on my read of your patch) was not the > same as a boolean representing "whether VM is suspended", but only a > temporary field to remember that for a VM stop request. To be explicit, I > didn't see this flag set in qemu_system_suspend() in your previous patch. > > If so, we can already do: > > vm_set_suspended(s->vm_was_suspended); > > Irrelevant of RUN_STATE_SUSPENDED?
We need both terms of the expression. If the vm *is* suspended (RUN_STATE_SUSPENDED), then vm_was_suspended = false. We call global_state_store prior to vm_stop_force_state, so the incoming side sees s->state = RUN_STATE_SUSPENDED and s->vm_was_suspended = false. However, the runstate is RUN_STATE_INMIGRATE. When incoming finishes by calling vm_start, we need to restore the suspended state. Thus in global_state_post_load, we must set vm_was_suspended = true. If the vm *was* suspended, but is currently stopped (eg RUN_STATE_PAUSED), then vm_was_suspended = true. Migration from that state sets vm_was_suspended = s->vm_was_suspended = true in global_state_post_load and ends with runstate_set(RUN_STATE_PAUSED). I will add a comment here in the code. >> return 0; >> } >> @@ -134,6 +143,7 @@ static const VMStateDescription vmstate_globalstate = { >> .fields = (VMStateField[]) { >> VMSTATE_UINT32(size, GlobalState), >> VMSTATE_BUFFER(runstate, GlobalState), >> + VMSTATE_BOOL(vm_was_suspended, GlobalState), >> VMSTATE_END_OF_LIST() >> }, >> }; > > I think this will break migration between old/new, unfortunately. And > since the global state exist mostly for every VM, all VM setup should be > affected, and over all archs. Thanks, I keep forgetting that my binary tricks are no good here. However, I have one other trick up my sleeve, which is to store vm_was_running in global_state.runstate[strlen(runstate) + 2]. It is forwards and backwards compatible, since that byte is always 0 in older qemu. It can be implemented with a few lines of code change confined to global_state.c, versus many lines spread across files to do it the conventional way using a compat property and a subsection. Sound OK? > We used to have the version_id field right above for adding fields, but I > _think_ that will still break backward migration fron new->old binary, so > not wanted. Juan can keep me honest. > > The best thing is still machine compat properties, afaict, to fix. It's > slightly involved, but let me attach a sample diff for you (at the end, > possibly working with your current patch kind-of squashed, but not ever > tested), hopefully make it slightly easier. > > I'm wondering how bad it is to just ignore it, it's not as bad as if we > don't fix stop-during-suspend, in this case the worst case of forgetting > this field over migration is: if VM stopped (and used to be suspended) then > after migration it'll keep being stopped, however after "cont" it'll forget > the suspended state. Cont changes state to running, but the VM is broken because wake was never called. > Not that bad! IIUC SPR should always migrate with > suspended (rather than any fully stopped state), right? Then shouldn't be > affected. If risk is low, maybe we can leave this one for later? > > Thanks, Thanks for the patch. I am going to save this as a template for adding compatible subsections in future work. - Steve > ===8<=== > > diff --git a/migration/migration.h b/migration/migration.h > index cf2c9c88e0..c3fd1f8347 100644 > --- a/migration/migration.h > +++ b/migration/migration.h > @@ -470,6 +470,8 @@ struct MigrationState { > bool switchover_acked; > /* Is this a rdma migration */ > bool rdma_migration; > + /* Whether remember global vm_was_suspended field? */ > + bool store_vm_was_suspended; > }; > > void migrate_set_state(int *state, int old_state, int new_state); > diff --git a/hw/core/machine.c b/hw/core/machine.c > index 0c17398141..365e01c1c9 100644 > --- a/hw/core/machine.c > +++ b/hw/core/machine.c > @@ -37,6 +37,7 @@ GlobalProperty hw_compat_8_1[] = { > { "ramfb", "x-migrate", "off" }, > { "vfio-pci-nohotplug", "x-ramfb-migrate", "off" }, > { "igb", "x-pcie-flr-init", "off" }, > + { "migration", "store-vm-was-suspended", false }, > }; > const size_t hw_compat_8_1_len = G_N_ELEMENTS(hw_compat_8_1); > > diff --git a/migration/global_state.c b/migration/global_state.c > index 4e2a9d8ec0..ffa7bf82ca 100644 > --- a/migration/global_state.c > +++ b/migration/global_state.c > @@ -25,6 +25,7 @@ typedef struct { > uint8_t runstate[100]; > RunState state; > bool received; > + bool vm_was_suspended; > } GlobalState; > > static GlobalState global_state; > @@ -124,6 +125,25 @@ static int global_state_pre_save(void *opaque) > return 0; > } > > +static bool global_state_has_vm_was_suspended(void *opaque) > +{ > + return migrate_get_current()->store_vm_was_suspended; > +} > + > +static const VMStateDescription vmstate_vm_was_suspended = { > + .name = "globalstate/vm_was_suspended", > + .version_id = 1, > + .minimum_version_id = 1, > + /* TODO: Fill these in */ > + .pre_save = NULL, > + .post_load = NULL, > + .needed = global_state_has_vm_was_suspended, > + .fields = (VMStateField[]) { > + VMSTATE_BOOL(vm_was_suspended, GlobalState), > + VMSTATE_END_OF_LIST() > + }, > +}; > + > static const VMStateDescription vmstate_globalstate = { > .name = "globalstate", > .version_id = 1, > @@ -136,6 +156,10 @@ static const VMStateDescription vmstate_globalstate = { > VMSTATE_BUFFER(runstate, GlobalState), > VMSTATE_END_OF_LIST() > }, > + .subsections = (const VMStateDescription*[]) { > + &vmstate_vm_was_suspended, > + NULL > + }, > }; > > void register_global_state(void) > diff --git a/migration/options.c b/migration/options.c > index 8d8ec73ad9..5f9998d3e8 100644 > --- a/migration/options.c > +++ b/migration/options.c > @@ -88,6 +88,8 @@ > Property migration_properties[] = { > DEFINE_PROP_BOOL("store-global-state", MigrationState, > store_global_state, true), > + DEFINE_PROP_BOOL("store-vm-was-suspended", MigrationState, > + store_vm_was_suspended, true), > DEFINE_PROP_BOOL("send-configuration", MigrationState, > send_configuration, true), > DEFINE_PROP_BOOL("send-section-footer", MigrationState, >