Peter Xu <pet...@redhat.com> writes:

> On Fri, Dec 01, 2023 at 11:23:33AM -0500, Steven Sistare wrote:
>> >> @@ -109,6 +117,7 @@ static int global_state_post_load(void *opaque, int version_id)
>> >>          return -EINVAL;
>> >>      }
>> >>      s->state = r;
>> >> +    vm_set_suspended(s->vm_was_suspended || r == RUN_STATE_SUSPENDED);
>> >
>> > IIUC current vm_was_suspended (based on my read of your patch) was not the
>> > same as a boolean representing "whether VM is suspended", but only a
>> > temporary field to remember that for a VM stop request.  To be explicit, I
>> > didn't see this flag set in qemu_system_suspend() in your previous patch.
>> >
>> > If so, we can already do:
>> >
>> >   vm_set_suspended(s->vm_was_suspended);
>> >
>> > Irrelevant of RUN_STATE_SUSPENDED?
>>
>> We need both terms of the expression.
>>
>> If the vm *is* suspended (RUN_STATE_SUSPENDED), then vm_was_suspended = false.
>> We call global_state_store prior to vm_stop_force_state, so the incoming
>> side sees s->state = RUN_STATE_SUSPENDED and s->vm_was_suspended = false.
>
> Right.
>
>> However, the runstate is RUN_STATE_INMIGRATE.  When incoming finishes by
>> calling vm_start, we need to restore the suspended state.  Thus in
>> global_state_post_load, we must set vm_was_suspended = true.
>
> With above, shouldn't global_state_get_runstate() (on dest) fetch SUSPENDED
> already?  Then I think it should call vm_start(SUSPENDED) if to start.
>
> Maybe you're talking about the special case where autostart==false?  We
> used to have this (existing process_incoming_migration_bh()):
>
>     if (!global_state_received() ||
>         global_state_get_runstate() == RUN_STATE_RUNNING) {
>         if (autostart) {
>             vm_start();
>         } else {
>             runstate_set(RUN_STATE_PAUSED);
>         }
>     }
>
> If so maybe I get you, because in the "else" path we do seem to lose the
> SUSPENDED state again, but in that case IMHO we should logically set
> vm_was_suspended only when we "lose" it - we didn't lose it during
> migration, but only until we decided to switch to PAUSED (due to
> autostart==false).  IOW, change above to something like:
>
>     state = global_state_get_runstate();
>     if (!global_state_received() || runstate_is_alive(state)) {
>         if (autostart) {
>             vm_start(state);
>         } else {
>             if (runstate_is_suspended(state)) {
>                 /* Remember suspended state before setting system to STOPed */
>                 vm_was_suspended = true;
>             }
>             runstate_set(RUN_STATE_PAUSED);
>         }
>     }
>
> It may or may not have a functional difference even if current patch,
> though.  However maybe clearer to follow vm_was_suspended's strict
> definition.
>
>>
>> If the vm *was* suspended, but is currently stopped (eg RUN_STATE_PAUSED),
>> then vm_was_suspended = true.  Migration from that state sets
>> vm_was_suspended = s->vm_was_suspended = true in global_state_post_load and
>> ends with runstate_set(RUN_STATE_PAUSED).
>>
>> I will add a comment here in the code.
>>
>> >>      return 0;
>> >>  }
>> >> @@ -134,6 +143,7 @@ static const VMStateDescription vmstate_globalstate = {
>> >>      .fields = (VMStateField[]) {
>> >>          VMSTATE_UINT32(size, GlobalState),
>> >>          VMSTATE_BUFFER(runstate, GlobalState),
>> >> +        VMSTATE_BOOL(vm_was_suspended, GlobalState),
>> >>          VMSTATE_END_OF_LIST()
>> >>      },
>> >>  };
>> >
>> > I think this will break migration between old/new, unfortunately.  And
>> > since the global state exist mostly for every VM, all VM setup should be
>> > affected, and over all archs.
>>
>> Thanks, I keep forgetting that my binary tricks are no good here.  However,
>> I have one other trick up my sleeve, which is to store vm_was_running in
>> global_state.runstate[strlen(runstate) + 2].  It is forwards and backwards
>> compatible, since that byte is always 0 in older qemu.  It can be implemented
>> with a few lines of code change confined to global_state.c, versus many lines
>> spread across files to do it the conventional way using a compat property and
>> a subsection.  Sound OK?
>
> Tricky!  But sounds okay to me.  I think you're inventing some of your own
> way of being compatible, not relying on machine type as a benefit.  If go
> this route please document clearly on the layout and also what it looked
> like in old binaries.
>
> I think maybe it'll be good to keep using strings, so in the new binaries
> we allow >1 strings, then we define properly on those strings (index 0:
> runstate, existed since start; index 2: suspended, perhaps using "1"/"0" to
> express, while 0x00 means old binary, etc.).
>
> I hope this trick will need less code than the subsection solution,
> otherwise I'd still consider going with that, which is the "common
> solution".
>
> Let's also see whether Juan/Fabiano/others has any opinions.

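For reference, the trailing-byte idea quoted above could look roughly like the sketch below. This is only an illustration, not code from the patch: the helper names, the 100-byte buffer size, and the raw 1/0 encoding are assumptions made here. It relies on the property stated in the thread, that an older sender leaves every byte after the runstate string as 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed buffer size, for illustration only (not stated in the thread). */
#define RUNSTATE_BUF_LEN 100

/* Distance of the flag byte past strlen(runstate), as proposed in the thread. */
#define SUSPENDED_FLAG_GAP 2

/* Hypothetical helper: fill the buffer on the outgoing side. */
static void runstate_buf_store(uint8_t buf[RUNSTATE_BUF_LEN],
                               const char *runstate, bool was_suspended)
{
    size_t len = strlen(runstate);

    /* The name, its NUL terminator, and the flag byte must all fit. */
    assert(len + SUSPENDED_FLAG_GAP < RUNSTATE_BUF_LEN);

    memset(buf, 0, RUNSTATE_BUF_LEN);   /* zero padding, like an old sender */
    memcpy(buf, runstate, len);
    buf[len + SUSPENDED_FLAG_GAP] = was_suspended ? 1 : 0;
}

/* Hypothetical helper: recover the flag on the incoming side. */
static bool runstate_buf_was_suspended(const uint8_t buf[RUNSTATE_BUF_LEN])
{
    const uint8_t *nul = memchr(buf, 0, RUNSTATE_BUF_LEN);
    size_t len;

    if (!nul) {
        return false;                   /* unterminated name: no flag */
    }
    len = (size_t)(nul - buf);
    if (len + SUSPENDED_FLAG_GAP >= RUNSTATE_BUF_LEN) {
        return false;                   /* no room for a flag byte */
    }
    /* A buffer from an old sender has 0 here, so it reads as "not suspended". */
    return buf[len + SUSPENDED_FLAG_GAP] != 0;
}
```

A receiver that only reads the string up to its NUL terminator never looks at the flag byte, which is what would make the layout readable in both directions.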
Can't we pack the structure and just go ahead and slash 'runstate' in half? That would claim some unused bytes for future backward compatibility issues.
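To make the suggestion concrete, here is a rough sketch of what halving the buffer might look like. The struct names, the 100-byte original size, and the field placement are assumptions for illustration, not taken from the patch. The total size is unchanged, so the bytes on the wire keep the same length, and the freed half becomes a flag plus reserved space.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed original buffer size, for illustration only. */
#define OLD_RUNSTATE_LEN 100
#define NEW_RUNSTATE_LEN (OLD_RUNSTATE_LEN / 2)

/* Hypothetical view of the buffer as an older binary sees it. */
typedef struct {
    uint8_t runstate[OLD_RUNSTATE_LEN];
} OldWireState;

/* Hypothetical view after halving: name, flag, then reserved bytes. */
typedef struct {
    uint8_t runstate[NEW_RUNSTATE_LEN];
    uint8_t vm_was_suspended;
    uint8_t reserved[OLD_RUNSTATE_LEN - NEW_RUNSTATE_LEN - 1];
} NewWireState;

/* All members are uint8_t, so no padding is expected; check it anyway. */
_Static_assert(sizeof(NewWireState) == sizeof(OldWireState),
               "wire size must not change");
_Static_assert(offsetof(NewWireState, vm_was_suspended) == NEW_RUNSTATE_LEN,
               "flag must sit right after the shortened name");

/* Hypothetical helper: fill the new layout on the outgoing side. */
static void wire_state_fill(NewWireState *s, const char *runstate,
                            bool was_suspended)
{
    size_t len = strlen(runstate);

    /* The name and its NUL terminator must fit in the shortened field. */
    assert(len < NEW_RUNSTATE_LEN);

    memset(s, 0, sizeof(*s));
    memcpy(s->runstate, runstate, len);
    s->vm_was_suspended = was_suspended ? 1 : 0;
}
```

This only works if every runstate name, including its terminator, fits in the shortened field, and if older binaries zero-pad the part that becomes the flag and reserved bytes. Both are assumptions that would need checking against the real code.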