On 7/30/2020 5:39 PM, Paolo Bonzini wrote:
> On 30/07/20 21:09, Steven Sistare wrote:
>>> please spell it out.  Also, how does the functionality compare to
>>> xen-save-devices-state and xen-load-devices-state?
>>
>> qmp_xen_save_devices_state serializes device state to a file which is loaded 
>> on the target for a live migration.  It performs some of the same actions
>> as cprsave/cprload but does not support live update-in-place.
> 
> So it is a subset, can code be reused across both?  

They use common subroutines, but their bodies check different conditions, so I
don't think merging would be an improvement.  We do provide a new helper,
qf_file_open(), which could replace a handful of lines in both
qmp_xen_save_devices_state and qmp_xen_load_devices_state.

> Also, live migration
> across versions is supported, so can you describe the special
> update-in-place support more precisely?  I am confused about the use
> cases, which require (or try) to keep file descriptors across re-exec,
> which are for kexec, and so on.

Sure.  The first use case allows you to kexec reboot the host and update host
software and/or qemu.  It does not preserve descriptors, and guest ram must be
backed by persistent shared memory.  Guest pause time depends on host reboot
time, which can range from seconds to tens of seconds.

The second case allows you to update qemu in place, but not update the host.
Guest ram can be in shared or anonymous memory.  We call madvise(MADV_DOEXEC)
to tell the kernel to preserve anon memory across the exec.  Open descriptors
are preserved.  The addresses and lengths of the preserved memory segments,
and the values of the descriptors, are saved in the environment.  When new
qemu restarts, it finds those values in the environment and uses them when the
various objects are created.  Memory is not reallocated; it is already
present, and its addresses and lengths are recorded in the ram objects.  Guest
pause time is in the 100 to 200 msec range.  This is less resource intensive
than live migration, and is appropriate if your only goal is to update qemu,
as opposed to evacuating a host.
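
To make the descriptor handling concrete, here is a minimal sketch in C.  It
is not the code from the series: preserve_fd(), find_preserved_fd() and the
variable name CPR_DEMO_FD used with them are hypothetical names made up for
illustration.  It only shows the general pattern of clearing FD_CLOEXEC so a
descriptor survives exec and recording its value in the environment for the
new process to look up.  It does not cover the memory side (MADV_DOEXEC).

```c
#include <fcntl.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the patch series): make fd survive exec
 * by clearing FD_CLOEXEC, then record its number in the environment
 * under the given name.  Returns 0 on success, -1 on error.
 */
static int preserve_fd(const char *name, int fd)
{
    char val[16];
    int flags = fcntl(fd, F_GETFD);

    if (flags < 0 || fcntl(fd, F_SETFD, flags & ~FD_CLOEXEC) < 0) {
        return -1;
    }
    snprintf(val, sizeof(val), "%d", fd);
    return setenv(name, val, 1);
}

/*
 * Hypothetical helper: return the descriptor number saved under name,
 * or -1 if the variable is absent or is not a valid non-negative integer.
 */
static int find_preserved_fd(const char *name)
{
    const char *val = getenv(name);
    char *end;
    long fd;

    if (!val) {
        return -1;
    }
    fd = strtol(val, &end, 10);
    if (end == val || *end != '\0' || fd < 0 || fd > INT_MAX) {
        return -1;
    }
    return (int)fd;
}
```

In this sketch the old process would call preserve_fd("CPR_DEMO_FD", fd) just
before exec, and the new process would call find_preserved_fd("CPR_DEMO_FD")
when it creates the corresponding object, opening the file itself only if the
lookup returns -1.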

>>>> cprsave and cprload support guests with vfio devices if the caller first
>>>> suspends the guest by issuing guest-suspend-ram to the qemu guest agent.
>>>> The guest drivers' suspend methods flush outstanding requests and
>>>> re-initialize the devices, and thus there is no device state to save and
>>>> restore.
>>> This probably should be allowed even for regular migration.  Can you
>>> generalize the code as a separate series?
>>
>> Maybe.  I think that would be a distinct patch that ignores the vfio
>> migration blocker if the state is suspended.  Plus a qemu agent call to do
>> the suspend.  Needs more thought.
> 
> The agent already supports suspend, so that should be relatively easy.
> Only the code to add/remove the VFIO migration blocker from a VM state
> change notifier, or something like that, would be needed.

Yes, I have experimented with the guest's suspend method.

- Steve
