On Fri, Jul 03, 2020 at 08:15:44PM +0300, Denis V. Lunev wrote: > On 7/2/20 8:57 PM, Daniel P. Berrangé wrote: > > When QMP was first introduced some 10+ years ago now, the snapshot > > related commands (savevm/loadvm/delvm) were not converted. This was > > primarily because their implementation causes blocking of the thread > > running the monitor commands. This was (and still is) considered > > undesirable behaviour both in HMP and QMP. > > > > In theory someone was supposed to fix this flaw at some point in the > > past 10 years and bring them into the QMP world. Sadly, thus far it > > hasn't happened as people always had more important things to work > > on. Enterprise apps were much more interested in external snapshots > > than internal snapshots as they have many more features. > > > > Meanwhile users still want to use internal snapshots as there is > > a certainly simplicity in having everything self-contained in one > > image, even though it has limitations. Thus the apps that end up > > executing the savevm/loadvm/delvm via the "human-monitor-command" > > QMP command. > > > > > > IOW, the problematic blocking behaviour that was one of the reasons > > for not having savevm/loadvm/delvm in QMP is experienced by applications > > regardless. By not portting the commands to QMP due to one design flaw, > > we've forced apps and users to suffer from other design flaws of HMP ( > > bad error reporting, strong type checking of args, no introspection) for > > an additional 10 years. This feels rather sub-optimal :-( > > > > In practice users don't appear to care strongly about the fact that these > > commands block the VM while they run. I might have seen one bug report > > about it, but it certainly isn't something that comes up as a frequent > > topic except among us QEMU maintainers. Users do care about having > > access to the snapshot feature. > > > > Where I am seeing frequent complaints is wrt the use of OVMF combined > > with snapshots which has some serious pain points. This is getting worse > > as the push to ditch legacy BIOS in favour of UEFI gain momentum both > > across OS vendors and mgmt apps. Solving it requires new parameters to > > the commands, but doing this in HMP is super unappealing. > > > > > > > > After 10 years, I think it is time for us to be a little pragmatic about > > our handling of snapshots commands. My desire is that libvirt should never > > use "human-monitor-command" under any circumstances, because of the > > inherant flaws in HMP as a protocol for machine consumption. If there > > are flaws in QMP commands that's fine. If we fix them in future, we can > > deprecate the current QMP commands and remove them not too long after, > > without being locked in forever. > > > > > > Thus in this series I'm proposing a direct 1-1 mapping of the existing > > HMP commands for savevm/loadvm/delvm into QMP as a first step. This does > > not solve the blocking thread problem, but it does eliminate the error > > reporting, type checking and introspection problems inherant to HMP. > > We're winning on 3 out of the 4 long term problems. > > > > If someone can suggest a easy way to fix the thread blocking problem > > too, I'd be interested to hear it. If it involves a major refactoring > > then I think user are better served by unlocking what look like easy > > wins today. > > > > With a QMP variant, we reasonably deal with the problems related to OVMF: > > > > - The logic to pick which disk to store the vmstate in is not > > satsifactory. > > > > The first block driver state cannot be assumed to be the root disk > > image, it might be OVMF varstore and we don't want to store vmstate > > in there. > > > > - The logic to decide which disks must be snapshotted is hardwired > > to all disks which are writable > > > > Again with OVMF there might be a writable varstore, but this can be > > raw rather than qcow2 format, and thus unable to be snapshotted. > > While users might wish to snapshot their varstore, in some/many/most > > cases it is entirely uneccessary. Users are blocked from snapshotting > > their VM though due to this varstore. > > > > These are solved by adding two parameters to the commands. The first is > > a block device node name that identifies the image to store vmstate in, > > and the second is a list of node names to exclude from snapshots. > > > > In the block code I've only dealt with node names for block devices, as > > IIUC, this is all that libvirt should need in the -blockdev world it now > > lives in. IOW, I've made not attempt to cope with people wanting to use > > these QMP commands in combination with -drive args. > > > > I've done some minimal work in libvirt to start to make use of the new > > commands to validate their functionality, but this isn't finished yet. > > > > My ultimate goal is to make the GNOME Boxes maintainer happy again by > > having internal snapshots work with OVMF: > > > > > > https://gitlab.gnome.org/GNOME/gnome-boxes/-/commit/c486da262f6566326fbcb5e= > > f45c5f64048f16a6e > > > > Daniel P. Berrang=C3=A9 (6): > > migration: improve error reporting of block driver state name > > migration: introduce savevm, loadvm, delvm QMP commands > > block: add ability to filter out blockdevs during snapshot > > block: allow specifying name of block device for vmstate storage > > migration: support excluding block devs in QMP snapshot commands > > migration: support picking vmstate disk in QMP snapshot commands > > > > block/monitor/block-hmp-cmds.c | 4 +- > > block/snapshot.c | 68 +++++++++++++++++++------ > > include/block/snapshot.h | 21 +++++--- > > include/migration/snapshot.h | 10 +++- > > migration/savevm.c | 71 +++++++++++++++++++------- > > monitor/hmp-cmds.c | 20 ++------ > > qapi/migration.json | 91 ++++++++++++++++++++++++++++++++++ > > replay/replay-snapshot.c | 4 +- > > softmmu/vl.c | 2 +- > > 9 files changed, 228 insertions(+), 63 deletions(-) > > I have tried to work in this interface in 2016. That time > we have struggled with the idea that this QMP interface should > be ready to work asynchronously. > > Write-protect userfaultfd was merged into vanilla Linux > thus it is time to async savevm interface, which will also > bring async loadvm and some rework for state storing. > > Thus I think that with the introduction of the QMP interface > we should at least run save VM not from the main > thread but from the background with the event at the end.
spawning a thread in which to invoke save_snapshot() and load_snapshot() is easy enough. I'm not at all clear on what we need in the way of mutex locking though, to make those methods safe to run in a thread that isn't the main event loop. Even with savevm/loadvm being blocking, we could introduce a QMP event straight away, and document that users shouldn't assume the operation is complete until they see the event. That would let us make the commands non-blocking later with same documented semantics. Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|