Re: [PATCH v2 (BROKEN) 0/6] migration: bring improved savevm/loadvm/delvm to QMP

2020-09-01 Thread Kevin Wolf
On 01.09.2020 at 15:22, Markus Armbruster wrote:
> Daniel P. Berrangé  writes:
> 
> > On Thu, Aug 27, 2020 at 01:04:43PM +0200, Markus Armbruster wrote:
> >> Daniel P. Berrangé  writes:
> >> 
> >> > On Wed, Aug 26, 2020 at 05:52:06PM +0200, Markus Armbruster wrote:
> >> > From the POV of practicality, making a design that unifies internal
> >> > and external snapshots is something I'm considering out of scope.
> >> > It increases the design time burden, as well as implementation burden.
> >> > On my side, improving internal snapshots is a "spare time" project,
> >> > not something I can justify spending weeks or months on.
> >> 
> >> I'm not demanding a solution that unifies internal and external
> >> snapshots.  I'm asking for a bit of serious thought on an interface that
> >> could compatibly evolve there.  Hours, not weeks or months.
> >> 
> >> > My goal is to implement something that is achievable in a short
> >> > amount of time that gets us out of the hole we've been in for 10
> >> > years. Minimal refactoring of the internal snapshot code aside
> >> > from fixing the critical limitations we have today around choice
> >> > of disks to snapshot.
> >> >
> >> > If someone later wants to come up with a grand unified design
> >> > for everything that's fine, we can deprecate the new QMP commands
> >> > I'm proposing now.
> >> 
> >> Failing at coming up with an interface that has a reasonable chance to
> >> be future-proof is okay.
> >> 
> >> Not even trying is not okay.
> >
> > This was raised in my v1 posting:
> >
> >   https://lists.gnu.org/archive/html/qemu-devel/2020-07/msg01346.html
> >
> > but the conclusion was that it was a non-trivial amount of extra
> > implementation work
> 
> Thanks for the pointer.  I've now read that review thread.
> 
> >> Specifically, I'd like you to think about monolithic snapshot command
> >> (internal snapshots only by design) vs. transaction of individual
> >> snapshot commands (design is not restricted to internal snapshots, but
> >> we may want to accept implementation restrictions).
> >> 
> >> We already have transactionable individual storage snapshot commands.
> >> What's missing is a transactionable machine state snapshot command.
> >
> > At a high level I consider what I've proposed as being higher level
> > syntax sugar vs a more generic future impl based on multiple commands
> > for snapshotting disk & state. I don't think I'd claim that it will
> > evolve to become the design you're suggesting here, as they are designs
> > from different levels in the stack. IOW, I think they would ultimately
> > just exist in parallel. I don't think that's a real problem from a
> > maint POV, as the large burden from the monolithic snapshot code is
> > not the HMP/QMP interface, but rather the guts of the internal
> > impl in the migration/savevm.c and block/snapshot.c files. That code
> > will exist for as long as the HMP commands exist, and adding a QMP
> > interface doesn't make that situation worse unless we were intending
> > to drop the existing HMP commands. If we did change our minds though,
> > we can just deprecate the QMP command at any time we like.
> >
> >
> >> One restriction I'd readily accept at this time is "the machine state
> >> snapshot must write to a QCOW2 that is also internally snapshot in the
> >> same transaction".
> >> 
> >> Now explain to me why this is impractical.
> >
> > The issues were described by Kevin here:
> >
> >   https://lists.gnu.org/archive/html/qemu-devel/2020-07/msg02057.html
> >
> > Assuming the migration impl is actually possible to solve, there is
> > still the question of actually writing it. That's a non-trivial
> > amount of work someone has to find time for.
> 
> Kevin explained how the transactionable machine state snapshot command
> should be made non-blocking: post-copy.
> 
> I don't dispute that creating such a post-copy snapshot is a non-trivial
> task.  It is out of reach for you and me.  I didn't actually ask for it,
> though.
> 
> You argue that providing a blocking snapshot in QMP is better than
> nothing, and good enough for quite a few applications.  I agree!  I
> blocked prior attempts at porting HMP's savevm/loadvm to QMP not because
> they were blocking, but because they stuck to the HMP interface, and the
> HMP interface is bonkers.  I would accept the restriction "snapshotting
> machine state is blocking, i.e. it stops the machine."  As I wrote in
> 2016, "Limitations: No live internal machine snapshot, yet."
> 
> Aside: unless I'm mistaken, taking an internal block device snapshot is
> also blocking, but unlike taking a machine state snapshot, it's fast
> enough for the blocking not to matter.  That's the "sync" in
> blockdev-snapshot-internal-sync.
> 
> I asked you to consider the interface design I proposed back in 2016.
> You point out above that your interface is more high-level, and could be
> turned into sugar for a low level interface.
> 
> If true, this means your proposal doesn't box us 

Re: [PATCH v2 (BROKEN) 0/6] migration: bring improved savevm/loadvm/delvm to QMP

2020-09-01 Thread Daniel P . Berrangé
On Tue, Sep 01, 2020 at 03:22:24PM +0200, Markus Armbruster wrote:
> Daniel P. Berrangé  writes:
> 
> > On Thu, Aug 27, 2020 at 01:04:43PM +0200, Markus Armbruster wrote:
> >> Daniel P. Berrangé  writes:
> >> 
> >> > On Wed, Aug 26, 2020 at 05:52:06PM +0200, Markus Armbruster wrote:
> >> > From the POV of practicality, making a design that unifies internal
> >> > and external snapshots is something I'm considering out of scope.
> >> > It increases the design time burden, as well as implementation burden.
> >> > On my side, improving internal snapshots is a "spare time" project,
> >> > not something I can justify spending weeks or months on.
> >> 
> >> I'm not demanding a solution that unifies internal and external
> >> snapshots.  I'm asking for a bit of serious thought on an interface that
> >> could compatibly evolve there.  Hours, not weeks or months.
> >> 
> >> > My goal is to implement something that is achievable in a short
> >> > amount of time that gets us out of the hole we've been in for 10
> >> > years. Minimal refactoring of the internal snapshot code aside
> >> > from fixing the critical limitations we have today around choice
> >> > of disks to snapshot.
> >> >
> >> > If someone later wants to come up with a grand unified design
> >> > for everything that's fine, we can deprecate the new QMP commands
> >> > I'm proposing now.
> >> 
> >> Failing at coming up with an interface that has a reasonable chance to
> >> be future-proof is okay.
> >> 
> >> Not even trying is not okay.
> >
> > This was raised in my v1 posting:
> >
> >   https://lists.gnu.org/archive/html/qemu-devel/2020-07/msg01346.html
> >
> > but the conclusion was that it was a non-trivial amount of extra
> > implementation work
> 
> Thanks for the pointer.  I've now read that review thread.
> 
> >> Specifically, I'd like you to think about monolithic snapshot command
> >> (internal snapshots only by design) vs. transaction of individual
> >> snapshot commands (design is not restricted to internal snapshots, but
> >> we may want to accept implementation restrictions).
> >> 
> >> We already have transactionable individual storage snapshot commands.
> >> What's missing is a transactionable machine state snapshot command.
> >
> > At a high level I consider what I've proposed as being higher level
> > syntax sugar vs a more generic future impl based on multiple commands
> > for snapshotting disk & state. I don't think I'd claim that it will
> > evolve to become the design you're suggesting here, as they are designs
> > from different levels in the stack. IOW, I think they would ultimately
> > just exist in parallel. I don't think that's a real problem from a
> > maint POV, as the large burden from the monolithic snapshot code is
> > not the HMP/QMP interface, but rather the guts of the internal
> > impl in the migration/savevm.c and block/snapshot.c files. That code
> > will exist for as long as the HMP commands exist, and adding a QMP
> > interface doesn't make that situation worse unless we were intending
> > to drop the existing HMP commands. If we did change our minds though,
> > we can just deprecate the QMP command at any time we like.
> >
> >
> >> One restriction I'd readily accept at this time is "the machine state
> >> snapshot must write to a QCOW2 that is also internally snapshot in the
> >> same transaction".
> >> 
> >> Now explain to me why this is impractical.
> >
> > The issues were described by Kevin here:
> >
> >   https://lists.gnu.org/archive/html/qemu-devel/2020-07/msg02057.html
> >
> > Assuming the migration impl is actually possible to solve, there is
> > still the question of actually writing it. That's a non-trivial
> > amount of work someone has to find time for.
> 
> Kevin explained how the transactionable machine state snapshot command
> should be made non-blocking: post-copy.
> 
> I don't dispute that creating such a post-copy snapshot is a non-trivial
> task.  It is out of reach for you and me.  I didn't actually ask for it,
> though.
> 
> You argue that providing a blocking snapshot in QMP is better than
> nothing, and good enough for quite a few applications.  I agree!  I
> blocked prior attempts at porting HMP's savevm/loadvm to QMP not because
> they were blocking, but because they stuck to the HMP interface, and the
> HMP interface is bonkers.  I would accept the restriction "snapshotting
> machine state is blocking, i.e. it stops the machine."  As I wrote in
> 2016, "Limitations: No live internal machine snapshot, yet."

FYI, when I documented the new QAPI commands I implemented, I chose to
*not* say that the snapshot is blocking. Instead I said:

  # Applications should not assume that the snapshot load is complete
  # when this command returns. Completion is indicated by the job
  # status. Clients can wait for the JOB_STATUS_CHANGE event. If the
  # job aborts, errors can be obtained via the 'query-jobs' command,
  # though.
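
A client-side reading of the doc comment above could be sketched as
follows. This is only an illustration, assuming the client already has
the monitor output as one JSON message per line. JOB_STATUS_CHANGE and
query-jobs are the existing QMP event and command; the helper names and
the job ID "snapload0" are made up for the example, and treating a pass
through the "aborting" status as failure is an assumption about how the
job reports errors.

```python
import json


def wait_for_job(lines, job_id):
    """Scan QMP messages until the given job concludes.

    Returns "ok" if the job concluded without passing through the
    "aborting" status, "failed" if it did, and "unknown" if the input
    ended before the job concluded.
    """
    aborted = False
    for line in lines:
        msg = json.loads(line)
        if msg.get("event") != "JOB_STATUS_CHANGE":
            continue  # command replies and unrelated events
        data = msg["data"]
        if data["id"] != job_id:
            continue  # status change of some other job
        if data["status"] == "aborting":
            aborted = True
        elif data["status"] == "concluded":
            return "failed" if aborted else "ok"
    return "unknown"


def job_error(query_jobs_reply, job_id):
    """Pick the error string for job_id out of a query-jobs reply."""
    for job in query_jobs_reply["return"]:
        if job["id"] == job_id:
            return job.get("error")  # absent when the job did not fail
    return None
```

The point of waiting on the event rather than on the command reply is
that the same client code keeps working whether or not the command
itself blocks.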

The idea was that if we fix these QAPI commands to not block in future,
we 

Re: [PATCH v2 (BROKEN) 0/6] migration: bring improved savevm/loadvm/delvm to QMP

2020-09-01 Thread Markus Armbruster
Daniel P. Berrangé  writes:

> On Thu, Aug 27, 2020 at 01:04:43PM +0200, Markus Armbruster wrote:
>> Daniel P. Berrangé  writes:
>> 
>> > On Wed, Aug 26, 2020 at 05:52:06PM +0200, Markus Armbruster wrote:
>> > From the POV of practicality, making a design that unifies internal
>> > and external snapshots is something I'm considering out of scope.
>> > It increases the design time burden, as well as implementation burden.
>> > On my side, improving internal snapshots is a "spare time" project,
>> > not something I can justify spending weeks or months on.
>> 
>> I'm not demanding a solution that unifies internal and external
>> snapshots.  I'm asking for a bit of serious thought on an interface that
>> could compatibly evolve there.  Hours, not weeks or months.
>> 
>> > My goal is to implement something that is achievable in a short
>> > amount of time that gets us out of the hole we've been in for 10
>> > years. Minimal refactoring of the internal snapshot code aside
>> > from fixing the critical limitations we have today around choice
>> > of disks to snapshot.
>> >
>> > If someone later wants to come up with a grand unified design
>> > for everything that's fine, we can deprecate the new QMP commands
>> > I'm proposing now.
>> 
>> Failing at coming up with an interface that has a reasonable chance to
>> be future-proof is okay.
>> 
>> Not even trying is not okay.
>
> This was raised in my v1 posting:
>
>   https://lists.gnu.org/archive/html/qemu-devel/2020-07/msg01346.html
>
> but the conclusion was that it was a non-trivial amount of extra
> implementation work

Thanks for the pointer.  I've now read that review thread.

>> Specifically, I'd like you to think about monolithic snapshot command
>> (internal snapshots only by design) vs. transaction of individual
>> snapshot commands (design is not restricted to internal snapshots, but
>> we may want to accept implementation restrictions).
>> 
>> We already have transactionable individual storage snapshot commands.
>> What's missing is a transactionable machine state snapshot command.
>
> At a high level I consider what I've proposed as being higher level
> syntax sugar vs a more generic future impl based on multiple commands
> for snapshotting disk & state. I don't think I'd claim that it will
> evolve to become the design you're suggesting here, as they are designs
> from different levels in the stack. IOW, I think they would ultimately
> just exist in parallel. I don't think that's a real problem from a
> maint POV, as the large burden from the monolithic snapshot code is
> not the HMP/QMP interface, but rather the guts of the internal
> impl in the migration/savevm.c and block/snapshot.c files. That code
> will exist for as long as the HMP commands exist, and adding a QMP
> interface doesn't make that situation worse unless we were intending
> to drop the existing HMP commands. If we did change our minds though,
> we can just deprecate the QMP command at any time we like.
>
>
>> One restriction I'd readily accept at this time is "the machine state
>> snapshot must write to a QCOW2 that is also internally snapshot in the
>> same transaction".
>> 
>> Now explain to me why this is impractical.
>
> The issues were described by Kevin here:
>
>   https://lists.gnu.org/archive/html/qemu-devel/2020-07/msg02057.html
>
> Assuming the migration impl is actually possible to solve, there is
> still the question of actually writing it. That's a non-trivial
> amount of work someone has to find time for.

Kevin explained how the transactionable machine state snapshot command
should be made non-blocking: post-copy.

I don't dispute that creating such a post-copy snapshot is a non-trivial
task.  It is out of reach for you and me.  I didn't actually ask for it,
though.

You argue that providing a blocking snapshot in QMP is better than
nothing, and good enough for quite a few applications.  I agree!  I
blocked prior attempts at porting HMP's savevm/loadvm to QMP not because
they were blocking, but because they stuck to the HMP interface, and the
HMP interface is bonkers.  I would accept the restriction "snapshotting
machine state is blocking, i.e. it stops the machine."  As I wrote in
2016, "Limitations: No live internal machine snapshot, yet."

Aside: unless I'm mistaken, taking an internal block device snapshot is
also blocking, but unlike taking a machine state snapshot, it's fast
enough for the blocking not to matter.  That's the "sync" in
blockdev-snapshot-internal-sync.
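
For reference, the "transaction of individual snapshot commands" side
of this discussion already works for the storage part today. A minimal
sketch of building such a request, assuming hypothetical device names
"disk0" and "disk1" and a hypothetical snapshot name "snap1"; the
helper name is made up as well. There is no transaction action for
machine state, which is exactly the missing piece discussed above.

```python
import json


def internal_snapshot_transaction(devices, name):
    """Build a QMP 'transaction' taking one internal snapshot per device."""
    actions = [
        {
            "type": "blockdev-snapshot-internal-sync",
            "data": {"device": device, "name": name},
        }
        for device in devices
    ]
    return {"execute": "transaction", "arguments": {"actions": actions}}


if __name__ == "__main__":
    # Device and snapshot names are placeholders for illustration only.
    request = internal_snapshot_transaction(["disk0", "disk1"], "snap1")
    print(json.dumps(request, indent=2))
```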

I asked you to consider the interface design I proposed back in 2016.
You point out above that your interface is more high-level, and could be
turned into sugar for a low level interface.

If true, this means your proposal doesn't box us into a corner, which is
good.

Let me elaborate a bit on the desugaring, just to make sure we're on the
same page.  Please correct me where I'm talking nonsense.

snapshot-save creates a job that snapshots a set of block devices and the
machine state.  The snapshots 

Re: [PATCH v2 (BROKEN) 0/6] migration: bring improved savevm/loadvm/delvm to QMP

2020-08-31 Thread Markus Armbruster
Kevin Wolf  writes:

> On 28.08.2020 at 08:20, Markus Armbruster wrote:
>> Kevin Wolf  writes:
>> 
>> > On 27.08.2020 at 13:06, Markus Armbruster wrote:
>> >> Daniel P. Berrangé  writes:
>> >> 
>> >> > On Wed, Aug 26, 2020 at 07:28:24PM +0100, Daniel P. Berrangé wrote:
>> >> >> On Wed, Aug 26, 2020 at 05:52:06PM +0200, Markus Armbruster wrote:
>> >> >> > Open questions:
>> >> >> > 
>> >> >> > * Do we want the QMP command to delete existing snapshots with
>> >> >> >   conflicting tag / ID, like HMP savevm does?  Or do we want it to fail
>> >> >> >   the transaction?
>> >> >> 
>> >> >> The intent is for the QMP commands to operate exclusively on
>> >> >> 'tags', and never consider "ID".
>> >> >
>> >> > I forgot that even HMP ignores "ID" now and works exclusively in terms
>> >> > of tags since:
>> >> >
>> >> >
>> >> >   commit 6ca080453ea403959ccde661030ca16264acc181
>> >> >   Author: Daniel Henrique Barboza 
>> >> >   Date:   Wed Nov 7 11:09:58 2018 -0200
>> >> >
>> >> >       block/snapshot.c: eliminate use of ID input in snapshot operations
>> >> 
>> >> Almost a year after I sent the memo I quoted.  It's an incompatible
>> >> change, but nobody complained, and I'm glad we got this issue out of the
>> >> way.
>> >
>> > FWIW, I would have ignored any complaint about incompatible changes in
>> > HMP. It's not supposed to be a stable API, but UI.
>> 
>> The iffy part is actually the loss of ability to access snapshots that
>> lack a name.  Complaints about that would have been valid, I think.
>> Fortunately, there have been none.
>
> 'loadvm ""' should do the trick for these.

As long as you have at most one.

> The same way as you have to
> use with 'savevm' to create them in non-prehistoric versions of QEMU.
> We stopped creating snapshots with empty names by default in 0.14, so
> they are probably not very relevant any more. (Versioned machine types
> go back "only" to 1.0, so good luck loading a snapshot from an older
> version. And I wouldn't bet money either on a 1.0 snapshot still working
> with that machine type...)

No argument.




Re: [PATCH v2 (BROKEN) 0/6] migration: bring improved savevm/loadvm/delvm to QMP

2020-08-28 Thread Kevin Wolf
On 28.08.2020 at 08:20, Markus Armbruster wrote:
> Kevin Wolf  writes:
> 
> > On 27.08.2020 at 13:06, Markus Armbruster wrote:
> >> Daniel P. Berrangé  writes:
> >> 
> >> > On Wed, Aug 26, 2020 at 07:28:24PM +0100, Daniel P. Berrangé wrote:
> >> >> On Wed, Aug 26, 2020 at 05:52:06PM +0200, Markus Armbruster wrote:
> >> >> > Open questions:
> >> >> > 
> >> >> > * Do we want the QMP command to delete existing snapshots with
> >> >> >   conflicting tag / ID, like HMP savevm does?  Or do we want it to fail
> >> >> >   the transaction?
> >> >> 
> >> >> The intent is for the QMP commands to operate exclusively on
> >> >> 'tags', and never consider "ID".
> >> >
> >> > I forgot that even HMP ignores "ID" now and works exclusively in terms
> >> > of tags since:
> >> >
> >> >
> >> >   commit 6ca080453ea403959ccde661030ca16264acc181
> >> >   Author: Daniel Henrique Barboza 
> >> >   Date:   Wed Nov 7 11:09:58 2018 -0200
> >> >
> >> >       block/snapshot.c: eliminate use of ID input in snapshot operations
> >> 
> >> Almost a year after I sent the memo I quoted.  It's an incompatible
> >> change, but nobody complained, and I'm glad we got this issue out of the
> >> way.
> >
> > FWIW, I would have ignored any complaint about incompatible changes in
> > HMP. It's not supposed to be a stable API, but UI.
> 
> The iffy part is actually the loss of ability to access snapshots that
> lack a name.  Complaints about that would have been valid, I think.
> Fortunately, there have been none.

'loadvm ""' should do the trick for these. The same way as you have to
use with 'savevm' to create them in non-prehistoric versions of QEMU.
We stopped creating snapshots with empty names by default in 0.14, so
they are probably not very relevant any more. (Versioned machine types
go back "only" to 1.0, so good luck loading a snapshot from an older
version. And I wouldn't bet money either on a 1.0 snapshot still working
with that machine type...)

Kevin




Re: [PATCH v2 (BROKEN) 0/6] migration: bring improved savevm/loadvm/delvm to QMP

2020-08-28 Thread Markus Armbruster
Kevin Wolf  writes:

> On 27.08.2020 at 13:06, Markus Armbruster wrote:
>> Daniel P. Berrangé  writes:
>> 
>> > On Wed, Aug 26, 2020 at 07:28:24PM +0100, Daniel P. Berrangé wrote:
>> >> On Wed, Aug 26, 2020 at 05:52:06PM +0200, Markus Armbruster wrote:
>> >> > Open questions:
>> >> > 
>> >> > * Do we want the QMP command to delete existing snapshots with
>> >> >   conflicting tag / ID, like HMP savevm does?  Or do we want it to fail
>> >> >   the transaction?
>> >> 
>> >> The intent is for the QMP commands to operate exclusively on
>> >> 'tags', and never consider "ID".
>> >
>> > I forgot that even HMP ignores "ID" now and works exclusively in terms
>> > of tags since:
>> >
>> >
>> >   commit 6ca080453ea403959ccde661030ca16264acc181
>> >   Author: Daniel Henrique Barboza 
>> >   Date:   Wed Nov 7 11:09:58 2018 -0200
>> >
>> >       block/snapshot.c: eliminate use of ID input in snapshot operations
>> 
>> Almost a year after I sent the memo I quoted.  It's an incompatible
>> change, but nobody complained, and I'm glad we got this issue out of the
>> way.
>
> FWIW, I would have ignored any complaint about incompatible changes in
> HMP. It's not supposed to be a stable API, but UI.

The iffy part is actually the loss of ability to access snapshots that
lack a name.  Complaints about that would have been valid, I think.
Fortunately, there have been none.




Re: [PATCH v2 (BROKEN) 0/6] migration: bring improved savevm/loadvm/delvm to QMP

2020-08-27 Thread Kevin Wolf
On 27.08.2020 at 13:06, Markus Armbruster wrote:
> Daniel P. Berrangé  writes:
> 
> > On Wed, Aug 26, 2020 at 07:28:24PM +0100, Daniel P. Berrangé wrote:
> >> On Wed, Aug 26, 2020 at 05:52:06PM +0200, Markus Armbruster wrote:
> >> > Open questions:
> >> > 
> >> > * Do we want the QMP command to delete existing snapshots with
> >> >   conflicting tag / ID, like HMP savevm does?  Or do we want it to fail
> >> >   the transaction?
> >> 
> >> The intent is for the QMP commands to operate exclusively on
> >> 'tags', and never consider "ID".
> >
> > I forgot that even HMP ignores "ID" now and works exclusively in terms
> > of tags since:
> >
> >
> >   commit 6ca080453ea403959ccde661030ca16264acc181
> >   Author: Daniel Henrique Barboza 
> >   Date:   Wed Nov 7 11:09:58 2018 -0200
> >
> >       block/snapshot.c: eliminate use of ID input in snapshot operations
> 
> Almost a year after I sent the memo I quoted.  It's an incompatible
> change, but nobody complained, and I'm glad we got this issue out of the
> way.

FWIW, I would have ignored any complaint about incompatible changes in
HMP. It's not supposed to be a stable API, but UI.
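
To illustrate the "UI, not API" point: this is roughly how a management
app reaches savevm through QMP today. It is a sketch only, with made-up
helper names and a made-up tag. human-monitor-command and its
"command-line" argument are the existing QMP interface; what the
returned text looks like is whatever HMP prints, which is the problem.

```python
def hmp_passthrough(command_line):
    """Wrap an HMP command line in the QMP human-monitor-command."""
    return {
        "execute": "human-monitor-command",
        "arguments": {"command-line": command_line},
    }


def hmp_output(reply):
    """Return the free-form text HMP printed for the command.

    The QMP reply only carries this string. Whether the HMP command
    succeeded has to be guessed from the text, which is not a stable
    format.
    """
    return reply["return"]


# "mytag" is a placeholder snapshot tag.
request = hmp_passthrough("savevm mytag")
```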

Kevin




Re: [PATCH v2 (BROKEN) 0/6] migration: bring improved savevm/loadvm/delvm to QMP

2020-08-27 Thread Daniel P . Berrangé
On Thu, Aug 27, 2020 at 01:04:43PM +0200, Markus Armbruster wrote:
> Daniel P. Berrangé  writes:
> 
> > On Wed, Aug 26, 2020 at 05:52:06PM +0200, Markus Armbruster wrote:
> > From the POV of practicality, making a design that unifies internal
> > and external snapshots is something I'm considering out of scope.
> > It increases the design time burden, as well as implementation burden.
> > On my side, improving internal snapshots is a "spare time" project,
> > not something I can justify spending weeks or months on.
> 
> I'm not demanding a solution that unifies internal and external
> snapshots.  I'm asking for a bit of serious thought on an interface that
> could compatibly evolve there.  Hours, not weeks or months.
> 
> > My goal is to implement something that is achievable in a short
> > amount of time that gets us out of the hole we've been in for 10
> > years. Minimal refactoring of the internal snapshot code aside
> > from fixing the critical limitations we have today around choice
> > of disks to snapshot.
> >
> > If someone later wants to come up with a grand unified design
> > for everything that's fine, we can deprecate the new QMP commands
> > I'm proposing now.
> 
> Failing at coming up with an interface that has a reasonable chance to
> be future-proof is okay.
> 
> Not even trying is not okay.

This was raised in my v1 posting:

  https://lists.gnu.org/archive/html/qemu-devel/2020-07/msg01346.html

but the conclusion was that it was a non-trivial amount of extra
implementation work


> Specifically, I'd like you to think about monolithic snapshot command
> (internal snapshots only by design) vs. transaction of individual
> snapshot commands (design is not restricted to internal snapshots, but
> we may want to accept implementation restrictions).
> 
> We already have transactionable individual storage snapshot commands.
> What's missing is a transactionable machine state snapshot command.

At a high level I consider what I've proposed as being higher level
syntax sugar vs a more generic future impl based on multiple commands
for snapshotting disk & state. I don't think I'd claim that it will
evolve to become the design you're suggesting here, as they are designs
from different levels in the stack. IOW, I think they would ultimately
just exist in parallel. I don't think that's a real problem from a
maint POV, as the large burden from the monolithic snapshot code is
not the HMP/QMP interface, but rather the guts of the internal
impl in the migration/savevm.c and block/snapshot.c files. That code
will exist for as long as the HMP commands exist, and adding a QMP
interface doesn't make that situation worse unless we were intending
to drop the existing HMP commands. If we did change our minds though,
we can just deprecate the QMP command at any time we like.


> One restriction I'd readily accept at this time is "the machine state
> snapshot must write to a QCOW2 that is also internally snapshot in the
> same transaction".
> 
> Now explain to me why this is impractical.

The issues were described by Kevin here:

  https://lists.gnu.org/archive/html/qemu-devel/2020-07/msg02057.html

Assuming the migration impl is actually possible to solve, there is
still the question of actually writing it. That's a non-trivial
amount of work someone has to find time for.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH v2 (BROKEN) 0/6] migration: bring improved savevm/loadvm/delvm to QMP

2020-08-27 Thread Markus Armbruster
Daniel P. Berrangé  writes:

> On Wed, Aug 26, 2020 at 07:28:24PM +0100, Daniel P. Berrangé wrote:
>> On Wed, Aug 26, 2020 at 05:52:06PM +0200, Markus Armbruster wrote:
>> > Open questions:
>> > 
>> > * Do we want the QMP command to delete existing snapshots with
>> >   conflicting tag / ID, like HMP savevm does?  Or do we want it to fail
>> >   the transaction?
>> 
>> The intent is for the QMP commands to operate exclusively on
>> 'tags', and never consider "ID".
>
> I forgot that even HMP ignores "ID" now and works exclusively in terms
> of tags since:
>
>
>   commit 6ca080453ea403959ccde661030ca16264acc181
>   Author: Daniel Henrique Barboza 
>   Date:   Wed Nov 7 11:09:58 2018 -0200
>
>       block/snapshot.c: eliminate use of ID input in snapshot operations

Almost a year after I sent the memo I quoted.  It's an incompatible
change, but nobody complained, and I'm glad we got this issue out of the
way.




Re: [PATCH v2 (BROKEN) 0/6] migration: bring improved savevm/loadvm/delvm to QMP

2020-08-27 Thread Markus Armbruster
Daniel P. Berrangé  writes:

> On Wed, Aug 26, 2020 at 05:52:06PM +0200, Markus Armbruster wrote:
>> Sorry for taking so long to reply.
>> 
>> Daniel P. Berrangé  writes:
>> 
>> > A followup to:
>> >
>> >  v1: https://lists.gnu.org/archive/html/qemu-devel/2020-07/msg00866.html
>> >
>> > When QMP was first introduced some 10+ years ago now, the snapshot
>> > related commands (savevm/loadvm/delvm) were not converted. This was
>> > primarily because their implementation causes blocking of the thread
>> > running the monitor commands. This was (and still is) considered
>> > undesirable behaviour both in HMP and QMP.
>> 
>> One of several reasons.
>> 
>> > In theory someone was supposed to fix this flaw at some point in the
>> > past 10 years and bring them into the QMP world. Sadly, thus far it
>> > hasn't happened as people always had more important things to work
>> > on. Enterprise apps were much more interested in external snapshots
>> > than internal snapshots as they have many more features.
>> 
>> Several attempts have been made to bring the functionality to QMP.
>> Sadly, they went nowhere.
>> 
>> I posted an analysis of the issues in reply to one of the more serious
>> attempts:
>> 
>> Message-ID: <87lh7l783q@blackfin.pond.sub.org>
>> https://lists.nongnu.org/archive/html/qemu-devel/2016-01/msg03593.html
>> 
>> I'd like to hear your take on it.  I append the relevant part for your
>> convenience.  Perhaps your code is already close to what I describe
>> there.  I'm interested in where it falls short.
>> 
>> > Meanwhile users still want to use internal snapshots as there is
>> > a certain simplicity in having everything self-contained in one
>> > image, even though it has limitations. Thus the apps end up
>> > executing savevm/loadvm/delvm via the "human-monitor-command"
>> > QMP command.
>> >
>> > IOW, the problematic blocking behaviour that was one of the reasons
>> > for not having savevm/loadvm/delvm in QMP is experienced by applications
>> > regardless. By not porting the commands to QMP due to one design flaw,
>> > we've forced apps and users to suffer from other design flaws of HMP
>> > (bad error reporting, no strong type checking of args, no introspection)
>> > for an additional 10 years. This feels rather sub-optimal :-(
>> >
>> > In practice users don't appear to care strongly about the fact that these
>> > commands block the VM while they run. I might have seen one bug report
>> > about it, but it certainly isn't something that comes up as a frequent
>> > topic except among us QEMU maintainers. Users do care about having
>> > access to the snapshot feature.
>> >
>> > Where I am seeing frequent complaints is wrt the use of OVMF combined
>> > with snapshots which has some serious pain points. This is getting worse
>> > as the push to ditch legacy BIOS in favour of UEFI gains momentum both
>> > across OS vendors and mgmt apps. Solving it requires new parameters to
>> > the commands, but doing this in HMP is super unappealing.
>> >
>> > After 10 years, I think it is time for us to be a little pragmatic about
>> > our handling of snapshot commands. My desire is that libvirt should never
>> > use "human-monitor-command" under any circumstances, because of the
>> > inherent flaws in HMP as a protocol for machine consumption.
>> >
>> > Thus in this series I'm proposing a fairly direct mapping of the existing
>> > HMP commands for savevm/loadvm/delvm into QMP as a first step. This does
>> > not solve the blocking thread problem, but it does put in place a
>> > design using the jobs framework which can facilitate solving it later.
>> > It does also solve the error reporting, type checking and introspection
>> > problems inherent to HMP. So we're winning on 3 out of the 4 problems,
>> > and pushing apps to a QMP design that will let us solve the last
>> > remaining problem.
>> >
>> > With a QMP variant, we can reasonably deal with the problems related to
>> > OVMF:
>> >
>> >  - The logic to pick which disk to store the vmstate in is not
>> >    satisfactory.
>> >
>> >    The first block driver state cannot be assumed to be the root disk
>> >    image, it might be OVMF varstore and we don't want to store vmstate
>> >    in there.
>> 
>> Yes, this is one of the issues.  Glad you're addressing it.
>> 
>> >  - The logic to decide which disks must be snapshotted is hardwired
>> >    to all disks which are writable
>> >
>> >    Again with OVMF there might be a writable varstore, but this can be
>> >    raw rather than qcow2 format, and thus unable to be snapshotted.
>> >    While users might wish to snapshot their varstore, in some/many/most
>> >    cases it is entirely unnecessary. Users are blocked from snapshotting
>> >    their VM though due to this varstore.
>> 
>> Another one.  Glad again.
>> 
>> > These are solved by adding two parameters to the commands. The first is
>> > a block device node name that identifies the image to store vmstate in,
>> > and the second is a list of node 

Re: [PATCH v2 (BROKEN) 0/6] migration: bring improved savevm/loadvm/delvm to QMP

2020-08-26 Thread Daniel P . Berrangé
On Wed, Aug 26, 2020 at 07:28:24PM +0100, Daniel P. Berrangé wrote:
> On Wed, Aug 26, 2020 at 05:52:06PM +0200, Markus Armbruster wrote:
> > Open questions:
> > 
> > * Do we want the QMP command to delete existing snapshots with
> >   conflicting tag / ID, like HMP savevm does?  Or do we want it to fail
> >   the transaction?
> 
> The intent is for the QMP commands to operate exclusively on
> 'tags', and never consider "ID".

I forgot that even HMP ignores "ID" now and works exclusively in terms
of tags since:


  commit 6ca080453ea403959ccde661030ca16264acc181
  Author: Daniel Henrique Barboza 
  Date:   Wed Nov 7 11:09:58 2018 -0200

block/snapshot.c: eliminate use of ID input in snapshot operations


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH v2 (BROKEN) 0/6] migration: bring improved savevm/loadvm/delvm to QMP

2020-08-26 Thread Daniel P . Berrangé
On Wed, Aug 26, 2020 at 05:52:06PM +0200, Markus Armbruster wrote:
> Sorry for taking so long to reply.
> 
> Daniel P. Berrangé  writes:
> 
> > A followup to:
> >
> >  v1: https://lists.gnu.org/archive/html/qemu-devel/2020-07/msg00866.html
> >
> > When QMP was first introduced some 10+ years ago now, the snapshot
> > related commands (savevm/loadvm/delvm) were not converted. This was
> > primarily because their implementation causes blocking of the thread
> > running the monitor commands. This was (and still is) considered
> > undesirable behaviour both in HMP and QMP.
> 
> One of several reasons.
> 
> > In theory someone was supposed to fix this flaw at some point in the
> > past 10 years and bring them into the QMP world. Sadly, thus far it
> > hasn't happened as people always had more important things to work
> > on. Enterprise apps were much more interested in external snapshots
> > than internal snapshots as they have many more features.
> 
> Several attempts have been made to bring the functionality to QMP.
> Sadly, they went nowhere.
> 
> I posted an analysis of the issues in reply to one of the more serious
> attempts:
> 
> Message-ID: <87lh7l783q@blackfin.pond.sub.org>
> https://lists.nongnu.org/archive/html/qemu-devel/2016-01/msg03593.html
> 
> I'd like to hear your take on it.  I append the relevant part for your
> convenience.  Perhaps your code is already close to what I describe
> there.  I'm interested in where it falls short.
> 
> > Meanwhile users still want to use internal snapshots as there is
> > a certain simplicity in having everything self-contained in one
> > image, even though it has limitations. Thus apps end up
> > executing savevm/loadvm/delvm via the "human-monitor-command"
> > QMP command.
> >
> > IOW, the problematic blocking behaviour that was one of the reasons
> > for not having savevm/loadvm/delvm in QMP is experienced by applications
> > regardless. By not porting the commands to QMP due to one design flaw,
> > we've forced apps and users to suffer from other design flaws of HMP
> > (bad error reporting, no strong type checking of args, no introspection) for
> > an additional 10 years. This feels rather sub-optimal :-(
> >
> > In practice users don't appear to care strongly about the fact that these
> > commands block the VM while they run. I might have seen one bug report
> > about it, but it certainly isn't something that comes up as a frequent
> > topic except among us QEMU maintainers. Users do care about having
> > access to the snapshot feature.
> >
> > Where I am seeing frequent complaints is wrt the use of OVMF combined
> > with snapshots which has some serious pain points. This is getting worse
> > as the push to ditch legacy BIOS in favour of UEFI gains momentum both
> > across OS vendors and mgmt apps. Solving it requires new parameters to
> > the commands, but doing this in HMP is super unappealing.
> >
> > After 10 years, I think it is time for us to be a little pragmatic about
> > our handling of snapshots commands. My desire is that libvirt should never
> > use "human-monitor-command" under any circumstances, because of the
> > inherent flaws in HMP as a protocol for machine consumption.
> >
> > Thus in this series I'm proposing a fairly direct mapping of the existing
> > HMP commands for savevm/loadvm/delvm into QMP as a first step. This does
> > not solve the blocking thread problem, but it does put in place a
> > design using the jobs framework which can facilitate solving it later.
> > It does also solve the error reporting, type checking and introspection
> > problems inherent to HMP. So we're winning on 3 out of the 4 problems,
> > and pushing apps to a QMP design that will let us solve the last
> > remaining problem.
> >
> > With a QMP variant, we reasonably deal with the problems related to OVMF:
> >
> >  - The logic to pick which disk to store the vmstate in is not
> >    satisfactory.
> >
> >    The first block driver state cannot be assumed to be the root disk
> >    image, it might be OVMF varstore and we don't want to store vmstate
> >    in there.
> 
> Yes, this is one of the issues.  Glad you're addressing it.
> 
> >  - The logic to decide which disks must be snapshotted is hardwired
> >    to all disks which are writable
> >
> >    Again with OVMF there might be a writable varstore, but this can be
> >    raw rather than qcow2 format, and thus unable to be snapshotted.
> >    While users might wish to snapshot their varstore, in some/many/most
> >    cases it is entirely unnecessary. Users are blocked from snapshotting
> >    their VM though due to this varstore.
> 
> Another one.  Glad again.
> 
> > These are solved by adding two parameters to the commands. The first is
> > a block device node name that identifies the image to store vmstate in,
> > and the second is a list of node names to include for the snapshots.
> > If the list of nodes isn't given, it falls back to the historical
> > behaviour of using all 

Re: [PATCH v2 (BROKEN) 0/6] migration: bring improved savevm/loadvm/delvm to QMP

2020-08-26 Thread Markus Armbruster
Sorry for taking so long to reply.

Daniel P. Berrangé  writes:

> A followup to:
>
>  v1: https://lists.gnu.org/archive/html/qemu-devel/2020-07/msg00866.html
>
> When QMP was first introduced some 10+ years ago now, the snapshot
> related commands (savevm/loadvm/delvm) were not converted. This was
> primarily because their implementation causes blocking of the thread
> running the monitor commands. This was (and still is) considered
> undesirable behaviour both in HMP and QMP.

One of several reasons.

> In theory someone was supposed to fix this flaw at some point in the
> past 10 years and bring them into the QMP world. Sadly, thus far it
> hasn't happened as people always had more important things to work
> on. Enterprise apps were much more interested in external snapshots
> than internal snapshots as they have many more features.

Several attempts have been made to bring the functionality to QMP.
Sadly, they went nowhere.

I posted an analysis of the issues in reply to one of the more serious
attempts:

Message-ID: <87lh7l783q@blackfin.pond.sub.org>
https://lists.nongnu.org/archive/html/qemu-devel/2016-01/msg03593.html

I'd like to hear your take on it.  I append the relevant part for your
convenience.  Perhaps your code is already close to what I describe
there.  I'm interested in where it falls short.

> Meanwhile users still want to use internal snapshots as there is
> a certain simplicity in having everything self-contained in one
> image, even though it has limitations. Thus apps end up
> executing savevm/loadvm/delvm via the "human-monitor-command"
> QMP command.
>
> IOW, the problematic blocking behaviour that was one of the reasons
> for not having savevm/loadvm/delvm in QMP is experienced by applications
> regardless. By not porting the commands to QMP due to one design flaw,
> we've forced apps and users to suffer from other design flaws of HMP
> (bad error reporting, no strong type checking of args, no introspection) for
> an additional 10 years. This feels rather sub-optimal :-(
>
> In practice users don't appear to care strongly about the fact that these
> commands block the VM while they run. I might have seen one bug report
> about it, but it certainly isn't something that comes up as a frequent
> topic except among us QEMU maintainers. Users do care about having
> access to the snapshot feature.
>
> Where I am seeing frequent complaints is wrt the use of OVMF combined
> with snapshots which has some serious pain points. This is getting worse
> as the push to ditch legacy BIOS in favour of UEFI gains momentum both
> across OS vendors and mgmt apps. Solving it requires new parameters to
> the commands, but doing this in HMP is super unappealing.
>
> After 10 years, I think it is time for us to be a little pragmatic about
> our handling of snapshots commands. My desire is that libvirt should never
> use "human-monitor-command" under any circumstances, because of the
> inherent flaws in HMP as a protocol for machine consumption.
>
> Thus in this series I'm proposing a fairly direct mapping of the existing
> HMP commands for savevm/loadvm/delvm into QMP as a first step. This does
> not solve the blocking thread problem, but it does put in place a
> design using the jobs framework which can facilitate solving it later.
> It does also solve the error reporting, type checking and introspection
> problems inherent to HMP. So we're winning on 3 out of the 4 problems,
> and pushing apps to a QMP design that will let us solve the last
> remaining problem.
>
> With a QMP variant, we reasonably deal with the problems related to OVMF:
>
>  - The logic to pick which disk to store the vmstate in is not
>    satisfactory.
>
>    The first block driver state cannot be assumed to be the root disk
>    image, it might be OVMF varstore and we don't want to store vmstate
>    in there.

Yes, this is one of the issues.  Glad you're addressing it.

>  - The logic to decide which disks must be snapshotted is hardwired
>    to all disks which are writable
>
>    Again with OVMF there might be a writable varstore, but this can be
>    raw rather than qcow2 format, and thus unable to be snapshotted.
>    While users might wish to snapshot their varstore, in some/many/most
>    cases it is entirely unnecessary. Users are blocked from snapshotting
>    their VM though due to this varstore.

Another one.  Glad again.

> These are solved by adding two parameters to the commands. The first is
> a block device node name that identifies the image to store vmstate in,
> and the second is a list of node names to include for the snapshots.
> If the list of nodes isn't given, it falls back to the historical
> behaviour of using all disks matching some undocumented criteria.
>
> In the block code I've only dealt with node names for block devices, as
> IIUC, this is all that libvirt should need in the -blockdev world it now
> lives in. IOW, I've made no attempt to cope with people wanting to use
> 

Re: [PATCH v2 (BROKEN) 0/6] migration: bring improved savevm/loadvm/delvm to QMP

2020-08-21 Thread Daniel P . Berrangé
On Mon, Jul 27, 2020 at 04:08:37PM +0100, Daniel P. Berrangé wrote:
> A followup to:
> 
>  v1: https://lists.gnu.org/archive/html/qemu-devel/2020-07/msg00866.html

snip

> HELP NEEDED:  this series starts to implement the approach that Kevin
> suggested wrt use of generic jobs.
> 
> When I try to actually run the code though it crashes:
> 
> ERROR:/home/berrange/src/virt/qemu/softmmu/cpus.c:1788:qemu_mutex_unlock_iothread: assertion failed: (qemu_mutex_iothread_locked())
> Bail out! ERROR:/home/berrange/src/virt/qemu/softmmu/cpus.c:1788:qemu_mutex_unlock_iothread: assertion failed: (qemu_mutex_iothread_locked())
> 
> Obviously I've missed something related to locking, but I've no idea
> what, so I'm sending this v2 simply as a way to solicit suggestions
> for what I've messed up.

What I've found is

qmp_snapshot_save() is the QMP handler and runs in the main thread, so the
iothread lock is held.

This calls job_create(), which ends up invoking snapshot_save_job_run
in a background coroutine, but IIUC the iothread lock is still held when
the coroutine starts.

This then invokes save_snapshot(), which invokes qemu_savevm_state().

This calls qemu_mutex_unlock_iothread() and then
qemu_savevm_state_setup().

Eventually something in the qcow2 code triggers qemu_coroutine_yield(),
so control goes back to the main event loop thread.

The problem is that the iothread lock has been released, but the main
event loop thread is still expecting it to be held.

I've no idea how to go about solving this problem.


The save_snapshot() code, as written today, needs to run serialized with
everything else, but because the job framework has used a coroutine to
run it, we can switch back to the main event thread at any time.

I don't know how to force save_snapshot() to be serialized when using
the generic job framework.
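
The sequence above can be modelled in a few lines. This is only a toy
sketch, not QEMU code: BigLock, save_snapshot() and main_loop_step() are
hypothetical stand-ins, a Python generator plays the role of the job
coroutine, and the re-lock after the yield is an assumption of the model.

```python
class BigLock:
    """Toy stand-in for the iothread lock: tracks only held / not held."""

    def __init__(self):
        self.held = False

    def lock(self):
        assert not self.held, "lock taken twice"
        self.held = True

    def unlock(self):
        assert self.held, "unlock without holding the lock"
        self.held = False


def save_snapshot(lock):
    """Generator standing in for the job coroutine."""
    lock.unlock()      # models qemu_mutex_unlock_iothread()
    yield "qcow2 I/O"  # models qemu_coroutine_yield() in the block layer
    lock.lock()        # in this model the lock is only re-taken after the I/O


def main_loop_step(lock, co):
    """Run the coroutine until it yields; report whether the lock is still held."""
    next(co)
    return lock.held


lock = BigLock()
lock.lock()                      # the QMP handler runs with the lock held
co = save_snapshot(lock)
print(main_loop_step(lock, co))  # False
```

The main loop gets control back with the lock dropped, which is the state
the failing assertion complains about.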


> 
> You can reproduce with I/O tests using "check -qcow2 310"  and it
> gave a stack:
> 
>   Thread 5 (Thread 0x7fffe6e4c700 (LWP 3399011)):
>   #0  futex_wait_cancelable (private=0, expected=0, 
> futex_word=0x566a9fd8) at ../sysdeps/nptl/futex-internal.h:183
>   #1  __pthread_cond_wait_common (abstime=0x0, clockid=0, 
> mutex=0x56227160 , cond=0x566a9fb0) at 
> pthread_cond_wait.c:508
>   #2  __pthread_cond_wait (cond=cond@entry=0x566a9fb0, 
> mutex=mutex@entry=0x56227160 ) at 
> pthread_cond_wait.c:638
>   #3  0x55ceb6cb in qemu_cond_wait_impl (cond=0x566a9fb0, 
> mutex=0x56227160 , file=0x55d44198 
> "/home/berrange/src/virt/qemu/softmmu/cpus.c", line=1145) at 
> /home/berrange/src/virt/qemu/util/qemu-thread-posix.c:174
>   #4  0x55931974 in qemu_wait_io_event (cpu=cpu@entry=0x56685050) 
> at /home/berrange/src/virt/qemu/softmmu/cpus.c:1145
>   #5  0x55933a89 in qemu_dummy_cpu_thread_fn 
> (arg=arg@entry=0x56685050) at 
> /home/berrange/src/virt/qemu/softmmu/cpus.c:1241
>   #6  0x55ceb049 in qemu_thread_start (args=0x7fffe6e476f0) at 
> /home/berrange/src/virt/qemu/util/qemu-thread-posix.c:521
>   #7  0x74fdc432 in start_thread (arg=) at 
> pthread_create.c:477
>   #8  0x74f0a9d3 in clone () at 
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
>   
>   Thread 4 (Thread 0x7fffe764d700 (LWP 3399010)):
>   #0  0x74effb6f in __GI___poll (fds=0x7fffdc006ec0, nfds=3, 
> timeout=-1) at ../sysdeps/unix/sysv/linux/poll.c:29
>   #1  0x77c1aace in g_main_context_iterate.constprop () at 
> /lib64/libglib-2.0.so.0
>   #2  0x77c1ae53 in g_main_loop_run () at /lib64/libglib-2.0.so.0
>   #3  0x559a9d81 in iothread_run (opaque=opaque@entry=0x5632f200) 
> at /home/berrange/src/virt/qemu/iothread.c:82
>   #4  0x55ceb049 in qemu_thread_start (args=0x7fffe76486f0) at 
> /home/berrange/src/virt/qemu/util/qemu-thread-posix.c:521
>   #5  0x74fdc432 in start_thread (arg=) at 
> pthread_create.c:477
>   #6  0x74f0a9d3 in clone () at 
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
>   
>   Thread 3 (Thread 0x7fffe7e4e700 (LWP 3399009)):
>   #0  0x74fe5c58 in futex_abstimed_wait_cancelable (private=0, 
> abstime=0x7fffe7e49650, clockid=0, expected=0, futex_word=0x562bf888) at 
> ../sysdeps/nptl/futex-internal.h:320
>   #1  do_futex_wait (sem=sem@entry=0x562bf888, 
> abstime=abstime@entry=0x7fffe7e49650, clockid=0) at sem_waitcommon.c:112
>   #2  0x74fe5d83 in __new_sem_wait_slow 
> (sem=sem@entry=0x562bf888, abstime=abstime@entry=0x7fffe7e49650, 
> clockid=0) at sem_waitcommon.c:184
>   #3  0x74fe5e12 in sem_timedwait (sem=sem@entry=0x562bf888, 
> abstime=abstime@entry=0x7fffe7e49650) at sem_timedwait.c:40
>   #4  0x55cebbdf in qemu_sem_timedwait (sem=sem@entry=0x562bf888, 
> ms=ms@entry=1) at 
> /home/berrange/src/virt/qemu/util/qemu-thread-posix.c:307
>   #5  0x55d03fa4 in worker_thread 
> (opaque=opaque@entry=0x562bf810) at 
> /home/berrange/src/virt/qemu/util/thread-pool.c:91
>   #6  0x55ceb049 in 

[PATCH v2 (BROKEN) 0/6] migration: bring improved savevm/loadvm/delvm to QMP

2020-07-27 Thread Daniel P . Berrangé
A followup to:

 v1: https://lists.gnu.org/archive/html/qemu-devel/2020-07/msg00866.html

When QMP was first introduced some 10+ years ago now, the snapshot
related commands (savevm/loadvm/delvm) were not converted. This was
primarily because their implementation causes blocking of the thread
running the monitor commands. This was (and still is) considered
undesirable behaviour both in HMP and QMP.

In theory someone was supposed to fix this flaw at some point in the
past 10 years and bring them into the QMP world. Sadly, thus far it
hasn't happened as people always had more important things to work
on. Enterprise apps were much more interested in external snapshots
than internal snapshots as they have many more features.

Meanwhile users still want to use internal snapshots as there is
a certain simplicity in having everything self-contained in one
image, even though it has limitations. Thus apps end up
executing savevm/loadvm/delvm via the "human-monitor-command"
QMP command.

IOW, the problematic blocking behaviour that was one of the reasons
for not having savevm/loadvm/delvm in QMP is experienced by applications
regardless. By not porting the commands to QMP due to one design flaw,
we've forced apps and users to suffer from other design flaws of HMP
(bad error reporting, no strong type checking of args, no introspection) for
an additional 10 years. This feels rather sub-optimal :-(
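
To illustrate the error reporting point, here is a minimal sketch of what
an application has to do when tunnelling savevm through QMP.
"human-monitor-command" and its "command-line" argument are the real QMP
interface; the helper names and the "Error" prefix check are assumptions
made up for this sketch, since HMP output is free-form text with no
stable format.

```python
import json


def hmp_passthrough(command_line):
    """Build the QMP message that tunnels one HMP command line."""
    return {
        "execute": "human-monitor-command",
        "arguments": {"command-line": command_line},
    }


def looks_like_failure(reply):
    """Guess whether the tunnelled HMP command failed.

    The wrapper returns the HMP output as a plain string in "return", so
    the only hint of a failure is the human-readable text. The prefix
    check is a heuristic for this sketch, not a documented contract.
    """
    text = reply.get("return", "")
    return text.strip().lower().startswith("error")


msg = hmp_passthrough("savevm snap1")
print(json.dumps(msg))
```

A native QMP command would instead report failures as a structured error
reply, so no string matching of this kind would be needed.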

In practice users don't appear to care strongly about the fact that these
commands block the VM while they run. I might have seen one bug report
about it, but it certainly isn't something that comes up as a frequent
topic except among us QEMU maintainers. Users do care about having
access to the snapshot feature.

Where I am seeing frequent complaints is wrt the use of OVMF combined
with snapshots which has some serious pain points. This is getting worse
as the push to ditch legacy BIOS in favour of UEFI gains momentum both
across OS vendors and mgmt apps. Solving it requires new parameters to
the commands, but doing this in HMP is super unappealing.

After 10 years, I think it is time for us to be a little pragmatic about
our handling of snapshots commands. My desire is that libvirt should never
use "human-monitor-command" under any circumstances, because of the
inherent flaws in HMP as a protocol for machine consumption.

Thus in this series I'm proposing a fairly direct mapping of the existing
HMP commands for savevm/loadvm/delvm into QMP as a first step. This does
not solve the blocking thread problem, but it does put in place a
design using the jobs framework which can facilitate solving it later.
It does also solve the error reporting, type checking and introspection
problems inherent to HMP. So we're winning on 3 out of the 4 problems,
and pushing apps to a QMP design that will let us solve the last
remaining problem.

With a QMP variant, we reasonably deal with the problems related to OVMF:

 - The logic to pick which disk to store the vmstate in is not
   satisfactory.

   The first block driver state cannot be assumed to be the root disk
   image, it might be OVMF varstore and we don't want to store vmstate
   in there.

 - The logic to decide which disks must be snapshotted is hardwired
   to all disks which are writable

   Again with OVMF there might be a writable varstore, but this can be
   raw rather than qcow2 format, and thus unable to be snapshotted.
   While users might wish to snapshot their varstore, in some/many/most
   cases it is entirely unnecessary. Users are blocked from snapshotting
   their VM though due to this varstore.
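
The second problem can be shown with a small model. This is a
hypothetical sketch, not the QEMU logic: Node and pick_devices() are
invented names, the "all writable disks" rule is a simplification of the
real criteria, and qcow2 is treated as the only format able to hold an
internal snapshot.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    fmt: str        # image format, e.g. "qcow2" or "raw"
    writable: bool


def pick_devices(nodes, devices=None):
    """Return the names of the nodes to snapshot.

    With devices=None the simplified historical rule applies: every
    writable node is included. Raises ValueError if a chosen node cannot
    hold a snapshot.
    """
    if devices is None:
        chosen = [n for n in nodes if n.writable]
    else:
        wanted = set(devices)
        chosen = [n for n in nodes if n.name in wanted]
    for n in chosen:
        if n.fmt != "qcow2":
            raise ValueError(f"{n.name}: format {n.fmt} cannot hold a snapshot")
    return [n.name for n in chosen]


nodes = [Node("disk0", "qcow2", True), Node("varstore", "raw", True)]
print(pick_devices(nodes, devices=["disk0"]))
```

With the hardwired rule, pick_devices(nodes) raises because of the raw
varstore; naming the devices explicitly lets the snapshot go ahead.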

These are solved by adding two parameters to the commands. The first is
a block device node name that identifies the image to store vmstate in,
and the second is a list of node names to include for the snapshots.
If the list of nodes isn't given, it falls back to the historical
behaviour of using all disks matching some undocumented criteria.
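
A minimal sketch of what such a request could look like on the wire. The
command name "snapshot-save" and the argument names "job-id", "tag",
"vmstate" and "devices" are assumptions for illustration; the QAPI schema
in the patches is the authority on the actual spelling. The node names
are made up.

```python
import json


def snapshot_save(job_id, tag, vmstate, devices=None):
    """Build a hypothetical snapshot-save request.

    vmstate: node name of the image that stores the VM state.
    devices: node names to snapshot; left out of the message when None,
             which corresponds to the fallback to historical behaviour.
    """
    args = {"job-id": job_id, "tag": tag, "vmstate": vmstate}
    if devices is not None:
        args["devices"] = list(devices)
    return {"execute": "snapshot-save", "arguments": args}


# Store vmstate in the root disk and leave the OVMF varstore out.
req = snapshot_save("snapsave0", "snap1", "disk0", devices=["disk0"])
print(json.dumps(req))
```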

In the block code I've only dealt with node names for block devices, as
IIUC, this is all that libvirt should need in the -blockdev world it now
lives in. IOW, I've made no attempt to cope with people wanting to use
these QMP commands in combination with -drive args.

I've done some minimal work in libvirt to start to make use of the new
commands to validate their functionality, but this isn't finished yet.

My ultimate goal is to make the GNOME Boxes maintainer happy again by
having internal snapshots work with OVMF:

  https://gitlab.gnome.org/GNOME/gnome-boxes/-/commit/c486da262f6566326fbcb5ef45c5f64048f16a6e

HELP NEEDED:  this series starts to implement the approach that Kevin
suggested wrt use of generic jobs.

When I try to actually run the code though it crashes:

ERROR:/home/berrange/src/virt/qemu/softmmu/cpus.c:1788:qemu_mutex_unlock_iothread: assertion failed: (qemu_mutex_iothread_locked())
Bail out! ERROR:/home/berrange/src/virt/qemu/softmmu/cpus.c:1788:qemu_mutex_unlock_iothread: assertion failed: