Re: [libvirt] [Qemu-devel] [PATCH v2 0/3] HMP/snapshot changes - do not use ID anymore

2018-10-11 Thread Peter Krempa
On Thu, Oct 11, 2018 at 13:06:14 +0100, Dr. David Alan Gilbert wrote:
> * Peter Krempa (pkre...@redhat.com) wrote:
> > On Tue, Oct 09, 2018 at 19:34:43 +0200, Markus Armbruster wrote:
> > > Cc: libvir-list for review of the compatibility argument below.
> > > 
> > > Daniel Henrique Barboza  writes:
> > > 
> > > > Hey David,
> > > >
> > > > On 9/21/18 9:29 AM, Dr. David Alan Gilbert wrote:
> > > >> * Daniel Henrique Barboza (danielhb...@gmail.com) wrote:
> > > >>> changes in v2:
> > > >>> - removed the "RFC" marker;
> > > >>> - added a new patch (patch 2) that removes
> > > >>> bdrv_snapshot_delete_by_id_or_name from the code;
> > > >>> - made changes in patch 1 as suggested by Murilo;
> > > >>> - previous patch set link:
> > > >>> https://lists.gnu.org/archive/html/qemu-devel/2018-08/msg04658.html
> > > >>>
> > > >>>
> > > >>> It is not uncommon to see bugs being opened by testers that attempt to
> > > >>> create VM snapshots using HMP. It turns out that "0" and "1" are quite
> > > >>> common snapshot names and they trigger a lot of bugs. I gave an 
> > > >>> example
> > > >>> in the commit message of patch 1, but to sum up here: QEMU treats the
> > > >>> input of savevm/loadvm/delvm sometimes as 'ID', sometimes as 'name'. 
> > > >>> It
> > > >>> is documented as such, but this can lead to strange situations.
> > > >>>
> > > >>> Given that it is strange for an API to consider a parameter to be 2 
> > > >>> fields
> > > >>> at the same time, and inadvently treating them as one or the other, 
> > > >>> and
> > > >>> that removing the ID field is too drastic, my idea here is to keep the
> > > >>> ID field for internal control, but do not let the user set it.
> > > >>>
> > > >>> I guess there's room for discussion about considering this change an 
> > > >>> API
> > > >>> change or not. It doesn't affect users of HMP and it doesn't affect 
> > > >>> Libvirt,
> > > >>> but simplifying the meaning of the parameters of savevm/loadvm/delvm.
> > > >> Can you clarify a couple of things:
> > > >>a) What is it about libvirt's use that means it's OK ? Does it never
> > > >> use numeric tags?
> > > >
> > > > I am glad you asked because my understanding in how Libvirt was dealing
> > > > with snapshots was wrong, and I just looked into it further to answer 
> > > > your
> > > > question. Luckily, this series fixes the situation for Libvirt as well.
> > > >
> > > > I was misled by the fact that Libvirt does not show the same symptoms
> > > > we see in
> > > > QEMU of this problem, but the bug is there. Here's a quick test with
> > > > Libvirt with
> > > > "0" and "1" as snapshot names, considering a VM named with no snapshots,
> > > > using QEMU 2.12 and Libvirt 4.0.0:
> > > >
> > > > - create the "0" snapshot:
> > > >
> > > > $ sudo virsh snapshot-create-as --name 0 dhb
> > > > Domain snapshot 0 created
> > > >
> > > > $ sudo virsh snapshot-list dhb
> > > > Name Creation Time State
> > > > 
> > > > 0 2018-09-24 15:47:56 -0400 running
> > > >
> > > > $ sudo virsh qemu-monitor-command dhb --hmp info snapshots
> > > > List of snapshots present on all disks:
> > > > ID TAG VM SIZE DATE VM CLOCK
> > > > --    0  405M 2018-09-24 15:47:56 00:04:20.867
> > > >
> > > >
> > > > - created the "1" snapshot. Here, Libvirt shows both snapshots with
> > > > snapshot-list,
> > > > but the snapshot was erased inside QEMU:
> > > >
> > > > $ sudo virsh snapshot-create-as --name 1 dhb
> > > > Domain snapshot 1 created
> > > > $
> > > > $ sudo virsh snapshot-list dhb
> > > > Name Creation Time State
> > > > 
> > > > 0 2018-09-24 15:47:56 -0400 running
> > > > 1 2018-09-24 15:50:09 -0400 running
> > > >
> > > > $ sudo virsh qemu-monitor-command dhb --hmp info snapshots
> > > > List of snapshots present on all disks:
> > > > ID TAG VM SIZE DATE VM CLOCK
> > > > --    1  404M 2018-09-24 15:50:10 00:05:36.226
> > > >
> > > >
> > > > This is where I stopped checking out Libvirt at first, concluding
> > > > wrongly that it
> > > > was immune to the bug.
> > > >
> > > > Libvirt does not throw an error when trying to apply snapshot 0. In
> > > > fact, it acts
> > > > like everything went fine:
> > > >
> > > > $ sudo virsh snapshot-revert dhb 0
> > > >
> > > > $ echo $?
> > > > 0
> > > 
> > > Is that a libvirt bug?
> > 
> > Probably yes. The error handling from HMP sucks and can't really be
> > fixed in all cases. This is for the handler which calls "loadvm":
> > 
> > if (strstr(reply, "No block device supports snapshots") != NULL) {
> > virReportError(VIR_ERR_OPERATION_INVALID, "%s",
> >_("this domain does not have a device to load 
> > snapshots"));
> > goto cleanup;
> > } else if (strstr(reply, "Could not find snapshot") != NULL) {
> > virReportError(VIR_ERR_OPERATION_INVALID,
> >_("the snapshot '%s' does not exist, and was not 
> > loaded"),
> >  

Re: [libvirt] [Qemu-devel] [PATCH v2 0/3] HMP/snapshot changes - do not use ID anymore

2018-10-11 Thread Dr. David Alan Gilbert
* Peter Krempa (pkre...@redhat.com) wrote:
> On Tue, Oct 09, 2018 at 19:34:43 +0200, Markus Armbruster wrote:
> > Cc: libvir-list for review of the compatibility argument below.
> > 
> > Daniel Henrique Barboza  writes:
> > 
> > > Hey David,
> > >
> > > On 9/21/18 9:29 AM, Dr. David Alan Gilbert wrote:
> > >> * Daniel Henrique Barboza (danielhb...@gmail.com) wrote:
> > >>> changes in v2:
> > >>> - removed the "RFC" marker;
> > >>> - added a new patch (patch 2) that removes
> > >>> bdrv_snapshot_delete_by_id_or_name from the code;
> > >>> - made changes in patch 1 as suggested by Murilo;
> > >>> - previous patch set link:
> > >>> https://lists.gnu.org/archive/html/qemu-devel/2018-08/msg04658.html
> > >>>
> > >>>
> > >>> It is not uncommon to see bugs being opened by testers that attempt to
> > >>> create VM snapshots using HMP. It turns out that "0" and "1" are quite
> > >>> common snapshot names and they trigger a lot of bugs. I gave an example
> > >>> in the commit message of patch 1, but to sum up here: QEMU treats the
> > >>> input of savevm/loadvm/delvm sometimes as 'ID', sometimes as 'name'. It
> > >>> is documented as such, but this can lead to strange situations.
> > >>>
> > >>> Given that it is strange for an API to consider a parameter to be 2 
> > >>> fields
> > >>> at the same time, and inadvently treating them as one or the other, and
> > >>> that removing the ID field is too drastic, my idea here is to keep the
> > >>> ID field for internal control, but do not let the user set it.
> > >>>
> > >>> I guess there's room for discussion about considering this change an API
> > >>> change or not. It doesn't affect users of HMP and it doesn't affect 
> > >>> Libvirt,
> > >>> but simplifying the meaning of the parameters of savevm/loadvm/delvm.
> > >> Can you clarify a couple of things:
> > >>a) What is it about libvirt's use that means it's OK ? Does it never
> > >> use numeric tags?
> > >
> > > I am glad you asked because my understanding in how Libvirt was dealing
> > > with snapshots was wrong, and I just looked into it further to answer your
> > > question. Luckily, this series fixes the situation for Libvirt as well.
> > >
> > > I was misled by the fact that Libvirt does not show the same symptoms
> > > we see in
> > > QEMU of this problem, but the bug is there. Here's a quick test with
> > > Libvirt with
> > > "0" and "1" as snapshot names, considering a VM named with no snapshots,
> > > using QEMU 2.12 and Libvirt 4.0.0:
> > >
> > > - create the "0" snapshot:
> > >
> > > $ sudo virsh snapshot-create-as --name 0 dhb
> > > Domain snapshot 0 created
> > >
> > > $ sudo virsh snapshot-list dhb
> > > Name Creation Time State
> > > 
> > > 0 2018-09-24 15:47:56 -0400 running
> > >
> > > $ sudo virsh qemu-monitor-command dhb --hmp info snapshots
> > > List of snapshots present on all disks:
> > > ID TAG VM SIZE DATE VM CLOCK
> > > --    0  405M 2018-09-24 15:47:56 00:04:20.867
> > >
> > >
> > > - created the "1" snapshot. Here, Libvirt shows both snapshots with
> > > snapshot-list,
> > > but the snapshot was erased inside QEMU:
> > >
> > > $ sudo virsh snapshot-create-as --name 1 dhb
> > > Domain snapshot 1 created
> > > $
> > > $ sudo virsh snapshot-list dhb
> > > Name Creation Time State
> > > 
> > > 0 2018-09-24 15:47:56 -0400 running
> > > 1 2018-09-24 15:50:09 -0400 running
> > >
> > > $ sudo virsh qemu-monitor-command dhb --hmp info snapshots
> > > List of snapshots present on all disks:
> > > ID TAG VM SIZE DATE VM CLOCK
> > > --    1  404M 2018-09-24 15:50:10 00:05:36.226
> > >
> > >
> > > This is where I stopped checking out Libvirt at first, concluding
> > > wrongly that it
> > > was immune to the bug.
> > >
> > > Libvirt does not throw an error when trying to apply snapshot 0. In
> > > fact, it acts
> > > like everything went fine:
> > >
> > > $ sudo virsh snapshot-revert dhb 0
> > >
> > > $ echo $?
> > > 0
> > 
> > Is that a libvirt bug?
> 
> Probably yes. The error handling from HMP sucks and can't really be
> fixed in all cases. This is for the handler which calls "loadvm":
> 
> if (strstr(reply, "No block device supports snapshots") != NULL) {
> virReportError(VIR_ERR_OPERATION_INVALID, "%s",
>_("this domain does not have a device to load 
> snapshots"));
> goto cleanup;
> } else if (strstr(reply, "Could not find snapshot") != NULL) {
> virReportError(VIR_ERR_OPERATION_INVALID,
>_("the snapshot '%s' does not exist, and was not 
> loaded"),
>name);
> goto cleanup;
> } else if (strstr(reply, "Snapshots not supported on device") != NULL) {
> virReportError(VIR_ERR_OPERATION_INVALID, "%s", reply);
> goto cleanup;
> } else if (strstr(reply, "Could not open VM state file") != NULL) {
> virReportError(VIR_ERR_OPERATION_FAILED, "%s"

Re: [libvirt] [Qemu-devel] [PATCH v2 0/3] HMP/snapshot changes - do not use ID anymore

2018-10-10 Thread Peter Krempa
On Tue, Oct 09, 2018 at 19:34:43 +0200, Markus Armbruster wrote:
> Cc: libvir-list for review of the compatibility argument below.
> 
> Daniel Henrique Barboza  writes:
> 
> > Hey David,
> >
> > On 9/21/18 9:29 AM, Dr. David Alan Gilbert wrote:
> >> * Daniel Henrique Barboza (danielhb...@gmail.com) wrote:
> >>> changes in v2:
> >>> - removed the "RFC" marker;
> >>> - added a new patch (patch 2) that removes
> >>> bdrv_snapshot_delete_by_id_or_name from the code;
> >>> - made changes in patch 1 as suggested by Murilo;
> >>> - previous patch set link:
> >>> https://lists.gnu.org/archive/html/qemu-devel/2018-08/msg04658.html
> >>>
> >>>
> >>> It is not uncommon to see bugs being opened by testers that attempt to
> >>> create VM snapshots using HMP. It turns out that "0" and "1" are quite
> >>> common snapshot names and they trigger a lot of bugs. I gave an example
> >>> in the commit message of patch 1, but to sum up here: QEMU treats the
> >>> input of savevm/loadvm/delvm sometimes as 'ID', sometimes as 'name'. It
> >>> is documented as such, but this can lead to strange situations.
> >>>
> >>> Given that it is strange for an API to consider a parameter to be 2 fields
> >>> at the same time, and inadvently treating them as one or the other, and
> >>> that removing the ID field is too drastic, my idea here is to keep the
> >>> ID field for internal control, but do not let the user set it.
> >>>
> >>> I guess there's room for discussion about considering this change an API
> >>> change or not. It doesn't affect users of HMP and it doesn't affect 
> >>> Libvirt,
> >>> but simplifying the meaning of the parameters of savevm/loadvm/delvm.
> >> Can you clarify a couple of things:
> >>a) What is it about libvirt's use that means it's OK ? Does it never
> >> use numeric tags?
> >
> > I am glad you asked because my understanding in how Libvirt was dealing
> > with snapshots was wrong, and I just looked into it further to answer your
> > question. Luckily, this series fixes the situation for Libvirt as well.
> >
> > I was misled by the fact that Libvirt does not show the same symptoms
> > we see in
> > QEMU of this problem, but the bug is there. Here's a quick test with
> > Libvirt with
> > "0" and "1" as snapshot names, considering a VM named with no snapshots,
> > using QEMU 2.12 and Libvirt 4.0.0:
> >
> > - create the "0" snapshot:
> >
> > $ sudo virsh snapshot-create-as --name 0 dhb
> > Domain snapshot 0 created
> >
> > $ sudo virsh snapshot-list dhb
> > Name Creation Time State
> > 
> > 0 2018-09-24 15:47:56 -0400 running
> >
> > $ sudo virsh qemu-monitor-command dhb --hmp info snapshots
> > List of snapshots present on all disks:
> > ID TAG VM SIZE DATE VM CLOCK
> > --    0  405M 2018-09-24 15:47:56 00:04:20.867
> >
> >
> > - created the "1" snapshot. Here, Libvirt shows both snapshots with
> > snapshot-list,
> > but the snapshot was erased inside QEMU:
> >
> > $ sudo virsh snapshot-create-as --name 1 dhb
> > Domain snapshot 1 created
> > $
> > $ sudo virsh snapshot-list dhb
> > Name Creation Time State
> > 
> > 0 2018-09-24 15:47:56 -0400 running
> > 1 2018-09-24 15:50:09 -0400 running
> >
> > $ sudo virsh qemu-monitor-command dhb --hmp info snapshots
> > List of snapshots present on all disks:
> > ID TAG VM SIZE DATE VM CLOCK
> > --    1  404M 2018-09-24 15:50:10 00:05:36.226
> >
> >
> > This is where I stopped checking out Libvirt at first, concluding
> > wrongly that it
> > was immune to the bug.
> >
> > Libvirt does not throw an error when trying to apply snapshot 0. In
> > fact, it acts
> > like everything went fine:
> >
> > $ sudo virsh snapshot-revert dhb 0
> >
> > $ echo $?
> > 0
> 
> Is that a libvirt bug?

Probably yes. The error handling from HMP sucks and can't really be
fixed in all cases. This is for the handler which calls "loadvm":

if (strstr(reply, "No block device supports snapshots") != NULL) {
virReportError(VIR_ERR_OPERATION_INVALID, "%s",
   _("this domain does not have a device to load 
snapshots"));
goto cleanup;
} else if (strstr(reply, "Could not find snapshot") != NULL) {
virReportError(VIR_ERR_OPERATION_INVALID,
   _("the snapshot '%s' does not exist, and was not 
loaded"),
   name);
goto cleanup;
} else if (strstr(reply, "Snapshots not supported on device") != NULL) {
virReportError(VIR_ERR_OPERATION_INVALID, "%s", reply);
goto cleanup;
} else if (strstr(reply, "Could not open VM state file") != NULL) {
virReportError(VIR_ERR_OPERATION_FAILED, "%s", reply);
goto cleanup;
} else if (strstr(reply, "Error") != NULL
 && strstr(reply, "while loading VM state") != NULL) {
virReportError(VIR_ERR_OPERATION_FAILED, "%s", reply);
goto cleanup;
} else if (strstr(reply, "Error") != NULL

Re: [libvirt] [Qemu-devel] [PATCH v2 0/3] HMP/snapshot changes - do not use ID anymore

2018-10-09 Thread Markus Armbruster
Cc: libvir-list for review of the compatibility argument below.

Daniel Henrique Barboza  writes:

> Hey David,
>
> On 9/21/18 9:29 AM, Dr. David Alan Gilbert wrote:
>> * Daniel Henrique Barboza (danielhb...@gmail.com) wrote:
>>> changes in v2:
>>> - removed the "RFC" marker;
>>> - added a new patch (patch 2) that removes
>>> bdrv_snapshot_delete_by_id_or_name from the code;
>>> - made changes in patch 1 as suggested by Murilo;
>>> - previous patch set link:
>>> https://lists.gnu.org/archive/html/qemu-devel/2018-08/msg04658.html
>>>
>>>
>>> It is not uncommon to see bugs being opened by testers that attempt to
>>> create VM snapshots using HMP. It turns out that "0" and "1" are quite
>>> common snapshot names and they trigger a lot of bugs. I gave an example
>>> in the commit message of patch 1, but to sum up here: QEMU treats the
>>> input of savevm/loadvm/delvm sometimes as 'ID', sometimes as 'name'. It
>>> is documented as such, but this can lead to strange situations.
>>>
>>> Given that it is strange for an API to consider a parameter to be 2 fields
>>> at the same time, and inadvently treating them as one or the other, and
>>> that removing the ID field is too drastic, my idea here is to keep the
>>> ID field for internal control, but do not let the user set it.
>>>
>>> I guess there's room for discussion about considering this change an API
>>> change or not. It doesn't affect users of HMP and it doesn't affect Libvirt,
>>> but simplifying the meaning of the parameters of savevm/loadvm/delvm.
>> Can you clarify a couple of things:
>>a) What is it about libvirt's use that means it's OK ? Does it never
>> use numeric tags?
>
> I am glad you asked because my understanding in how Libvirt was dealing
> with snapshots was wrong, and I just looked into it further to answer your
> question. Luckily, this series fixes the situation for Libvirt as well.
>
> I was misled by the fact that Libvirt does not show the same symptoms
> we see in
> QEMU of this problem, but the bug is there. Here's a quick test with
> Libvirt with
> "0" and "1" as snapshot names, considering a VM named with no snapshots,
> using QEMU 2.12 and Libvirt 4.0.0:
>
> - create the "0" snapshot:
>
> $ sudo virsh snapshot-create-as --name 0 dhb
> Domain snapshot 0 created
>
> $ sudo virsh snapshot-list dhb
> Name Creation Time State
> 
> 0 2018-09-24 15:47:56 -0400 running
>
> $ sudo virsh qemu-monitor-command dhb --hmp info snapshots
> List of snapshots present on all disks:
> ID TAG VM SIZE DATE VM CLOCK
> --    0  405M 2018-09-24 15:47:56 00:04:20.867
>
>
> - created the "1" snapshot. Here, Libvirt shows both snapshots with
> snapshot-list,
> but the snapshot was erased inside QEMU:
>
> $ sudo virsh snapshot-create-as --name 1 dhb
> Domain snapshot 1 created
> $
> $ sudo virsh snapshot-list dhb
> Name Creation Time State
> 
> 0 2018-09-24 15:47:56 -0400 running
> 1 2018-09-24 15:50:09 -0400 running
>
> $ sudo virsh qemu-monitor-command dhb --hmp info snapshots
> List of snapshots present on all disks:
> ID TAG VM SIZE DATE VM CLOCK
> --    1  404M 2018-09-24 15:50:10 00:05:36.226
>
>
> This is where I stopped checking out Libvirt at first, concluding
> wrongly that it
> was immune to the bug.
>
> Libvirt does not throw an error when trying to apply snapshot 0. In
> fact, it acts
> like everything went fine:
>
> $ sudo virsh snapshot-revert dhb 0
>
> $ echo $?
> 0

Is that a libvirt bug?

> Reverting back to snapshot "1" works as intended, restoring the VM
> state when it
> was created.
>
>
> (perhaps this is something we should let Libvirt be aware of ...)
>
>
>
> This series fixes this behavior because the snapshot 0 isn't erased
> from QEMU, allowing
> Libvirt to successfully restore it.
>
>
>>b) After this series are you always guaranteed to be able to fix
>> any existing oddly named snapshots?
>
> The oddly named snapshots that already exists can be affected by the
> semantic
> change in loadvm and delvm, in a way that the user can't load/del
> using the snap
> ID, just the tag. Aside from that, I don't see any side effects with
> existing
> snapshots and this patch series.

Do all snapshots have a tag that is unique within their image?  Even
snapshots created by old versions of QEMU?

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list