Re: [libvirt] [Qemu-devel] [PATCH v2 0/3] HMP/snapshot changes - do not use ID anymore
On Thu, Oct 11, 2018 at 13:06:14 +0100, Dr. David Alan Gilbert wrote: > * Peter Krempa (pkre...@redhat.com) wrote: > > On Tue, Oct 09, 2018 at 19:34:43 +0200, Markus Armbruster wrote: > > > Cc: libvir-list for review of the compatibility argument below. > > > > > > Daniel Henrique Barboza writes: > > > > > > > Hey David, > > > > > > > > On 9/21/18 9:29 AM, Dr. David Alan Gilbert wrote: > > > >> * Daniel Henrique Barboza (danielhb...@gmail.com) wrote: > > > >>> changes in v2: > > > >>> - removed the "RFC" marker; > > > >>> - added a new patch (patch 2) that removes > > > >>> bdrv_snapshot_delete_by_id_or_name from the code; > > > >>> - made changes in patch 1 as suggested by Murilo; > > > >>> - previous patch set link: > > > >>> https://lists.gnu.org/archive/html/qemu-devel/2018-08/msg04658.html > > > >>> > > > >>> > > > >>> It is not uncommon to see bugs being opened by testers that attempt to > > > >>> create VM snapshots using HMP. It turns out that "0" and "1" are quite > > > >>> common snapshot names and they trigger a lot of bugs. I gave an > > > >>> example > > > >>> in the commit message of patch 1, but to sum up here: QEMU treats the > > > >>> input of savevm/loadvm/delvm sometimes as 'ID', sometimes as 'name'. > > > >>> It > > > >>> is documented as such, but this can lead to strange situations. > > > >>> > > > >>> Given that it is strange for an API to consider a parameter to be 2 > > > >>> fields > > > >>> at the same time, and inadvently treating them as one or the other, > > > >>> and > > > >>> that removing the ID field is too drastic, my idea here is to keep the > > > >>> ID field for internal control, but do not let the user set it. > > > >>> > > > >>> I guess there's room for discussion about considering this change an > > > >>> API > > > >>> change or not. It doesn't affect users of HMP and it doesn't affect > > > >>> Libvirt, > > > >>> but simplifying the meaning of the parameters of savevm/loadvm/delvm. > > > >> Can you clarify a couple of things: > > > >>a) What is it about libvirt's use that means it's OK ? Does it never > > > >> use numeric tags? > > > > > > > > I am glad you asked because my understanding in how Libvirt was dealing > > > > with snapshots was wrong, and I just looked into it further to answer > > > > your > > > > question. Luckily, this series fixes the situation for Libvirt as well. > > > > > > > > I was misled by the fact that Libvirt does not show the same symptoms > > > > we see in > > > > QEMU of this problem, but the bug is there. Here's a quick test with > > > > Libvirt with > > > > "0" and "1" as snapshot names, considering a VM named with no snapshots, > > > > using QEMU 2.12 and Libvirt 4.0.0: > > > > > > > > - create the "0" snapshot: > > > > > > > > $ sudo virsh snapshot-create-as --name 0 dhb > > > > Domain snapshot 0 created > > > > > > > > $ sudo virsh snapshot-list dhb > > > > Name Creation Time State > > > > > > > > 0 2018-09-24 15:47:56 -0400 running > > > > > > > > $ sudo virsh qemu-monitor-command dhb --hmp info snapshots > > > > List of snapshots present on all disks: > > > > ID TAG VM SIZE DATE VM CLOCK > > > > -- 0 405M 2018-09-24 15:47:56 00:04:20.867 > > > > > > > > > > > > - created the "1" snapshot. Here, Libvirt shows both snapshots with > > > > snapshot-list, > > > > but the snapshot was erased inside QEMU: > > > > > > > > $ sudo virsh snapshot-create-as --name 1 dhb > > > > Domain snapshot 1 created > > > > $ > > > > $ sudo virsh snapshot-list dhb > > > > Name Creation Time State > > > > > > > > 0 2018-09-24 15:47:56 -0400 running > > > > 1 2018-09-24 15:50:09 -0400 running > > > > > > > > $ sudo virsh qemu-monitor-command dhb --hmp info snapshots > > > > List of snapshots present on all disks: > > > > ID TAG VM SIZE DATE VM CLOCK > > > > -- 1 404M 2018-09-24 15:50:10 00:05:36.226 > > > > > > > > > > > > This is where I stopped checking out Libvirt at first, concluding > > > > wrongly that it > > > > was immune to the bug. > > > > > > > > Libvirt does not throw an error when trying to apply snapshot 0. In > > > > fact, it acts > > > > like everything went fine: > > > > > > > > $ sudo virsh snapshot-revert dhb 0 > > > > > > > > $ echo $? > > > > 0 > > > > > > Is that a libvirt bug? > > > > Probably yes. The error handling from HMP sucks and can't really be > > fixed in all cases. This is for the handler which calls "loadvm": > > > > if (strstr(reply, "No block device supports snapshots") != NULL) { > > virReportError(VIR_ERR_OPERATION_INVALID, "%s", > >_("this domain does not have a device to load > > snapshots")); > > goto cleanup; > > } else if (strstr(reply, "Could not find snapshot") != NULL) { > > virReportError(VIR_ERR_OPERATION_INVALID, > >_("the snapshot '%s' does not exist, and was not > > loaded"), > >
Re: [libvirt] [Qemu-devel] [PATCH v2 0/3] HMP/snapshot changes - do not use ID anymore
* Peter Krempa (pkre...@redhat.com) wrote: > On Tue, Oct 09, 2018 at 19:34:43 +0200, Markus Armbruster wrote: > > Cc: libvir-list for review of the compatibility argument below. > > > > Daniel Henrique Barboza writes: > > > > > Hey David, > > > > > > On 9/21/18 9:29 AM, Dr. David Alan Gilbert wrote: > > >> * Daniel Henrique Barboza (danielhb...@gmail.com) wrote: > > >>> changes in v2: > > >>> - removed the "RFC" marker; > > >>> - added a new patch (patch 2) that removes > > >>> bdrv_snapshot_delete_by_id_or_name from the code; > > >>> - made changes in patch 1 as suggested by Murilo; > > >>> - previous patch set link: > > >>> https://lists.gnu.org/archive/html/qemu-devel/2018-08/msg04658.html > > >>> > > >>> > > >>> It is not uncommon to see bugs being opened by testers that attempt to > > >>> create VM snapshots using HMP. It turns out that "0" and "1" are quite > > >>> common snapshot names and they trigger a lot of bugs. I gave an example > > >>> in the commit message of patch 1, but to sum up here: QEMU treats the > > >>> input of savevm/loadvm/delvm sometimes as 'ID', sometimes as 'name'. It > > >>> is documented as such, but this can lead to strange situations. > > >>> > > >>> Given that it is strange for an API to consider a parameter to be 2 > > >>> fields > > >>> at the same time, and inadvently treating them as one or the other, and > > >>> that removing the ID field is too drastic, my idea here is to keep the > > >>> ID field for internal control, but do not let the user set it. > > >>> > > >>> I guess there's room for discussion about considering this change an API > > >>> change or not. It doesn't affect users of HMP and it doesn't affect > > >>> Libvirt, > > >>> but simplifying the meaning of the parameters of savevm/loadvm/delvm. > > >> Can you clarify a couple of things: > > >>a) What is it about libvirt's use that means it's OK ? Does it never > > >> use numeric tags? > > > > > > I am glad you asked because my understanding in how Libvirt was dealing > > > with snapshots was wrong, and I just looked into it further to answer your > > > question. Luckily, this series fixes the situation for Libvirt as well. > > > > > > I was misled by the fact that Libvirt does not show the same symptoms > > > we see in > > > QEMU of this problem, but the bug is there. Here's a quick test with > > > Libvirt with > > > "0" and "1" as snapshot names, considering a VM named with no snapshots, > > > using QEMU 2.12 and Libvirt 4.0.0: > > > > > > - create the "0" snapshot: > > > > > > $ sudo virsh snapshot-create-as --name 0 dhb > > > Domain snapshot 0 created > > > > > > $ sudo virsh snapshot-list dhb > > > Name Creation Time State > > > > > > 0 2018-09-24 15:47:56 -0400 running > > > > > > $ sudo virsh qemu-monitor-command dhb --hmp info snapshots > > > List of snapshots present on all disks: > > > ID TAG VM SIZE DATE VM CLOCK > > > -- 0 405M 2018-09-24 15:47:56 00:04:20.867 > > > > > > > > > - created the "1" snapshot. Here, Libvirt shows both snapshots with > > > snapshot-list, > > > but the snapshot was erased inside QEMU: > > > > > > $ sudo virsh snapshot-create-as --name 1 dhb > > > Domain snapshot 1 created > > > $ > > > $ sudo virsh snapshot-list dhb > > > Name Creation Time State > > > > > > 0 2018-09-24 15:47:56 -0400 running > > > 1 2018-09-24 15:50:09 -0400 running > > > > > > $ sudo virsh qemu-monitor-command dhb --hmp info snapshots > > > List of snapshots present on all disks: > > > ID TAG VM SIZE DATE VM CLOCK > > > -- 1 404M 2018-09-24 15:50:10 00:05:36.226 > > > > > > > > > This is where I stopped checking out Libvirt at first, concluding > > > wrongly that it > > > was immune to the bug. > > > > > > Libvirt does not throw an error when trying to apply snapshot 0. In > > > fact, it acts > > > like everything went fine: > > > > > > $ sudo virsh snapshot-revert dhb 0 > > > > > > $ echo $? > > > 0 > > > > Is that a libvirt bug? > > Probably yes. The error handling from HMP sucks and can't really be > fixed in all cases. This is for the handler which calls "loadvm": > > if (strstr(reply, "No block device supports snapshots") != NULL) { > virReportError(VIR_ERR_OPERATION_INVALID, "%s", >_("this domain does not have a device to load > snapshots")); > goto cleanup; > } else if (strstr(reply, "Could not find snapshot") != NULL) { > virReportError(VIR_ERR_OPERATION_INVALID, >_("the snapshot '%s' does not exist, and was not > loaded"), >name); > goto cleanup; > } else if (strstr(reply, "Snapshots not supported on device") != NULL) { > virReportError(VIR_ERR_OPERATION_INVALID, "%s", reply); > goto cleanup; > } else if (strstr(reply, "Could not open VM state file") != NULL) { > virReportError(VIR_ERR_OPERATION_FAILED, "%s"
Re: [libvirt] [Qemu-devel] [PATCH v2 0/3] HMP/snapshot changes - do not use ID anymore
On Tue, Oct 09, 2018 at 19:34:43 +0200, Markus Armbruster wrote: > Cc: libvir-list for review of the compatibility argument below. > > Daniel Henrique Barboza writes: > > > Hey David, > > > > On 9/21/18 9:29 AM, Dr. David Alan Gilbert wrote: > >> * Daniel Henrique Barboza (danielhb...@gmail.com) wrote: > >>> changes in v2: > >>> - removed the "RFC" marker; > >>> - added a new patch (patch 2) that removes > >>> bdrv_snapshot_delete_by_id_or_name from the code; > >>> - made changes in patch 1 as suggested by Murilo; > >>> - previous patch set link: > >>> https://lists.gnu.org/archive/html/qemu-devel/2018-08/msg04658.html > >>> > >>> > >>> It is not uncommon to see bugs being opened by testers that attempt to > >>> create VM snapshots using HMP. It turns out that "0" and "1" are quite > >>> common snapshot names and they trigger a lot of bugs. I gave an example > >>> in the commit message of patch 1, but to sum up here: QEMU treats the > >>> input of savevm/loadvm/delvm sometimes as 'ID', sometimes as 'name'. It > >>> is documented as such, but this can lead to strange situations. > >>> > >>> Given that it is strange for an API to consider a parameter to be 2 fields > >>> at the same time, and inadvently treating them as one or the other, and > >>> that removing the ID field is too drastic, my idea here is to keep the > >>> ID field for internal control, but do not let the user set it. > >>> > >>> I guess there's room for discussion about considering this change an API > >>> change or not. It doesn't affect users of HMP and it doesn't affect > >>> Libvirt, > >>> but simplifying the meaning of the parameters of savevm/loadvm/delvm. > >> Can you clarify a couple of things: > >>a) What is it about libvirt's use that means it's OK ? Does it never > >> use numeric tags? > > > > I am glad you asked because my understanding in how Libvirt was dealing > > with snapshots was wrong, and I just looked into it further to answer your > > question. Luckily, this series fixes the situation for Libvirt as well. > > > > I was misled by the fact that Libvirt does not show the same symptoms > > we see in > > QEMU of this problem, but the bug is there. Here's a quick test with > > Libvirt with > > "0" and "1" as snapshot names, considering a VM named with no snapshots, > > using QEMU 2.12 and Libvirt 4.0.0: > > > > - create the "0" snapshot: > > > > $ sudo virsh snapshot-create-as --name 0 dhb > > Domain snapshot 0 created > > > > $ sudo virsh snapshot-list dhb > > Name Creation Time State > > > > 0 2018-09-24 15:47:56 -0400 running > > > > $ sudo virsh qemu-monitor-command dhb --hmp info snapshots > > List of snapshots present on all disks: > > ID TAG VM SIZE DATE VM CLOCK > > -- 0 405M 2018-09-24 15:47:56 00:04:20.867 > > > > > > - created the "1" snapshot. Here, Libvirt shows both snapshots with > > snapshot-list, > > but the snapshot was erased inside QEMU: > > > > $ sudo virsh snapshot-create-as --name 1 dhb > > Domain snapshot 1 created > > $ > > $ sudo virsh snapshot-list dhb > > Name Creation Time State > > > > 0 2018-09-24 15:47:56 -0400 running > > 1 2018-09-24 15:50:09 -0400 running > > > > $ sudo virsh qemu-monitor-command dhb --hmp info snapshots > > List of snapshots present on all disks: > > ID TAG VM SIZE DATE VM CLOCK > > -- 1 404M 2018-09-24 15:50:10 00:05:36.226 > > > > > > This is where I stopped checking out Libvirt at first, concluding > > wrongly that it > > was immune to the bug. > > > > Libvirt does not throw an error when trying to apply snapshot 0. In > > fact, it acts > > like everything went fine: > > > > $ sudo virsh snapshot-revert dhb 0 > > > > $ echo $? > > 0 > > Is that a libvirt bug? Probably yes. The error handling from HMP sucks and can't really be fixed in all cases. This is for the handler which calls "loadvm": if (strstr(reply, "No block device supports snapshots") != NULL) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", _("this domain does not have a device to load snapshots")); goto cleanup; } else if (strstr(reply, "Could not find snapshot") != NULL) { virReportError(VIR_ERR_OPERATION_INVALID, _("the snapshot '%s' does not exist, and was not loaded"), name); goto cleanup; } else if (strstr(reply, "Snapshots not supported on device") != NULL) { virReportError(VIR_ERR_OPERATION_INVALID, "%s", reply); goto cleanup; } else if (strstr(reply, "Could not open VM state file") != NULL) { virReportError(VIR_ERR_OPERATION_FAILED, "%s", reply); goto cleanup; } else if (strstr(reply, "Error") != NULL && strstr(reply, "while loading VM state") != NULL) { virReportError(VIR_ERR_OPERATION_FAILED, "%s", reply); goto cleanup; } else if (strstr(reply, "Error") != NULL
Re: [libvirt] [Qemu-devel] [PATCH v2 0/3] HMP/snapshot changes - do not use ID anymore
Cc: libvir-list for review of the compatibility argument below. Daniel Henrique Barboza writes: > Hey David, > > On 9/21/18 9:29 AM, Dr. David Alan Gilbert wrote: >> * Daniel Henrique Barboza (danielhb...@gmail.com) wrote: >>> changes in v2: >>> - removed the "RFC" marker; >>> - added a new patch (patch 2) that removes >>> bdrv_snapshot_delete_by_id_or_name from the code; >>> - made changes in patch 1 as suggested by Murilo; >>> - previous patch set link: >>> https://lists.gnu.org/archive/html/qemu-devel/2018-08/msg04658.html >>> >>> >>> It is not uncommon to see bugs being opened by testers that attempt to >>> create VM snapshots using HMP. It turns out that "0" and "1" are quite >>> common snapshot names and they trigger a lot of bugs. I gave an example >>> in the commit message of patch 1, but to sum up here: QEMU treats the >>> input of savevm/loadvm/delvm sometimes as 'ID', sometimes as 'name'. It >>> is documented as such, but this can lead to strange situations. >>> >>> Given that it is strange for an API to consider a parameter to be 2 fields >>> at the same time, and inadvently treating them as one or the other, and >>> that removing the ID field is too drastic, my idea here is to keep the >>> ID field for internal control, but do not let the user set it. >>> >>> I guess there's room for discussion about considering this change an API >>> change or not. It doesn't affect users of HMP and it doesn't affect Libvirt, >>> but simplifying the meaning of the parameters of savevm/loadvm/delvm. >> Can you clarify a couple of things: >>a) What is it about libvirt's use that means it's OK ? Does it never >> use numeric tags? > > I am glad you asked because my understanding in how Libvirt was dealing > with snapshots was wrong, and I just looked into it further to answer your > question. Luckily, this series fixes the situation for Libvirt as well. > > I was misled by the fact that Libvirt does not show the same symptoms > we see in > QEMU of this problem, but the bug is there. Here's a quick test with > Libvirt with > "0" and "1" as snapshot names, considering a VM named with no snapshots, > using QEMU 2.12 and Libvirt 4.0.0: > > - create the "0" snapshot: > > $ sudo virsh snapshot-create-as --name 0 dhb > Domain snapshot 0 created > > $ sudo virsh snapshot-list dhb > Name Creation Time State > > 0 2018-09-24 15:47:56 -0400 running > > $ sudo virsh qemu-monitor-command dhb --hmp info snapshots > List of snapshots present on all disks: > ID TAG VM SIZE DATE VM CLOCK > -- 0 405M 2018-09-24 15:47:56 00:04:20.867 > > > - created the "1" snapshot. Here, Libvirt shows both snapshots with > snapshot-list, > but the snapshot was erased inside QEMU: > > $ sudo virsh snapshot-create-as --name 1 dhb > Domain snapshot 1 created > $ > $ sudo virsh snapshot-list dhb > Name Creation Time State > > 0 2018-09-24 15:47:56 -0400 running > 1 2018-09-24 15:50:09 -0400 running > > $ sudo virsh qemu-monitor-command dhb --hmp info snapshots > List of snapshots present on all disks: > ID TAG VM SIZE DATE VM CLOCK > -- 1 404M 2018-09-24 15:50:10 00:05:36.226 > > > This is where I stopped checking out Libvirt at first, concluding > wrongly that it > was immune to the bug. > > Libvirt does not throw an error when trying to apply snapshot 0. In > fact, it acts > like everything went fine: > > $ sudo virsh snapshot-revert dhb 0 > > $ echo $? > 0 Is that a libvirt bug? > Reverting back to snapshot "1" works as intended, restoring the VM > state when it > was created. > > > (perhaps this is something we should let Libvirt be aware of ...) > > > > This series fixes this behavior because the snapshot 0 isn't erased > from QEMU, allowing > Libvirt to successfully restore it. > > >>b) After this series are you always guaranteed to be able to fix >> any existing oddly named snapshots? > > The oddly named snapshots that already exists can be affected by the > semantic > change in loadvm and delvm, in a way that the user can't load/del > using the snap > ID, just the tag. Aside from that, I don't see any side effects with > existing > snapshots and this patch series. Do all snapshots have a tag that is unique within their image? Even snapshots created by old versions of QEMU? -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list