On 07/11/2025 18:19, Friedrich Weber wrote: > Currently, cloning an EFI disk off a snapshot-as-volume-chain snapshot > fails with "qemu-img: Failed to load snapshot: Can't find snapshot". > The reason is that the special case for EFI disks calls `qemu-img dd` > with the additional `-l snapname` option, which is only valid for > qcow2-internal snapshots. For snapshot-as-volume-chain snapshots, the > source volume is already the volume corresponding to the snapshot. > > Fix this by checking whether the snapshot in question is an external > snapshot, and if it is, omitting the `-l` option. > > Reported in enterprise support.
Turns out this only fixes the issue in case of snapshot-as-volume-chain on file-level storages, not on LVM. The reason is that LVMPlugin::volume_snapshot_info does not set the `ext` key to 1 [1], so the code doesn't recognize the external snapshot. LVMPlugin::volume_snapshot_info should probably set `ext` (assuming that "ext" means "external")? > > Signed-off-by: Friedrich Weber <[email protected]> > --- > > Notes: > I noticed that two oddities about volume_snapshot_info: > > - it doesn't seem to work for qcow2-internal snapshots: It assumes > that if `qemu img info` returns a JSON object, the snapshots are > internal, and if it returns a JSON array, they are "external", but > this doesn't seem correct -- I think because we pass > --backing-chain, qemu-img will return an one-element JSON array even > for internal snapshots. Hence, volume_snapshot_info incorrectly > returns a hash ref with only a single member "current" if the > snapshots are internal. But since I only found callsites guarded > with snapshot-as-volume-chain, I guess this didn't hurt until now? > > It doesn't hurt for my patch either because it only checks for > existence of $info->{$snapshot}->{ext} (which also works for > qcow2-internal snapshots, because alreay $info->{$snapshot} will not > exist then), but that's admittedly not nice and it's probably a good > idea to fix volume_snapshot_info in addition. > > - but in which way should we fix it? Right now, for volume-chain > snapshots, it seems like volume_snapshot_info will always have a > member "current" pointing to the "actual" volume, even if the volume > doesn't have any snapshots. Is this intended and do we want this > also in case of qcow2-internal snapshots? From reading > ZFSPoolPlugin::volume_snapshot_info, it looks like that one will > return an empty hash (without "current") if the volume has no > snapshots. Looks like - ZFSPoolPlugin::volume_snapshot_info indeed doesn't return "current" - but there are several callsites of {LVM,}Plugin::volume_snapshot_info that rely on 'current' being present for snapshot-as-volume chain [2] So the implementations differ w.r.t. returning a "current" entry or not, which is IMO a bit awkward API-wise. But fixing this in either way (removing it for {LVM,}Plugin or adding it for ZFSPoolPlugin) would probably require bigger changes in either the snapshot-as-volume-chain or the ZFS replication code? For fixing this particular bug, I guess we could instead call `volume_qemu_snapshot_method` and only add the `-l snapname` parameter if that returns 'qemu'. But I think this is only safe because we don't allow enabling/disabling snapshot-as-volume-chain for file-level storages on-the-fly. What do you think? [1] https://git.proxmox.com/?p=pve-storage.git;a=blob;f=src/PVE/Storage/LVMPlugin.pm;h=3badfef2c;hb=a85ebe3#l849 [2] https://git.proxmox.com/?p=qemu-server.git;a=blob;f=src/PVE/QemuServer/Blockdev.pm;h=8fa5eb51e;hb=b41c1e1440ff822#l420 _______________________________________________ pve-devel mailing list [email protected] https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
