When the 'saferemove' storage configuration option is used, the 'current' top volume of the backing chain is renamed and removal is delegated to a worker task. Thus, the space for that volume is still allocated and cannot be re-used for the new top volume (to be backed by the snapshot volume). With a full enough pool, allocating the new top volume fails. In this case, executing the saferemove worker and removing the 'current' volume would leave the backing chain without any top volume. Recovering from that situation would mean manually allocating a new top volume backed by the snapshot afterwards.
Improve the situation by keeping the original top volume around and renaming it back if allocating the new volume failed.

Signed-off-by: Fiona Ebner <[email protected]>
---
 src/PVE/Storage/LVMPlugin.pm | 41 ++++++++++++++++++------------------
 1 file changed, 20 insertions(+), 21 deletions(-)

diff --git a/src/PVE/Storage/LVMPlugin.pm b/src/PVE/Storage/LVMPlugin.pm
index 32a8339..987e60b 100644
--- a/src/PVE/Storage/LVMPlugin.pm
+++ b/src/PVE/Storage/LVMPlugin.pm
@@ -1117,46 +1117,45 @@ sub volume_rollback_is_possible {
 }
 
 my sub volume_snapshot_rollback_locked {
-    my ($class, $scfg, $storeid, $volname, $snap, $cleanup_worker) = @_;
+    my ($class, $scfg, $storeid, $volname, $snap) = @_;
 
     my $format = ($class->parse_volname($volname))[6];
 
     die "can't rollback snapshot for '$format' volume\n" if $format ne 'qcow2';
 
-    $cleanup_worker->$* = eval { free_snap_image($class, $storeid, $scfg, $volname, 'current'); };
+    my $cleanup_worker = eval { free_snap_image($class, $storeid, $scfg, $volname, 'current'); };
     die "error deleting snapshot $snap $@\n" if $@;
 
     eval { alloc_snap_image($class, $storeid, $scfg, $volname, $snap) };
-    die "can't allocate new volume $volname: $@\n" if $@;
+    if (my $err = $@) {
+        if ($cleanup_worker) { # rename original image back
+            eval {
+                my $vg = $scfg->{vgname};
+                my $cmd = ['lvrename', $vg, "del-${volname}", $volname];
+                run_command($cmd, errmsg => "lvrename '${vg}/del-${volname}' error");
+            };
+            warn $@ if $@;
+        }
+        die "can't allocate new volume $volname: $err\n";
+    }
 
-    return undef;
+    return $cleanup_worker;
 }
 
 sub volume_snapshot_rollback {
     my ($class, $scfg, $storeid, $volname, $snap) = @_;
 
-    my $cleanup_worker;
-
-    eval {
-        $class->cluster_lock_storage(
-            $storeid,
-            $scfg->{shared},
-            undef,
-            sub {
-                volume_snapshot_rollback_locked(
-                    $class, $scfg, $storeid, $volname, $snap, \$cleanup_worker,
-                );
-            },
-        );
-    };
-    my $err = $@;
+    my $cleanup_worker = $class->cluster_lock_storage(
+        $storeid,
+        $scfg->{shared},
+        undef,
+        sub { volume_snapshot_rollback_locked($class, $scfg, $storeid, $volname, $snap); },
+    );
 
     # Spawn outside of the locked section, because with 'saferemove', the cleanup worker also needs
     # to obtain the lock, and in CLI context, it will be awaited synchronously, see fork_worker().
     fork_cleanup_worker($cleanup_worker);
 
-    die $err if $err;
-
     return;
 }
 
-- 
2.47.3
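
For illustration only, the failure mode and the rename-back recovery described in the commit message can be modelled without LVM. This is a minimal sketch under stated assumptions, not the plugin code: the hash %pool, the $capacity limit and the helpers used(), defer_removal(), allocate() and rollback() are all hypothetical stand-ins, and no lvrename or other LVM command is run. The volume names are made up as well.

```perl
use strict;
use warnings;

# Hypothetical in-memory stand-in for a volume group: name => size in extents.
my %pool = (
    'vm-100-disk-0.qcow2'         => 4, # 'current' top volume
    'snap_vm-100-disk-0_s1.qcow2' => 4, # snapshot volume
);
my $capacity = 10; # assumed pool size, chosen so the allocation below fails

sub used {
    my $sum = 0;
    $sum += $_ for values %pool;
    return $sum;
}

# Models the 'saferemove' path: the volume is only renamed, so its space
# stays allocated until a later cleanup step actually removes it.
sub defer_removal {
    my ($name) = @_;
    $pool{"del-$name"} = delete $pool{$name};
    return "del-$name"; # handle for the deferred cleanup
}

sub allocate {
    my ($name, $size) = @_;
    die "not enough free space for '$name'\n" if used() + $size > $capacity;
    $pool{$name} = $size;
    return;
}

# Rename the old top volume away, try to allocate the new one, and rename
# the old one back if the allocation fails.
sub rollback {
    my ($name, $size) = @_;

    my $pending = defer_removal($name);

    eval { allocate($name, $size) };
    if (my $err = $@) {
        $pool{$name} = delete $pool{$pending}; # rename original back
        die "can't allocate new volume $name: $err";
    }

    return $pending; # only now is it safe to run the deferred cleanup
}

my $ok = eval { rollback('vm-100-disk-0.qcow2', 4); 1 };
my $err = $@;

print "rollback failed: $err" if !$ok;
print "volumes: ", join(', ', sort keys %pool), "\n";
```

In this model the renamed volume still counts towards the used space, so the allocation fails, the error is propagated to the caller, and the original volume name is present again afterwards. The cleanup handle is only returned on success, which mirrors why the patch returns $cleanup_worker from the locked section instead of passing it out through a reference.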
