On Thu, Aug 14, 2025 at 11:24:48AM +0200, Alexandre Derumier via pve-devel wrote:
> From: Alexandre Derumier <alexandre.derum...@groupe-cyllene.com>
> To: pve-devel@lists.proxmox.com
> Subject: [PATCH pve-storage] lvm: use blkdiscard instead cstream to
>  saferemove drive
> Date: Thu, 14 Aug 2025 11:24:48 +0200
> Message-ID: <20250814092448.919114-1-alexandre.derum...@groupe-cyllene.com>
> X-Mailer: git-send-email 2.47.2
>
> The current cstream implementation is painfully slow, even without
> throttling.
>
> Use blkdiscard --zeroout instead, which is orders of magnitude faster.
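As a rough sanity check of the claimed gap, the userspace round-trip counts alone are telling: cstream pumps zero pages through userspace one buffer at a time, while blkdiscard --zeroout hands whole ranges to the kernel via the BLKZEROOUT ioctl. A back-of-the-envelope sketch (not measured here; the 100 GiB volume size and 1 MiB cstream buffer are taken from the tests quoted below, and the 32 MiB step is the patch's proposed default):

```shell
# Back-of-the-envelope: userspace round-trips needed to wipe a 100 GiB volume.
vol_bytes=$((100 * 1024 * 1024 * 1024))

# cstream with -b 1048576 issues one read()+write() pair per 1 MiB buffer.
cstream_writes=$((vol_bytes / (1024 * 1024)))

# blkdiscard --step 32M issues one BLKZEROOUT ioctl per 32 MiB range.
blkdiscard_ioctls=$((vol_bytes / (32 * 1024 * 1024)))

echo "$cstream_writes"     # 102400 write() calls
echo "$blkdiscard_ioctls"  # 3200 ioctl() calls
```

On top of the fewer round-trips, the zero-filling itself moves out of userspace: the kernel (or the device, when it offloads write-zeroes) fills each range, which is plausibly where the bulk of the win comes from.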
Makes sense, given that it probably uses dedicated zero-out `ioctl()`s.

> Another benefit is that blkdiscard skips already-zeroed blocks, so it is
> very fast for empty temporary images such as snapshots.
>
> blkdiscard does not support throttling like cstream, but the step size of
> the zeroes pushed to the storage can be tuned. I'm using a 32 MB step size
> by default, like oVirt, where it seems to be the best balance between
> speed and load:
> https://github.com/oVirt/vdsm/commit/79f1d79058aad863ca4b6672d4a5ce2be8e48986
>
> It can be reduced with the "saferemove_stepsize" option.
>
> Also add an option "saferemove_discard" to use discard instead of zeroing.
>
> Test with an empty 100G volume:
>
> time /usr/bin/cstream -i /dev/zero -o /dev/test/vm-100-disk-0.qcow2 -T 10 -v 1 -b 1048576
>
> 13561233408 B 12.6 GB 10.00 s 1356062979 B/s 1.26 GB/s
> 26021462016 B 24.2 GB 20.00 s 1301029969 B/s 1.21 GB/s
> 38585499648 B 35.9 GB 30.00 s 1286135343 B/s 1.20 GB/s
> 50998542336 B 47.5 GB 40.00 s 1274925312 B/s 1.19 GB/s
> 63702765568 B 59.3 GB 50.00 s 1274009877 B/s 1.19 GB/s
> 76721885184 B 71.5 GB 60.00 s 1278640698 B/s 1.19 GB/s
> 89126539264 B 83.0 GB 70.00 s 1273178488 B/s 1.19 GB/s
> 101666459648 B 94.7 GB 80.00 s 1270779024 B/s 1.18 GB/s
> 107390959616 B 100.0 GB 84.39 s 1272531142 B/s 1.19 GB/s
> write: No space left on device
>
> real 1m24.394s
> user 0m0.171s
> sys 1m24.052s
>
> time blkdiscard --zeroout /dev/test/vm-100-disk-0.qcow2 -v
> /dev/test/vm-100-disk-0.qcow2: Zero-filled 107390959616 bytes from the offset 0
>
> real 0m3.641s
> user 0m0.001s
> sys 0m3.433s
>
> Test with a 100G volume filled with random data:
>
> time blkdiscard --zeroout /dev/test/vm-100-disk-0.qcow2 -v
>
> /dev/test/vm-112-disk-1: Zero-filled 4764729344 bytes from the offset 0
> /dev/test/vm-112-disk-1: Zero-filled 4664066048 bytes from the offset 4764729344
> /dev/test/vm-112-disk-1: Zero-filled 4831838208 bytes from the offset 9428795392
> /dev/test/vm-112-disk-1: Zero-filled 4831838208 bytes from the offset 14260633600
> /dev/test/vm-112-disk-1: Zero-filled 4831838208 bytes from the offset 19092471808
> /dev/test/vm-112-disk-1: Zero-filled 4865392640 bytes from the offset 23924310016
> /dev/test/vm-112-disk-1: Zero-filled 4596957184 bytes from the offset 28789702656
> /dev/test/vm-112-disk-1: Zero-filled 4731174912 bytes from the offset 33386659840
> /dev/test/vm-112-disk-1: Zero-filled 4294967296 bytes from the offset 38117834752
> /dev/test/vm-112-disk-1: Zero-filled 4664066048 bytes from the offset 42412802048
> /dev/test/vm-112-disk-1: Zero-filled 4697620480 bytes from the offset 47076868096
> /dev/test/vm-112-disk-1: Zero-filled 4664066048 bytes from the offset 51774488576
> /dev/test/vm-112-disk-1: Zero-filled 4261412864 bytes from the offset 56438554624
> /dev/test/vm-112-disk-1: Zero-filled 4362076160 bytes from the offset 60699967488
> /dev/test/vm-112-disk-1: Zero-filled 4127195136 bytes from the offset 65062043648
> /dev/test/vm-112-disk-1: Zero-filled 4328521728 bytes from the offset 69189238784
> /dev/test/vm-112-disk-1: Zero-filled 4731174912 bytes from the offset 73517760512
> /dev/test/vm-112-disk-1: Zero-filled 4026531840 bytes from the offset 78248935424
> /dev/test/vm-112-disk-1: Zero-filled 4194304000 bytes from the offset 82275467264
> /dev/test/vm-112-disk-1: Zero-filled 4664066048 bytes from the offset 86469771264
> /dev/test/vm-112-disk-1: Zero-filled 4395630592 bytes from the offset 91133837312
> /dev/test/vm-112-disk-1: Zero-filled 3623878656 bytes from the offset 95529467904
> /dev/test/vm-112-disk-1: Zero-filled 4462739456 bytes from the offset 99153346560
> /dev/test/vm-112-disk-1: Zero-filled 3758096384 bytes from the offset 103616086016
>
> real 0m23.969s
> user 0m0.030s
> sys 0m0.144s
>
> Signed-off-by: Alexandre Derumier <alexandre.derum...@groupe-cyllene.com>
> ---
>  src/PVE/Storage/LVMPlugin.pm | 43 ++++++++++++------------------
>  1 file changed, 14 insertions(+), 29 deletions(-)
>
> diff --git a/src/PVE/Storage/LVMPlugin.pm b/src/PVE/Storage/LVMPlugin.pm
> index 0416c9e..5ea9d8b 100644
> --- a/src/PVE/Storage/LVMPlugin.pm
> +++ b/src/PVE/Storage/LVMPlugin.pm
> @@ -287,35 +287,15 @@ my sub free_lvm_volumes {
>      # we need to zero out LVM data for security reasons
>      # and to allow thin provisioning
>      my $zero_out_worker = sub {
> -        # wipe throughput up to 10MB/s by default; may be overwritten with saferemove_throughput
> -        my $throughput = '-10485760';
> -        if ($scfg->{saferemove_throughput}) {
> -            $throughput = $scfg->{saferemove_throughput};
> -        }
>          for my $name (@$volnames) {
>              print "zero-out data on image $name (/dev/$vg/del-$name)\n";
> -
> +            my $stepsize = $scfg->{saferemove_stepsize} // 32;
>              my $cmd = [
> -                '/usr/bin/cstream',
> -                '-i',
> -                '/dev/zero',
> -                '-o',
> -                "/dev/$vg/del-$name",
> -                '-T',
> -                '10',
> -                '-v',
> -                '1',
> -                '-b',
> -                '1048576',
> -                '-t',
> -                "$throughput",
> +                '/usr/sbin/blkdiscard', "/dev/$vg/del-$name", '-v', '--step', "${stepsize}M",
>              ];
> -            eval {
> -                run_command(
> -                    $cmd,
> -                    errmsg => "zero out finished (note: 'No space left on device' is ok here)",
> -                );
> -            };
> +            push @$cmd, '--zeroout' if !$scfg->{saferemove_discard};
> +
> +            eval { run_command($cmd); };
>              warn $@ if $@;
>
>              $class->cluster_lock_storage(
> @@ -376,9 +356,13 @@ sub properties {
>          description => "Zero-out data when removing LVs.",
>          type => 'boolean',
>      },
> -    saferemove_throughput => {
> -        description => "Wipe throughput (cstream -t parameter value).",
> -        type => 'string',

We do still need to keep the old option around to not break existing
configs. We should document its deprecation and add a warning if it is set.
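For readers skimming the hunk, the command line the new zero_out_worker
assembles can be sketched in shell (the device path here is a made-up
example; the option names, default, and argument order follow the patch):

```shell
# Mirror of the patch's command construction ($dev is a hypothetical example).
dev="/dev/testvg/del-vm-100-disk-0"
stepsize=32   # $scfg->{saferemove_stepsize} // 32
discard=""    # empty = saferemove_discard not set

cmd="/usr/sbin/blkdiscard $dev -v --step ${stepsize}M"
# Without saferemove_discard, the patch appends --zeroout so the range is
# overwritten with zeroes instead of merely discarded.
if [ -z "$discard" ]; then
    cmd="$cmd --zeroout"
fi

echo "$cmd"   # /usr/sbin/blkdiscard /dev/testvg/del-vm-100-disk-0 -v --step 32M --zeroout
```

Note that plain discard (without --zeroout) only guarantees data destruction
on devices that discard reliably, which is presumably why zeroing stays the
default.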
> +    saferemove_stepsize => {
> +        description => "Wipe step size (default 32MB).",
> +        enum => [qw(1 2 4 8 16 32)],
> +    },
> +    saferemove_discard => {
> +        description => "Wipe with discard instead zeroing.",
> +        type => 'boolean',

Not sure we need this (not sure when this is actually useful), but it's
cheap enough to have around. Should add a `default => 0` for documentation
purposes, though.

>      },
>      tagged_only => {
>          description => "Only use logical volumes tagged with 'pve-vm-ID'.",
> @@ -394,7 +378,8 @@ sub options {
>          shared => { optional => 1 },
>          disable => { optional => 1 },
>          saferemove => { optional => 1 },
> -        saferemove_throughput => { optional => 1 },
> +        saferemove_discard => { optional => 1 },
> +        saferemove_stepsize => { optional => 1 },
>          content => { optional => 1 },
>          base => { fixed => 1, optional => 1 },
>          tagged_only => { optional => 1 },
> --
> 2.47.2

_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel