On Thu, Aug 14, 2025 at 11:24:48AM +0200, Alexandre Derumier via pve-devel 
wrote:
> From: Alexandre Derumier <alexandre.derum...@groupe-cyllene.com>
> To: pve-devel@lists.proxmox.com
> Subject: [PATCH pve-storage] lvm: use blkdiscard instead cstream to
>  saferemove drive
> Date: Thu, 14 Aug 2025 11:24:48 +0200
> Message-ID: <20250814092448.919114-1-alexandre.derum...@groupe-cyllene.com>
> X-Mailer: git-send-email 2.47.2
> 
> The current cstream implementation is painfully slow, even without
> throttling.
> 
> Use blkdiscard --zeroout instead, which is orders of magnitude faster.

Makes sense, given that it presumably uses the dedicated `BLKZEROOUT` ioctl.
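
For reference, one way to confirm would be something like the following
(hypothetical device path, output abbreviated and from memory, exact fd and
decoding may differ by strace version):

    # strace -e ioctl blkdiscard --zeroout --step 32M /dev/test/vm-100-disk-0.qcow2
    ioctl(3, BLKZEROOUT, [0, 33554432])        = 0
    ioctl(3, BLKZEROOUT, [33554432, 33554432]) = 0
    ...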

> 
> Another benefit is that blkdiscard skips blocks that are already zeroed,
> so wiping empty temp images such as snapshots is very fast.
> 
> blkdiscard does not support throttling like cstream does, but we can tune
> the step size of the zeroes pushed to the storage.
> I'm using a 32MB step size by default, like oVirt, where it seems to be the
> best balance between speed and load:
> https://github.com/oVirt/vdsm/commit/79f1d79058aad863ca4b6672d4a5ce2be8e48986
> 
> It can be reduced with the "saferemove_stepsize" option.
> 
> Also add a "saferemove_discard" option to use discard instead of zeroing.
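
For readers following along, with the patch below the actual invocations end
up roughly as follows (volume name made up for illustration):

    # default: zero-out in 32M steps
    /usr/sbin/blkdiscard /dev/vg/del-vm-100-disk-0 -v --step 32M --zeroout
    # with saferemove_discard set
    /usr/sbin/blkdiscard /dev/vg/del-vm-100-disk-0 -v --step 32M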
> 
> test with a 100G volume (empty):
> 
> time /usr/bin/cstream -i /dev/zero -o /dev/test/vm-100-disk-0.qcow2 -T 10 -v 1 -b 1048576
> 
> 13561233408 B 12.6 GB 10.00 s 1356062979 B/s 1.26 GB/s
> 26021462016 B 24.2 GB 20.00 s 1301029969 B/s 1.21 GB/s
> 38585499648 B 35.9 GB 30.00 s 1286135343 B/s 1.20 GB/s
> 50998542336 B 47.5 GB 40.00 s 1274925312 B/s 1.19 GB/s
> 63702765568 B 59.3 GB 50.00 s 1274009877 B/s 1.19 GB/s
> 76721885184 B 71.5 GB 60.00 s 1278640698 B/s 1.19 GB/s
> 89126539264 B 83.0 GB 70.00 s 1273178488 B/s 1.19 GB/s
> 101666459648 B 94.7 GB 80.00 s 1270779024 B/s 1.18 GB/s
> 107390959616 B 100.0 GB 84.39 s 1272531142 B/s 1.19 GB/s
> write: No space left on device
> 
> real    1m24.394s
> user    0m0.171s
> sys     1m24.052s
> 
> time blkdiscard --zeroout /dev/test/vm-100-disk-0.qcow2 -v
> /dev/test/vm-100-disk-0.qcow2: Zero-filled 107390959616 bytes from the offset 0
> 
> real    0m3.641s
> user    0m0.001s
> sys     0m3.433s
> 
> test with a 100G volume with random data:
> 
> time blkdiscard --zeroout /dev/test/vm-100-disk-0.qcow2 -v
> 
> /dev/test/vm-112-disk-1: Zero-filled 4764729344 bytes from the offset 0
> /dev/test/vm-112-disk-1: Zero-filled 4664066048 bytes from the offset 4764729344
> /dev/test/vm-112-disk-1: Zero-filled 4831838208 bytes from the offset 9428795392
> /dev/test/vm-112-disk-1: Zero-filled 4831838208 bytes from the offset 14260633600
> /dev/test/vm-112-disk-1: Zero-filled 4831838208 bytes from the offset 19092471808
> /dev/test/vm-112-disk-1: Zero-filled 4865392640 bytes from the offset 23924310016
> /dev/test/vm-112-disk-1: Zero-filled 4596957184 bytes from the offset 28789702656
> /dev/test/vm-112-disk-1: Zero-filled 4731174912 bytes from the offset 33386659840
> /dev/test/vm-112-disk-1: Zero-filled 4294967296 bytes from the offset 38117834752
> /dev/test/vm-112-disk-1: Zero-filled 4664066048 bytes from the offset 42412802048
> /dev/test/vm-112-disk-1: Zero-filled 4697620480 bytes from the offset 47076868096
> /dev/test/vm-112-disk-1: Zero-filled 4664066048 bytes from the offset 51774488576
> /dev/test/vm-112-disk-1: Zero-filled 4261412864 bytes from the offset 56438554624
> /dev/test/vm-112-disk-1: Zero-filled 4362076160 bytes from the offset 60699967488
> /dev/test/vm-112-disk-1: Zero-filled 4127195136 bytes from the offset 65062043648
> /dev/test/vm-112-disk-1: Zero-filled 4328521728 bytes from the offset 69189238784
> /dev/test/vm-112-disk-1: Zero-filled 4731174912 bytes from the offset 73517760512
> /dev/test/vm-112-disk-1: Zero-filled 4026531840 bytes from the offset 78248935424
> /dev/test/vm-112-disk-1: Zero-filled 4194304000 bytes from the offset 82275467264
> /dev/test/vm-112-disk-1: Zero-filled 4664066048 bytes from the offset 86469771264
> /dev/test/vm-112-disk-1: Zero-filled 4395630592 bytes from the offset 91133837312
> /dev/test/vm-112-disk-1: Zero-filled 3623878656 bytes from the offset 95529467904
> /dev/test/vm-112-disk-1: Zero-filled 4462739456 bytes from the offset 99153346560
> /dev/test/vm-112-disk-1: Zero-filled 3758096384 bytes from the offset 103616086016
> 
> real    0m23.969s
> user    0m0.030s
> sys     0m0.144s
> 
> Signed-off-by: Alexandre Derumier <alexandre.derum...@groupe-cyllene.com>
> ---
>  src/PVE/Storage/LVMPlugin.pm | 43 ++++++++++++------------------------
>  1 file changed, 14 insertions(+), 29 deletions(-)
> 
> diff --git a/src/PVE/Storage/LVMPlugin.pm b/src/PVE/Storage/LVMPlugin.pm
> index 0416c9e..5ea9d8b 100644
> --- a/src/PVE/Storage/LVMPlugin.pm
> +++ b/src/PVE/Storage/LVMPlugin.pm
> @@ -287,35 +287,15 @@ my sub free_lvm_volumes {
>      # we need to zero out LVM data for security reasons
>      # and to allow thin provisioning
>      my $zero_out_worker = sub {
> -        # wipe throughput up to 10MB/s by default; may be overwritten with saferemove_throughput
> -        my $throughput = '-10485760';
> -        if ($scfg->{saferemove_throughput}) {
> -            $throughput = $scfg->{saferemove_throughput};
> -        }
>          for my $name (@$volnames) {
>              print "zero-out data on image $name (/dev/$vg/del-$name)\n";
> -
> +            my $stepsize = $scfg->{saferemove_stepsize} // 32;
>              my $cmd = [
> -                '/usr/bin/cstream',
> -                '-i',
> -                '/dev/zero',
> -                '-o',
> -                "/dev/$vg/del-$name",
> -                '-T',
> -                '10',
> -                '-v',
> -                '1',
> -                '-b',
> -                '1048576',
> -                '-t',
> -                "$throughput",
> +                '/usr/sbin/blkdiscard', "/dev/$vg/del-$name", '-v', '--step', "${stepsize}M",
>              ];
> -            eval {
> -                run_command(
> -                    $cmd,
> -                    errmsg => "zero out finished (note: 'No space left on device' is ok here)",
> -                );
> -            };
> +            push @$cmd, '--zeroout' if !$scfg->{saferemove_discard};
> +
> +            eval { run_command($cmd); };
>              warn $@ if $@;
>  
>              $class->cluster_lock_storage(
> @@ -376,9 +356,13 @@ sub properties {
>              description => "Zero-out data when removing LVs.",
>              type => 'boolean',
>          },
> -        saferemove_throughput => {
> -            description => "Wipe throughput (cstream -t parameter value).",
> -            type => 'string',

We do still need to keep the old option around to not break existing
configs. We should document its deprecation and add a warning if it is
set.
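
Something along these lines (untested sketch, reusing the names from the
patch):

    saferemove_throughput => {
        description => "Wipe throughput (cstream -t parameter value)."
            . " Deprecated, has no effect anymore.",
        type => 'string',
    },

and in the zero-out worker:

    warn "option 'saferemove_throughput' is deprecated and has no effect, "
        . "use 'saferemove_stepsize' instead\n"
        if defined($scfg->{saferemove_throughput});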

> +        saferemove_stepsize => {
> +            description => "Wipe step size (default 32MB).",
> +            enum => [qw(1 2 4 8 16 32)],
> +        },
> +        saferemove_discard => {
> +            description => "Wipe with discard instead zeroing.",
> +            type => 'boolean',

Not sure we need this (not sure when this is actually useful), but it's
cheap enough to have around. Should add a `default => 0` for
documentation purposes, though.
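
I.e. something like:

    saferemove_discard => {
        description => "Wipe with discard instead of zeroing.",
        type => 'boolean',
        default => 0,
    },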

>          },
>          tagged_only => {
>              description => "Only use logical volumes tagged with 
> 'pve-vm-ID'.",
> @@ -394,7 +378,8 @@ sub options {
>          shared => { optional => 1 },
>          disable => { optional => 1 },
>          saferemove => { optional => 1 },
> -        saferemove_throughput => { optional => 1 },
> +        saferemove_discard => { optional => 1 },
> +        saferemove_stepsize => { optional => 1 },
>          content => { optional => 1 },
>          base => { fixed => 1, optional => 1 },
>          tagged_only => { optional => 1 },
> -- 
> 2.47.2


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
