Hi Alwin,

All VM disks are set as SSD with discard enabled.

“Run guest-trim after a disk move or VM migration” was not checked.  I didn’t 
realize that was in there!  I don’t think that applies though? Per 
https://pve.proxmox.com/pve-docs/chapter-qm.html#qm_qemu_agent :
“With this enabled, Proxmox VE will issue a trim command to the guest after the 
following operations that have the potential to write out zeros to the storage:

  *   moving a disk to another storage
  *   live migrating a VM to another node with local storage”

…and with Ceph the disk doesn’t move when we migrate a VM, does it?  Plus, trim 
is already run weekly.  I could stop a VM, move its disk to local storage, and 
move it back into Ceph, but that would take quite a lot of time for little to 
no gain, except maybe a lower object count.

That section also has this note on ext4, matching previous discussions: “There 
is a caveat with ext4 on Linux, because it uses an in-memory optimization to 
avoid issuing duplicate TRIM requests. Since the guest doesn’t know about the 
change in the underlying storage, only the first guest-trim will run as 
expected. Subsequent ones, until the next reboot, will only consider parts of 
the filesystem that changed since then.”



From: Alwin Antreich <[email protected]>
Sent: Thursday, March 12, 2026 12:53 PM
To: Steve Yates <[email protected]>
Cc: [email protected]
Subject: Re: [ceph-users] Re: Can rbd sparsify be run on a running VM's disk?

Hi Steve,

On Wed, 11 Mar 2026 at 15:26, Steve Yates <[email protected]> wrote:
> Hi Alwin,
>
> Thanks for the input.  Sounds like "stopped" is the safe way, which is what I'd 
> assumed.  It would be good if that was clarified in the Ceph docs 
> (https://docs.ceph.com/en/latest/man/8/rbd/#:~:text=sparsify).
>
> I found in the Proxmox forum thread that if I create 1+10 GB in two zero files 
> on a small VM, delete, and trim, "rbd du" USED remained at 37/43 GB for that 
> VM, so that doesn't help.  Also same issue on Windows VMs, and we have at least 
> one (a pfSense VM) using ZFS, so it's not just EXT4.

AFAIK, in Proxmox VE, QEMU converts zero-writes to discards because 
"detect-zeroes":"unmap" is set on the disk drives. This suggests that the disks 
are highly fragmented and the object count doesn't change significantly. A 
similar thing may apply to ZFS as well.

Sparsify discards an object when it consists entirely of zeros, at the object 
level. But an fstrim might not trim the entire range of an object because it 
operates at the block level. This potentially leaves empty objects behind, 
which sparsify will pick up.
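
To illustrate the paragraph above, here is a toy Python model of the difference. It is only a sketch of the description in this mail, not of how librbd or QEMU actually track allocation: the 8-block object size, the HOLE/ZERO/DATA block states, and all function names are assumptions made up for the example.

```python
# Toy model: an image is a list of objects, an object is a list of blocks.
# Block states (hypothetical, for illustration only):
HOLE = None   # block is not allocated
ZERO = 0      # block is allocated but contains only zeros
DATA = 1      # block is allocated and contains real data

BLOCKS_PER_OBJECT = 8  # assumed tiny object size to keep the example readable


def trim_blocks(obj, start, end):
    """Block-level discard: deallocate blocks [start, end) of one object."""
    return [HOLE if start <= i < end else b for i, b in enumerate(obj)]


def is_allocated(obj):
    """An object still exists while at least one of its blocks is allocated."""
    return any(b is not HOLE for b in obj)


def is_all_zero(obj):
    """True when no block of the object holds non-zero data."""
    return all(b != DATA for b in obj)


def sparsify(image):
    """Object-level pass: drop every object that reads back as all zeros."""
    return [[HOLE] * len(obj) if is_all_zero(obj) else obj for obj in image]


def allocated_objects(image):
    """Count the objects that still exist."""
    return sum(1 for obj in image if is_allocated(obj))


image = [
    [DATA] * BLOCKS_PER_OBJECT,      # object 0: live data
    [ZERO] * BLOCKS_PER_OBJECT,      # object 1: zeroed, file since deleted
    [ZERO] * 4 + [DATA] * 4,         # object 2: half zeros, half live data
]

# The guest trim only covers part of object 1, so the object survives.
image[1] = trim_blocks(image[1], 2, BLOCKS_PER_OBJECT)
after_trim = allocated_objects(image)

# The object-level pass sees that object 1 is all zeros and removes it.
after_sparsify = allocated_objects(sparsify(image))

print(after_trim, after_sparsify)
```

In this model the partial trim leaves object 1 in place with two zero-filled blocks, and only the object-level pass removes it. Object 2 is kept in both cases because it still holds data.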

You could try a storage migration, which might perform a similar function to 
sparsify and could be done online.


> Re: the serverfault thread, rebooting a VM results in no change in "rbd du" 
> USED.  We do regularly reboot VMs for updates, and migrate too when rebooting 
> PVE nodes.

Some thoughts about the regular reboot. You've likely thought about this, but 
eh. :)
 * Is the discard flag set on the disks?
 * Is guest-trim after migration/cloning enabled for qemu-guest-agent in the 
VM's Options tab?
 * Is the "SSD emulation" flag set on the disk tab of the VM? Especially for 
Windows, as it will switch from defragmentation to TRIM (amongst other things).


> I'm about at the point of ignoring the "rbd du" USED number.  It doesn't seem 
> terribly meaningful.

Yeah, it's an indicator but not really something to worry about unless you're 
heavily overprovisioning your pools.
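
As a rough way to check the overprovisioning point, the sketch below totals a JSON report of the kind "rbd du --format json" produces and compares it with the pool's usable capacity. The field names ("images", "provisioned_size", "used_size") are an assumption about that output and should be verified against a real report; the function name, the second image, and the 100 GiB capacity are hypothetical, and snapshots are not handled.

```python
import json

GiB = 1024 ** 3


def summarize_rbd_du(report_json, pool_capacity_bytes):
    """Total provisioned/used bytes and relate them to the pool capacity.

    Assumes one entry per image (no snapshot rows) and the field names
    "images", "provisioned_size" and "used_size" in the report.
    """
    report = json.loads(report_json)
    provisioned = sum(img["provisioned_size"] for img in report["images"])
    used = sum(img["used_size"] for img in report["images"])
    return {
        "provisioned": provisioned,
        "used": used,
        # > 1.0 means more space is promised to VMs than the pool can hold.
        "overprovision_ratio": provisioned / pool_capacity_bytes,
        # Share of the pool that the images occupy right now.
        "used_fraction": used / pool_capacity_bytes,
    }


# Illustrative input: the 37/43 GB image from this thread plus a made-up one.
sample = json.dumps({
    "images": [
        {"name": "vm-a", "provisioned_size": 43 * GiB, "used_size": 37 * GiB},
        {"name": "vm-b", "provisioned_size": 100 * GiB, "used_size": 20 * GiB},
    ]
})

summary = summarize_rbd_du(sample, pool_capacity_bytes=100 * GiB)
print(summary)
```

With these made-up numbers the pool is overprovisioned while the images occupy only part of it, which is the situation where the USED figure becomes worth watching.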

Cheers,
Alwin

--
Alwin Antreich
Head of Training and Proxmox Services

Want to meet: https://calendar.app.google/MuA2isCGnh8xBb657

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges, Andy Muthmann - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263
_______________________________________________
ceph-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]
