Something else to experiment with is running `fstrim` every weekend on RBD 
clients, for a more incremental freeing of deleted data.
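
For example, on a systemd-based client you could enable the weekly fstrim.timer 
that ships with util-linux, or add a cron entry for a specific RBD-backed mount. 
A minimal sketch, assuming a hypothetical mount point /mnt/rbd0 and that 
discard/unmap is supported through the whole stack (guest filesystem, virtio 
disk with discard enabled, RBD):

    # Option 1: weekly timer shipped with util-linux; trims all mounted
    # filesystems that advertise discard support
    systemctl enable --now fstrim.timer

    # Option 2: /etc/cron.d entry for one specific mount point
    # (runs every Sunday at 03:00; the path is a placeholder)
    0 3 * * 0  root  /usr/sbin/fstrim -v /mnt/rbd0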


> On Apr 28, 2025, at 2:51 AM, Dominique Ramaekers 
> <dominique.ramaek...@cometal.be> wrote:
> 
> Swap isn't enabled.
> 
> I can't reproduce it anymore. This behavior only showed up to such an extreme 
> degree with the images of one virtual machine. I don't know how the history of 
> these images differs from that of the other images in a way that could provoke 
> this behavior...
> 
> Also @Eugen and @Anthony... Thanks for the input. I'll be more attentive next 
> time I'm cleaning up my images... Maybe next time I can catch some more 
> relevant information.
> 
>> -----Original Message-----
>> From: Frédéric Nass <frederic.n...@univ-lorraine.fr>
>> Sent: Friday, April 25, 2025 17:54
>> To: Dominique Ramaekers <dominique.ramaek...@cometal.be>
>> Cc: ceph-users <ceph-users@ceph.io>
>> Subject: Re: [ceph-users] Memory usage
>> 
>> Hi Dominique,
>> 
>> Is swap enabled on OSD nodes?
>> 
>> In my experience, when swap is enabled on OSD nodes, OSDs can exceed
>> osd_memory_target by a completely unreasonable margin compared to when
>> swap is disabled. Using swap on OSD nodes has been discouraged in the
>> past [1], but I don't think it's documented clearly enough.
>> 
>> To understand why these OSDs use so much RAM, read Mark's message from
>> April 11th on this list. It will help you diagnose RAM usage.
>> 
>> Regards,
>> Frédéric.
>> 
>> [1] https://docs.ceph.com/en/latest/releases/nautilus/#notable-changes
>> 
>> ----- On Apr 25, 2025, at 16:36, Dominique Ramaekers
>> dominique.ramaek...@cometal.be wrote:
>> 
>>> Hi,
>>> 
>>> Housekeeping... I was cleaning up my snapshots and flattening clones...
>>> Suddenly I ran out of memory on my nodes!
>>> 
>>> It's a 4-node cluster, each node with 10 SSD OSDs, for a total storage size
>>> of 25 TiB. Each node has about 45 GiB of free (available) memory in normal
>>> operation.
>>> 
>>> After flattening several images and removing dozens of snapshots, the free
>>> memory was probably already lower than the usual 45 GiB. I was flattening an
>>> image of 75 GiB and 2 out of 4 nodes ran out of memory. One node even
>>> automatically killed processes (seemingly at random) to free up memory...
>>> After regaining control over the system, I put that node in maintenance and
>>> rebooted it.
>>> After the reboot the free memory was around 70 GiB. Overnight, all of the
>>> nodes were back at the usual 45 GiB.
>>> 
>>> Today I checked the free memory on each node: 45 GiB free. So I flattened
>>> another 75 GiB image. And yes, the free memory dropped from 45 GiB to
>>> 5 GiB really fast!
>>> 
>>> Is there a way to avoid this behavior in the cluster?
>>> 
>>> Greetings,
>>> 
>>> Dominique.
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
