Something else to experiment with is running `fstrim` on RBD clients every weekend, so that deleted data is freed more incrementally rather than in one large burst during a cleanup.
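
If you want to try that, here's a rough sketch of the client side (assuming the clients are Linux hosts or VMs whose block devices pass discard/unmap through to the RBD image, e.g. a libvirt/QEMU disk configured with discard='unmap'; without that, `fstrim` inside a guest won't release anything back to the pool):

  # One-off, to see how much actually gets freed:
  sudo fstrim -av

  # Ongoing: most distributions ship util-linux's weekly fstrim timer,
  # so enabling it is usually enough instead of a custom cron job:
  sudo systemctl enable --now fstrim.timer
  systemctl list-timers fstrim.timer    # confirm the next scheduled run

The exact schedule doesn't matter much; the point is just to spread the discards out over time instead of freeing a large amount of deleted data all at once.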
> On Apr 28, 2025, at 2:51 AM, Dominique Ramaekers <dominique.ramaek...@cometal.be> wrote:
>
> Swap isn't enabled.
>
> I can't reproduce it anymore. This behavior only showed up to such an extreme degree on the images of one virtual machine. I don't know how the history of those images differs from the other images in a way that could provoke this behavior...
>
> Also @Eugen and @Anthony... Thanks for the input. I'll be more attentive next time I'm cleaning up my images... Maybe next time I can catch some more relevant information.
>
>> -----Original Message-----
>> From: Frédéric Nass <frederic.n...@univ-lorraine.fr>
>> Sent: Friday, April 25, 2025 17:54
>> To: Dominique Ramaekers <dominique.ramaek...@cometal.be>
>> CC: ceph-users <ceph-users@ceph.io>
>> Subject: Re: [ceph-users] Memory usage
>>
>> Hi Dominique,
>>
>> Is swap enabled on the OSD nodes?
>>
>> My experience is that when swap is enabled on OSD nodes, OSDs can exceed osd_memory_target to a completely unreasonable degree compared to when swap is disabled. Using swap on OSD nodes has been discouraged in the past [1], but I don't think it's documented clearly enough.
>>
>> To understand why these OSDs use so much RAM, read Mark's message from April 11th on this list. It will help you diagnose RAM usage.
>>
>> Regards,
>> Frédéric.
>>
>> [1] https://docs.ceph.com/en/latest/releases/nautilus/#notable-changes
>>
>> ----- On Apr 25, 2025, at 16:36, Dominique Ramaekers dominique.ramaek...@cometal.be wrote:
>>
>>> Hi,
>>>
>>> Housekeeping... I was cleaning up my snapshots and flattening clones... Suddenly I ran out of memory on my nodes!
>>>
>>> It's a 4-node cluster with 10 SSD OSDs per node and a total storage size of 25 TiB. Each node has about 45 GiB of free (available) memory in normal operation.
>>>
>>> After flattening several images and removing dozens of snapshots, the free memory was probably already lower than the usual 45 GiB. While I was flattening a 75 GiB image, 2 out of the 4 nodes ran out of memory. One node even started killing processes at random to free up memory... After regaining control over the system, I put that node in maintenance and rebooted it. After the reboot the free memory was around 70 GiB. Overnight, all of the nodes went back to the usual 45 GiB.
>>>
>>> Today I checked the free memory of each node: 45 GiB free. So I flattened a 75 GiB image again. And yes, the free memory dropped from 45 GiB to 5 GiB really fast!
>>>
>>> Is there a way to avoid this behavior of the cluster?
>>>
>>> Greetings,
>>>
>>> Dominique.
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io