Excellent info, Anthony. Many thanks,
Steven

On Fri, 1 Aug 2025 at 09:29, Anthony D'Atri <a...@dreamsnake.net> wrote:
> > The servers are dedicated to Ceph.
> > Yes, it is perhaps too much, but my IT philosophy is "there is always room
> > for more RAM", as it usually helps running things faster.
>
> Unless you're a certain Sun model, but I digress...
>
> The $ spent on all that RAM would IMHO have been more effectively spent
> choosing NVMe SSDs instead of HDDs. And you wouldn't have had to pay for
> the HBA.
>
> > Now, since I have it, I would like to use it as efficiently as possible.
>
> That's what the autotuner is all about.
>
> > The 3 NVMes are 15TB, dedicated to OSDs - there are 2 more 1.6TB devices
> > dedicated to DB/WAL. HDDs are 20TB and SSDs are 7TB.
> > Is my understanding correct that autotune will dedicate 70% to OSDs
> > indiscriminately? ... or is there some sort of algorithm for
> > differentiating between disk type and size?
>
> There is not as far as I know, but honestly you're dramatically into the
> territory where you have so much that there wouldn't be much to be gained
> by customizing. Diminishing returns.
>
> > If NVMe is SSD
>
> NVMe is an interface, not a medium. A SATA SSD and an NVMe SSD are the
> same NAND (at similar cost) with a different interface.
>
> > from the autotune perspective, it would probably make sense to tune it
> > manually, no?
>
> If you like. See the page below for setting it manually on a host-by-host
> or per-OSD basis. Or disable it and do the math to divide it by device
> class as you like, though note that if you add or remove OSDs from a given
> host the system won't adjust without intervention. Or you could add less
> expensive systems with less RAM and move some RAM from these hosts to them
> to even things out.
>
> Since you don't have much else contending for that RAM, you might
>
> ceph config set mgr mgr/cephadm/autotune_memory_target_ratio 0.93
>
> which will let the autotuner use even more for OSDs.
>
> > How would I check the status of autotune ... other than checking
> > individual OSD config?
>
> # ceph config dump | grep osd_memory_target
> osd   host:cephab92   basic     osd_memory_target            12083522051
> osd   host:cephac0f   basic     osd_memory_target            12083519715
> osd   host:dd13-25    basic     osd_memory_target            6156072793
> osd   host:dd13-29    basic     osd_memory_target            6235526712
> osd   host:dd13-33    basic     osd_memory_target            6780274813
> osd   host:dd13-37    basic     osd_memory_target            6601357610
> osd   host:i18-24     basic     osd_memory_target            6670077087
> osd   host:i18-28     basic     osd_memory_target            6600879861
> osd   host:k18-23     basic     osd_memory_target            6663117330
> osd   host:l18-24     basic     osd_memory_target            6822190406
> osd   host:l18-28     basic     osd_memory_target            6782421978
> osd   host:m18-33     basic     osd_memory_target            6593523272
> osd                   advanced  osd_memory_target_autotune   true
>
> Here the first two hosts have much more RAM than the others, so the
> autotuner has more to distribute.
>
> You can game the autotuner in various ways, see
> https://www.ibm.com/docs/en/storage-ceph/8.0.0?topic=osds-automatically-tuning-osd-memory
>
> https://docs.ceph.com/en/latest/rados/configuration/ceph-conf/#sections-and-masks
>
> > Many thanks
> > Steven
> >
> > On Thu, 31 Jul 2025 at 10:43, Anthony D'Atri <a...@dreamsnake.net> wrote:
> >
> >> IMHO the autotuner is awesome.
> >>
> >> 1TB of RAM is an embarrassment of riches -- are these hosts perhaps
> >> converged compute+storage?
> >>
> >> > On Jul 31, 2025, at 10:17 AM, Steven Vacaroaia <ste...@gmail.com> wrote:
> >> >
> >> > Hi
> >> >
> >> > What is the best practice / your expert advice about using
> >> > osd_memory_target_autotune on hosts with lots of RAM?
> >> >
> >> > My hosts have 1 TB RAM, only 3 NVMes, 12 HDDs and 12 SSDs.
> >>
> >> Remember that NVMe devices *are* SSDs ;) I'm guessing those are used for
> >> WAL+DB offload, and thus you have 24x OSDs per host?
> >>
> >> > Should I disable autotune and allocate more RAM?
> >>
> >> The autotuner by default will divide 70% of physmem across all the OSDs
> >> it finds on a given host, with 30% allocated for the OS and other
> >> daemons. I *think* any RGWs, mons, etc. are assumed to be part of that
> >> 30% but am not positive.
> >>
> >> > I saw some suggestion for 16GB to NVMe, 8GB to SSD and 6GB to HDD.
> >>
> >> I personally have a growing sense that more RAM actually can help slower
> >> OSDs more, at least with respect to rebalancing without rampant slow
> >> ops. ymmv.
> >>
> >> This implies that your NVMe devices are standalone OSDs, so that would
> >> mean 27 OSDs per node? I'm curious what manner of chassis this is.
> >>
> >> I would then expect the autotuner to set osd_memory_target to roughly
> >> 26GB, which is ample by any measure. ~307GB will be available for
> >> non-OSD processes.
> >>
> >> If you're running compute or other significant non-Ceph workloads on the
> >> same nodes, you can adjust the reservation factor by setting
> >> ceph config set mgr mgr/cephadm/autotune_memory_target_ratio xxx. So if
> >> you want to reserve more for non-OSD processes, something like
> >>
> >> ceph config set mgr mgr/cephadm/autotune_memory_target_ratio 0.1
> >>
> >> If you do have hungry compute colocated, a good value might be something
> >> like 0.25, which would give each OSD more than 9GB for
> >> osd_memory_target. If you do want to allot different amounts to
> >> different device classes, you can instead set static values, using
> >> central config device class masks.
> >>
> >> > Many thanks
> >> > Steven

_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
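For anyone who wants to try the manual, per-device-class approach discussed above, here is a minimal sketch using central-config device class masks (the sections-and-masks page linked in the thread). The class names and byte values are assumptions, not recommendations: check what classes your OSDs actually report with `ceph osd tree` (NVMe OSDs often show up as "ssd" rather than "nvme") and size the targets to suit your hardware.

# Turn off the autotuner so the static targets below take effect
ceph config set osd osd_memory_target_autotune false

# Per-device-class targets via config masks (values in bytes; example sizes only)
ceph config set osd/class:hdd  osd_memory_target 6442450944     # 6 GiB
ceph config set osd/class:ssd  osd_memory_target 8589934592     # 8 GiB
ceph config set osd/class:nvme osd_memory_target 17179869184    # 16 GiB

# Spot-check what an individual OSD resolves to
ceph config get osd.0 osd_memory_target

As noted earlier in the thread, static values like these won't adjust themselves if you add or remove OSDs on a host, so revisit them whenever the OSD count per node changes.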