Quick answers:

  *   ... osd_deep_scrub_randomize_ratio ... but not on Octopus: is it still a 
valid parameter?

Yes, the parameter still exists on Octopus; it merely disappeared from the 
documentation. It can be used to prevent premature deep-scrubs, and the effect 
is dramatic.
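
If you want to double-check on a live cluster, something like this should work 
on Octopus (a quick sketch; osd.0 is just a placeholder):

  # If this prints a description, the option exists in your release:
  ceph config help osd_deep_scrub_randomize_ratio

  # Show the value currently in effect on one OSD:
  ceph config get osd.0 osd_deep_scrub_randomize_ratio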


  *   ... essentially by playing with osd_scrub_min_interval,...

The main parameter is actually osd_deep_scrub_randomize_ratio; all other 
parameters have less effect in terms of scrub load. osd_scrub_min_interval is 
the second most important parameter and needs increasing for large 
SATA-/NL-SAS-HDDs. For sufficiently fast drives the default of 24h is good 
(although it might be a bit aggressive/paranoid).
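
As a sketch of what the change looks like (the 7-day value is purely 
illustrative for large, slow HDDs, not a recommendation; tune it to your 
drives):

  # Stop the randomized early deep-scrubs:
  ceph config set osd osd_deep_scrub_randomize_ratio 0.0

  # Spread scrubs out further for large SATA-/NL-SAS-HDDs
  # (604800 s = 7 days, an example value only):
  ceph config set osd osd_scrub_min_interval 604800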


  *   Another small question: you opt for osd_max_scrubs=1 just to make sure
your I/O is not adversely affected by scrubbing, or is there a more
profound reason for that?

Well, not affecting user IO too much is quite a profound reason in itself; many 
admins try to avoid scrubbing entirely while users are on the system, because 
it makes IO somewhat unpredictable and can trigger user complaints.
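
For completeness, the stock knobs for keeping scrubs out of business hours look 
like this (illustrative values; scrubs then only start between 22:00 and 06:00 
local time):

  ceph config set osd osd_scrub_begin_hour 22
  ceph config set osd osd_scrub_end_hour 6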

However, there is another profound reason: on HDDs, a value greater than 1 
increases the deep-scrub load (that is, the interference with user IO) a lot 
while actually slowing the deep-scrubbing down. HDDs can't handle the random IO 
implied by concurrent deep-scrubs well. On my system I saw that with 
osd_max_scrubs=2 the scrub time for a PG more than doubled. In other words: 
more scrub load for less scrub progress; useless, do not do this.
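
That is, simply keep the default (shown here only to make the setting 
explicit):

  # One (deep-)scrub at a time per OSD; on HDDs higher values just add
  # random IO and make each PG take longer to deep-scrub:
  ceph config set osd osd_max_scrubs 1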

I plan to document the script a bit more and am waiting for some deep-scrub 
histograms to converge to equilibrium. This takes months for our large pools, 
but I would like to have the numbers for an example of how it should look.
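
If you want to watch the same histogram converge on your own cluster, a 
minimal sketch (assuming the Octopus JSON layout of "ceph pg dump pgs"; on 
other releases the pg_stats array may be nested differently):

  # Count PGs per last-deep-scrub date, i.e. a rough stamp histogram:
  ceph pg dump pgs --format json 2>/dev/null \
    | jq -r '.pg_stats[].last_deep_scrub_stamp' \
    | cut -c1-10 | sort | uniq -c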

Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________
From: Fulvio Galeazzi
Sent: Monday, January 8, 2024 4:21 PM
To: Frank Schilder; ceph-users@ceph.io
Subject: Re: [ceph-users] Re: How to configure something like 
osd_deep_scrub_min_interval?

Hallo Frank,
        just found this post, thank you! I have also been puzzled/struggling
with scrub/deep-scrub and found your post very useful: will give this a
try, soon.

One thing, first: I am using Octopus, too, but I cannot find any
documentation about osd_deep_scrub_randomize_ratio. I do see it in
past releases, but not on Octopus: is it still a valid parameter?

Let me check whether I understood your procedure: you optimize scrub
time distribution essentially by playing with osd_scrub_min_interval,
thus "forcing" the automated algorithm to preferentially select
older-scrubbed PGs, am I correct?

Another small question: you opt for osd_max_scrubs=1 just to make sure
your I/O is not adversely affected by scrubbing, or is there a more
profound reason for that?

   Thanks!

                Fulvio

On 12/13/23 13:36, Frank Schilder wrote:
> Hi all,
>
> since there seems to be some interest, here some additional notes.
>
> 1) The script is tested on Octopus. It seems that there was a change in the 
> output of the ceph commands it uses, so it might need some tweaking to work 
> on other versions.
>
> 2) If you want to give my findings a shot, you can do so in a gradual way. 
> The most important change is setting osd_deep_scrub_randize_ratio=0 (with 
> osd_max_scrubs=1); this makes osd_deep_scrub_interval work exactly like the 
> requested osd_deep_scrub_min_interval setting: PGs with a deep-scrub stamp 
> younger than osd_deep_scrub_interval will *not* be deep-scrubbed. This is the 
> one change to test; all other settings have less impact. The script will then 
> not report some numbers at the end, but the histogram will be correct. Let it 
> run for a few deep-scrub-interval rounds until the histogram has evened out.
>
> If you start your test after using osd_max_scrubs>1 for a while (as I did), 
> you will need a lot of patience and might need to mute some scrub warnings 
> for a while.
>
> 3) The changes are mostly relevant for large HDDs that take a long time to 
> deep-scrub (many small objects). The overall load reduction, however, is 
> useful in general.
>
> Best regards,
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
> _______________________________________________
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io

--
Fulvio Galeazzi
GARR-Net Department
tel.: +39-334-6533-250
skype: fgaleazzi70
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
