I feel like this has been discussed multiple times on this list, but I
just don't have any links at hand. I suspect the mclock settings; most
likely some things changed between Quincy and Reef. You could fall
back to wpq instead of mclock (that is still a general recommendation
at the moment) and then see if anything improves.
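For reference, a minimal sketch of that fallback, assuming a non-cephadm deployment with systemd-managed OSDs. The <id> placeholder is hypothetical and needs to be replaced per OSD. The scheduler is only picked up when an OSD starts, so each OSD has to be restarted after the change:

```shell
# Sketch only: run from a node with an admin keyring.
ceph config get osd osd_op_queue        # show the scheduler currently configured
ceph config set osd osd_op_queue wpq    # switch the configured scheduler to wpq

# The new value only applies after an OSD restart. On each OSD host,
# restart the OSDs one at a time and wait for the cluster to settle, e.g.:
#   systemctl restart ceph-osd@<id>
```

To go back later, set osd_op_queue to mclock_scheduler and restart the OSDs again.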
Quoting Reed Dier via ceph-users <[email protected]>:
Hello all,
TL;DR is that the number of concurrent PGs scrubbing, deep or
otherwise, has appeared to increase by about 5-10x, while the number
of PGs complaining that they haven't been scrubbed, deep or
otherwise, has continued to tick higher.
HEALTH_WARN 217 pgs not deep-scrubbed in time; 187 pgs not scrubbed in time
Hoping there is something I missed in the release notes or on the
mailing list that explains why my scrubs both exploded in
concurrency and fell behind after upgrading from quincy
(17.2.9) to reef (18.2.8).
Non-cephadm, Ubuntu 22.04, rather heterogeneous OSD hardware.
Mix of 8T and 2T HDD, as well as 2T SSD.
HDDs have NVMe WAL/DB of various sizes depending on when they were deployed.
Mix of replicated and EC pools, as well as some replicated pools
across different device classes.
The vast majority of the PGs that are behind on scrubbing are on EC
pools, and the vast majority of those are in our EC82 cephfs pool (40),
which holds the bulk of our stored data. The next largest share is in
an older EC73 cephfs pool (37).
My quick and dirty approximation, based on PGs last scrubbed last month:
ceph pg dump | grep 2026-05 | awk '{print $1" "$27}' | grep -v periodic | cut -d '.' -f1 | sort | uniq -c
dumped all
1 17
1 20
116 37
224 40
I didn't make any changes to scrub intervals or mclock profiles
before/during/after the upgrade.
ceph config dump | grep mclock_profile | awk '{print $4}' | uniq -c ;
313 balanced
ceph config dump | grep scrub_interval
global  class:ssd  advanced  osd_deep_scrub_interval  604800.000000
mon                advanced  osd_deep_scrub_interval  604800.000000
mon.*              advanced  osd_deep_scrub_interval  604800.000000
mgr.*              advanced  osd_deep_scrub_interval  604800.000000
osd     class:hdd  advanced  osd_deep_scrub_interval  604800.000000
osd     class:ssd  advanced  osd_deep_scrub_interval  604800.000000
osd                advanced  osd_deep_scrub_interval  604800.000000
osd.*              advanced  osd_deep_scrub_interval  604800.000000
I've tried ceph tell osd.$osd osd_max_scrubs $more, which seems to
momentarily drive up the count of active+clean+scrubbing[+deep] PGs,
but doesn't seem to make a demonstrable difference in getting ahead
on the number of PGs behind (the number continues to grow).
I also looked at load15 across OSD hosts, and they don't appear to
be anywhere near the 50% threshold of osd_scrub_load_threshold
either, so I think I can rule that one out for now.
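The load check above can be sketched as a small helper, assuming (per the documented description of osd_scrub_load_threshold) that the threshold is compared against the load average divided by the number of online CPUs. The function name scrub_load_check is hypothetical, and which load average window the OSD actually uses is an assumption here; the commented example uses the 1-minute figure.

```shell
# Hypothetical helper: compare load per CPU against a scrub load threshold.
# Arguments: <load average> <cpu count> <threshold>
# Prints the normalized load and whether it is below or above the threshold.
scrub_load_check() {
    awk -v load="$1" -v cpus="$2" -v thr="$3" 'BEGIN {
        ratio = load / cpus
        printf "%.3f %s\n", ratio, (ratio < thr ? "below" : "above")
    }'
}

# Example with live values on one OSD host (1-minute load, online CPUs):
#   scrub_load_check "$(cut -d" " -f1 /proc/loadavg)" "$(nproc)" 0.5
```

For instance, a load of 4 on a 16-core host normalizes to 0.25, which is well under a threshold of 0.5.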
I'm mostly curious why the behavior changed: concurrent scrubs
ballooned, and yet the number of PGs behind on scrubbing ballooned
as well, without any of my settings actually changing.
And I'm also curious what tunables I can turn to get things back
under control for scrubbing both short and long term as I look
towards getting to squid and 24.04.
Is there an internal mechanism that triggers a deeper scrub during
first deep scrub after upgrading a major release, reef or otherwise?
Included some graphs of scrub load over the last 60 and 365 day
period to show prior scrub load that only exceedingly rarely ever
generated a PG_NOT_[DEEP_]SCRUBBED warning,
as well as raw load average (smallest cpu count is 16, and it
doesn't even autoscale to 8, so nothing should be complaining there.)
https://imgur.com/a/rixNrCe
Appreciate any pointers anyone can steer me towards.
Reed