Hi Chris,
While we look into this, I have a couple of questions:
1. Did the recovery rate stay at 1 object/sec throughout? In our tests we have seen that the rate is higher during the starting phase of recovery and eventually tapers off due to throttling by mclock.
2. Can you try speeding up the recovery by switching to the "high_recovery_ops" profile on all the OSDs to see if it improves things (both CPU load and recovery rate)?
3. On the OSDs that showed high CPU usage, can you run the following command and report back? This just dumps the mclock settings on the OSDs.

   sudo ceph daemon osd.N config show | grep osd_mclock
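For item 2, one way to apply the profile across the cluster is via the centralized config store (a sketch, assuming a release where osd_mclock_profile is available; "N" below is a placeholder for an actual OSD id):

```shell
# Switch all OSDs to the high_recovery_ops mclock profile
sudo ceph config set osd osd_mclock_profile high_recovery_ops

# Verify the active profile on one OSD (replace N with the OSD id)
sudo ceph daemon osd.N config get osd_mclock_profile
```

The change takes effect at runtime, so no OSD restart should be needed.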
I will update the tracker with these questions as well so that the discussion can continue there.
Thanks,
-Sridhar
On Tue, Jul 12, 2022 at 4:49 PM Chris Palmer <[email protected]> wrote:
> I've created tracker https://tracker.ceph.com/issues/56530 for this,
> including info on replicating it on another cluster.
>
>