I recently had a similar issue on one of my clusters that might be related.  I 
found that when new OSDs were added to the cluster they were taking a long time 
to start.  This ended up being caused by the new OSDs needing to pull down a 
couple hundred thousand osdmaps from the mon nodes.  To see if you're also 
affected by this, try running the following command:

ceph report 2>/dev/null | jq '(.osdmap_last_committed - .osdmap_first_committed)'

This number should be between 500 and 1,000 on a healthy cluster.  I've seen it as 
high as 4.8 million before (roughly 50% of the data stored on the cluster ended 
up being osdmaps!).
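
If you want to keep an eye on whether the mons are actually trimming that 
backlog, a quick loop like this works (just a sketch; the 60-second interval is 
arbitrary):

while true; do
    ceph report 2>/dev/null | jq '(.osdmap_last_committed - .osdmap_first_committed)'
    sleep 60
done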

If you're curious how large a single osdmap is, you can run this command to 
save the current osdmap to a file:

ceph osd getmap -o [filename]
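
For example (the filename here is just a placeholder), you can check the size of 
the resulting file and, if you're curious, decode it with osdmaptool:

ceph osd getmap -o /tmp/osdmap.bin
ls -lh /tmp/osdmap.bin
osdmaptool /tmp/osdmap.bin --print | head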

This appears to be a bug that should be fixed in the latest Ceph releases 
(Quincy 17.2.8 and Reef 18.2.4), based on this tracker report:

https://tracker.ceph.com/issues/63883

In the meantime, if you are seeing a large difference between the first and 
last committed osdmaps, you can usually clear that up by restarting each of the 
mon daemons sequentially, starting with the leader and waiting for it to 
rejoin the quorum before moving on to the next one.  Doing this on the cluster 
I mentioned above cut the startup time of the new OSDs from 10-20 minutes each 
down to seconds!
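
Roughly, that looks like this on a cephadm-managed cluster (a sketch, not exact 
commands for every deployment; mon.a is just an example name, and on 
non-cephadm clusters you'd restart the ceph-mon systemd units instead):

# find the current leader and restart it first
ceph quorum_status -f json | jq -r '.quorum_leader_name'
ceph orch daemon restart mon.a

# wait until all mons are back in quorum before moving to the next one
ceph quorum_status -f json | jq -r '.quorum_names'

# after cycling through all the mons, re-check the osdmap range
ceph report 2>/dev/null | jq '(.osdmap_last_committed - .osdmap_first_committed)'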

Bryan

From: Gregory Orange <gregory.ora...@pawsey.org.au>
Date: Thursday, January 23, 2025 at 01:52
To: ceph-users@ceph.io <ceph-users@ceph.io>
Subject: [ceph-users] Re: Slow initial boot of OSDs in large cluster with 
unclean state
Sometimes starting an OSD can take up to 20 minutes, so there may be
some shared experience there. However, apart from a harrowing period
last year[1] we live in HEALTH_OK most of the time.
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

Reply via email to