Hi Venky,
Thank you for your help. We managed to shut down mds.1 as follows:
We ran "ceph fs set max_mds 1" and waited for about 30 minutes. During the
first few minutes, strays were migrated from mds.1 to mds.0. After that, the
stray export hung and mds.1 remained in the stopping state. After about 30
minutes, we restarted mds.1, which left us with one active MDS and two standby
MDSs. However, we are not sure whether the remaining strays were migrated.
When we took a closer look at the perf counters of the MDS, we noticed that
strays_enqueued is quite high and constantly increasing. Is this to be
expected? What exactly does the counter "strays_enqueued" measure?
ceph daemon mds.0 perf dump | grep stray
"num_strays": 49846,
"num_strays_delayed": 21,
"num_strays_enqueuing": 0,
"strays_created": 2042124,
"strays_enqueued": 2396076,
"strays_reintegrated": 44207,
"strays_migrated": 38,
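To judge how fast these counters move, two perf dumps taken some time apart can be compared. The following is only a minimal Python sketch, assuming the output of "ceph daemon mds.0 perf dump" is available as JSON text. The helper names (stray_counters, counter_deltas), the "mds_cache" section name in the sample, and all values in the second sample are assumptions made up for illustration; they are not measurements from our cluster.

```python
import json


def stray_counters(perf_dump_text):
    """Collect every numeric counter whose name contains 'stray'.

    All sections of the dump are searched, so no particular section
    name is assumed.
    """
    dump = json.loads(perf_dump_text)
    found = {}
    for counters in dump.values():
        if not isinstance(counters, dict):
            continue
        for name, value in counters.items():
            # Skip nested values (e.g. averaged counters) to keep deltas simple.
            if "stray" in name and isinstance(value, (int, float)):
                found[name] = value
    return found


def counter_deltas(before, after):
    """Return after - before for every counter present in both samples."""
    return {name: after[name] - before[name] for name in after if name in before}


# First sample uses values from the output above; the second sample is invented.
sample_1 = json.dumps({"mds_cache": {"num_strays": 49846,
                                     "strays_created": 2042124,
                                     "strays_enqueued": 2396076}})
sample_2 = json.dumps({"mds_cache": {"num_strays": 49800,
                                     "strays_created": 2042200,
                                     "strays_enqueued": 2396200}})

deltas = counter_deltas(stray_counters(sample_1), stray_counters(sample_2))
for name in sorted(deltas):
    print(name, deltas[name])
```

If strays_enqueued grows at roughly the same rate as strays_created while num_strays stays flat or shrinks, that would suggest the counter is cumulative rather than a queue length, but this is our guess and not something we have confirmed.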
Would it be safe to run "ceph orch upgrade resume" at this point? At the
moment, the MONs and OSDs are running 17.2.6, while the MDSs and RGWs are
still running 17.2.5, so we will have to upgrade the MDSs and RGWs eventually.
Best, Tobias