Hello
I have a test Ceph 16.2.5 (Pacific) cluster deployed with cephadm on 7 bare-metal
Ubuntu 20.04 LTS nodes. I just upgraded each node's kernel and performed a
rolling reboot, and now the ceph -s output seems stuck and the manager
service is only deployed to two nodes instead of three. Here is the
ceph -s output:
  cluster:
    id:     fb48d256-f43d-11eb-9f74-7fd39d4b232a
    health: HEALTH_WARN
            OSD count 1 < osd_pool_default_size 3

  services:
    mon: 2 daemons, quorum ceph1a,ceph1c (age 25m)
    mgr: ceph1a.guidwn(active, since 25m), standbys: ceph1c.bttxuu
    osd: 1 osds: 1 up (since 30m), 1 in (since 3w)

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 B
    usage:   5.3 MiB used, 7.0 TiB / 7.0 TiB avail
    pgs:

  progress:
    Updating crash deployment (-1 -> 6) (0s)
      [............................]
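
For reference, the rolling reboot was done one node at a time, roughly like
this (simplified; the noout handling is just my usual procedure, not an exact
transcript of what I typed):

$ ceph osd set noout                                   # avoid rebalancing while a node is down
$ ssh ceph1b 'apt update && apt full-upgrade -y && reboot'
$ ceph orch ps ceph1b                                  # wait until the node's daemons are back before the next one
$ ceph osd unset noout                                 # only after the last node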
Please ignore the HEALTH_WARN about the OSD count; I have not finished
deploying all 3 OSDs yet. But you can see that the progress bar is stuck and I
have only 2 managers; the third manager does not seem to start, as can be seen here:
$ ceph orch ps|grep stopped
mon.ceph1b   ceph1b   stopped   4m ago   4w   -   2048M   <unknown>   <unknown>   <unknown>
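
I have not yet dug into the daemon on ceph1b itself; my plan would be
something along these lines (assuming the standard cephadm systemd unit
naming with the cluster fsid):

$ ssh ceph1b
$ sudo cephadm ls | grep -A 5 mon.ceph1b       # cephadm's view of the stopped daemon
$ sudo systemctl status ceph-fb48d256-f43d-11eb-9f74-7fd39d4b232a@mon.ceph1b
$ sudo journalctl -u ceph-fb48d256-f43d-11eb-9f74-7fd39d4b232a@mon.ceph1b -n 50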
It looks like the orchestrator is stuck and does not continue its job. Any
idea how I can get it unstuck?
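
The only ideas I have so far would be to fail over the active mgr and/or start
the stopped daemon through the orchestrator, something like the commands
below, but I am not sure whether that is safe or will actually help (and I am
not certain the progress clear command is available in 16.2.5):

$ ceph mgr fail ceph1a.guidwn             # force a mgr failover so the orchestrator module restarts
$ ceph orch daemon start mon.ceph1b       # try to start the stopped daemon again
$ ceph progress clear                     # drop the stuck progress item, if the command exists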
Best regards,
Mabi