Hello
I have a test Ceph 16.2.5 (Pacific) cluster deployed with cephadm on 7 bare-metal
Ubuntu 20.04 LTS nodes. I just upgraded each node's kernel and performed a
rolling reboot, and now the ceph -s output seems stuck and the manager
service is only deployed to two nodes instead of three. Here is the
ceph -s output:
  cluster:
    id:     fb48d256-f43d-11eb-9f74-7fd39d4b232a
    health: HEALTH_WARN
            OSD count 1 < osd_pool_default_size 3

  services:
    mon: 2 daemons, quorum ceph1a,ceph1c (age 25m)
    mgr: ceph1a.guidwn(active, since 25m), standbys: ceph1c.bttxuu
    osd: 1 osds: 1 up (since 30m), 1 in (since 3w)

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 B
    usage:   5.3 MiB used, 7.0 TiB / 7.0 TiB avail
    pgs:

  progress:
    Updating crash deployment (-1 -> 6) (0s)
      [............................]
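
For reference, the rolling reboot was done one node at a time, roughly like
this (simplified; the noout handling is just my usual procedure, not an exact
transcript of what I typed):

$ ceph osd set noout                                   # avoid rebalancing while a node is down
$ ssh ceph1b 'apt update && apt full-upgrade -y && reboot'
$ ceph orch ps ceph1b                                  # wait until the node's daemons are back before the next one
$ ceph osd unset noout                                 # only after the last node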
Please ignore the HEALTH_WARN about the OSD count; I have not finished
deploying all 3 OSDs yet. But you can see that the progress bar is stuck and I
have only 2 managers; the third manager does not seem to start, as can be seen here:
$ ceph orch ps|grep stopped
mon.ceph1b   ceph1b   stopped   4m ago   4w   -   2048M   <unknown>   <unknown>   <unknown>
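
I have not yet dug into the daemon on ceph1b itself; my plan would be
something along these lines (assuming the standard cephadm systemd unit
naming with the cluster fsid):

$ ssh ceph1b
$ sudo cephadm ls | grep -A 5 mon.ceph1b       # cephadm's view of the stopped daemon
$ sudo systemctl status ceph-fb48d256-f43d-11eb-9f74-7fd39d4b232a@mon.ceph1b
$ sudo journalctl -u ceph-fb48d256-f43d-11eb-9f74-7fd39d4b232a@mon.ceph1b -n 50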
It looks like the orchestrator is stuck and does not continue its job. Any
idea how I can get it unstuck?
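
The only ideas I have so far would be to fail over the active mgr and/or start
the stopped daemon through the orchestrator, something like the commands
below, but I am not sure whether that is safe or will actually help (and I am
not certain the progress clear command is available in 16.2.5):

$ ceph mgr fail ceph1a.guidwn             # force a mgr failover so the orchestrator module restarts
$ ceph orch daemon start mon.ceph1b       # try to start the stopped daemon again
$ ceph progress clear                     # drop the stuck progress item, if the command exists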
Best regards,
Mabi