Hi,
This cluster was deployed with cephadm 17.2.5, containerized.
It has ended up in this state (no active mgr):
[root@8cd2c0657c77 /]# ceph -s
  cluster:
    id:     ad3a132e-e9ee-11ed-8a19-043f72fb8bf9
    health: HEALTH_WARN
            6 hosts fail cephadm check
            no active mgr
            1/3 mons down, quorum h18w,h19w
            Degraded data redundancy: 781908/2345724 objects degraded (33.333%), 101 pgs degraded, 209 pgs undersized

  services:
    mon: 3 daemons, quorum h18w,h19w (age 19m), out of quorum: h15w
    mgr: no daemons active (since 5h)
    mds: 1/1 daemons up, 1 standby
    osd: 9 osds: 6 up (since 5h), 6 in (since 5h)
    rgw: 2 daemons active (2 hosts, 1 zones)

  data:
    volumes: 1/1 healthy
    pools:   8 pools, 209 pgs
    objects: 781.91k objects, 152 GiB
    usage:   312 GiB used, 54 TiB / 55 TiB avail
    pgs:     781908/2345724 objects degraded (33.333%)
             108 active+undersized
             101 active+undersized+degraded
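(For reference, regarding the "6 hosts fail cephadm check" warning: my understanding is that the per-host check can be rerun directly with the cephadm binary on each host, something like the sketch below. This is just a check, not a fix.)

  # run as root on each affected host
  cephadm check-host
  # list the daemons cephadm knows about on that host
  cephadm ls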
I checked h20w; a mgr container is running there, with this in its log:
debug 2023-05-10T12:43:23.315+0000 7f5e152ec000  0 monclient(hunting): authenticate timed out after 300
debug 2023-05-10T12:48:23.318+0000 7f5e152ec000  0 monclient(hunting): authenticate timed out after 300
debug 2023-05-10T12:53:23.318+0000 7f5e152ec000  0 monclient(hunting): authenticate timed out after 300
debug 2023-05-10T12:58:23.319+0000 7f5e152ec000  0 monclient(hunting): authenticate timed out after 300
debug 2023-05-10T13:03:23.319+0000 7f5e152ec000  0 monclient(hunting): authenticate timed out after 300
debug 2023-05-10T13:08:23.319+0000 7f5e152ec000  0 monclient(hunting): authenticate timed out after 300
debug 2023-05-10T13:13:23.319+0000 7f5e152ec000  0 monclient(hunting): authenticate timed out after 300
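(For completeness, this is roughly how that log can be read via cephadm on h20w; mgr.h20w.<suffix> is a placeholder, the exact daemon name comes from cephadm ls.)

  # find the exact mgr daemon name on h20w
  cephadm ls | grep mgr
  # follow its journal (args after -- are passed to journalctl)
  cephadm logs --name mgr.h20w.<suffix> -- -f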
Any ideas on how to get a mgr up and running again through cephadm?
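What I was thinking of trying, as a sketch only (this assumes the standard cephadm systemd unit naming, with the fsid being the cluster id from ceph -s above and <suffix> again a placeholder):

  # on h20w: restart the stuck mgr via its systemd unit
  systemctl restart ceph-ad3a132e-e9ee-11ed-8a19-043f72fb8bf9@mgr.h20w.<suffix>.service
  # and, since authentication to the mons is timing out, verify mon reachability from h20w
  cephadm shell -- ceph -s --connect-timeout 10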
Thanks,
Ben