Hi,
we had bad blocks on one OSD and, at around the same time, a network
switch outage, which seems to have caused some corruption in the mon service.
# ceph -s
  cluster:
    id:     d7c5c9c7-a227-4e33-ab43-3f4aa1eb0630
    health: HEALTH_WARN
            1 daemons have recently crashed
            14097 slow ops, oldest one blocked for 56417 sec, mon.server6 has slow ops
            mon server6 is low on available space

  services:
    mon: 3 daemons, quorum server6,server3,server5 (age 15h)
    mgr: server4(active, since 3w), standbys: server6, server5
    mds: xpool:1 {0=server6=up:active} 1 up:standby
    osd: 21 osds: 21 up (since 15h), 20 in (since 16h)

  data:
    pools:   17 pools, 941 pgs
    objects: 6.80M objects, 18 TiB
    usage:   34 TiB used, 20 TiB / 54 TiB avail
    pgs:     940 active+clean
             1   active+clean+scrubbing+deep

  io:
    client:   23 MiB/s rd, 980 KiB/s wr, 30 op/s rd, 141 op/s wr
14097 slow ops, oldest one blocked for 56417 sec, mon.server6 has slow ops
The mon ops log looks like:
https://gist.github.com/poelzi/45f31f26f6a83f6406bb43553e0c237a
It seems that the MDS transactions never finish; they appear to be stuck
waiting for the mdsmap. On the MDS server there are no ops in flight and
no errors in the log file.
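
To get an overview of where the ops are stuck, a minimal sketch like the following could group them by their most recent event. It assumes the dump is the JSON printed by `ceph daemon mon.server6 ops`, with a top-level `ops` list whose entries carry `age` and `type_data.events`; these field names are an assumption and may differ between Ceph releases. The function names are made up for illustration.

```python
from collections import Counter


def summarize_ops(dump):
    """Count in-flight ops by the name of their most recent event.

    `dump` is the parsed JSON of a mon ops dump. The layout
    (ops -> type_data -> events -> event) is an assumption.
    """
    counts = Counter()
    for op in dump.get("ops", []):
        type_data = op.get("type_data", {})
        # Be tolerant of layouts where type_data is not a dict.
        events = type_data.get("events", []) if isinstance(type_data, dict) else []
        last = events[-1].get("event", "unknown") if events else "no events"
        counts[last] += 1
    return counts


def oldest_op(dump):
    """Return the op with the largest `age`, or None if there are no ops."""
    ops = dump.get("ops", [])
    return max(ops, key=lambda op: op.get("age", 0.0), default=None)
```

With the dump saved to a file and loaded via `json.load`, `summarize_ops(dump).most_common(5)` would list the five most frequent last events, which should show whether most ops really are waiting on the same step.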
What is the proper way to repair this?

Kind regards
poelzi