Good morning everyone!
Today we had an atypical situation in our cluster: all three machines shut down.
After powering them back on, the cluster came up and formed quorum with no problems,
but all of the PGs are stuck peering and I don't see any disk activity on the machines.
No PG is active.
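
For reference, this is roughly how I intend to dig into a single stuck PG from the cephadm shell (taking pg 6.c3 from the health detail below as an example; as I understand it, the "recovery_state" section of the query output shows which step of peering the PG is waiting on):

ceph pg dump_stuck inactive   # list every PG that is not active
ceph pg 6.c3 query            # "recovery_state" shows what this PG is waiting for
ceph osd blocked-by           # OSDs that other OSDs are waiting on during peering
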
[ceph: root@dcs1 /]# ceph osd tree
ID  CLASS  WEIGHT    TYPE NAME       STATUS  REWEIGHT  PRI-AFF
-1         98.24359  root default
-3         32.74786      host dcs1
 0    hdd   2.72899          osd.0       up   1.00000  1.00000
 1    hdd   2.72899          osd.1       up   1.00000  1.00000
 2    hdd   2.72899          osd.2       up   1.00000  1.00000
 3    hdd   2.72899          osd.3       up   1.00000  1.00000
 4    hdd   2.72899          osd.4       up   1.00000  1.00000
 5    hdd   2.72899          osd.5       up   1.00000  1.00000
 6    hdd   2.72899          osd.6       up   1.00000  1.00000
 7    hdd   2.72899          osd.7       up   1.00000  1.00000
 8    hdd   2.72899          osd.8       up   1.00000  1.00000
 9    hdd   2.72899          osd.9       up   1.00000  1.00000
10    hdd   2.72899          osd.10      up   1.00000  1.00000
11    hdd   2.72899          osd.11      up   1.00000  1.00000
-5         32.74786      host dcs2
12    hdd   2.72899          osd.12      up   1.00000  1.00000
13    hdd   2.72899          osd.13      up   1.00000  1.00000
14    hdd   2.72899          osd.14      up   1.00000  1.00000
15    hdd   2.72899          osd.15      up   1.00000  1.00000
16    hdd   2.72899          osd.16      up   1.00000  1.00000
17    hdd   2.72899          osd.17      up   1.00000  1.00000
18    hdd   2.72899          osd.18      up   1.00000  1.00000
19    hdd   2.72899          osd.19      up   1.00000  1.00000
20    hdd   2.72899          osd.20      up   1.00000  1.00000
21    hdd   2.72899          osd.21      up   1.00000  1.00000
22    hdd   2.72899          osd.22      up   1.00000  1.00000
23    hdd   2.72899          osd.23      up   1.00000  1.00000
-7         32.74786      host dcs3
24    hdd   2.72899          osd.24      up   1.00000  1.00000
25    hdd   2.72899          osd.25      up   1.00000  1.00000
26    hdd   2.72899          osd.26      up   1.00000  1.00000
27    hdd   2.72899          osd.27      up   1.00000  1.00000
28    hdd   2.72899          osd.28      up   1.00000  1.00000
29    hdd   2.72899          osd.29      up   1.00000  1.00000
30    hdd   2.72899          osd.30      up   1.00000  1.00000
31    hdd   2.72899          osd.31      up   1.00000  1.00000
32    hdd   2.72899          osd.32      up   1.00000  1.00000
33    hdd   2.72899          osd.33      up   1.00000  1.00000
34    hdd   2.72899          osd.34      up   1.00000  1.00000
35    hdd   2.72899          osd.35      up   1.00000  1.00000
[ceph: root@dcs1 /]# ceph -s
  cluster:
    id:     58bbb950-538b-11ed-b237-2c59e53b80cc
    health: HEALTH_WARN
            4 filesystems are degraded
            4 MDSs report slow metadata IOs
            Reduced data availability: 1153 pgs inactive, 1101 pgs peering
            26 slow ops, oldest one blocked for 563 sec, daemons [osd.10,osd.13,osd.14,osd.15,osd.16,osd.18,osd.20,osd.21,osd.24,osd.25]... have slow ops.

  services:
    mon: 3 daemons, quorum dcs1.evocorp,dcs2,dcs3 (age 7m)
    mgr: dcs1.evocorp.kyqfcd(active, since 15m), standbys: dcs2.rirtyl
    mds: 4/4 daemons up, 4 standby
    osd: 36 osds: 36 up (since 6m), 36 in (since 47m); 65 remapped pgs

  data:
    volumes: 0/4 healthy, 4 recovering
    pools:   10 pools, 1153 pgs
    objects: 254.72k objects, 994 GiB
    usage:   2.8 TiB used, 95 TiB / 98 TiB avail
    pgs:     100.000% pgs not active
             1036 peering
             65   remapped+peering
             52   activating
[ceph: root@dcs1 /]# ceph health detail
HEALTH_WARN 4 filesystems are degraded; 4 MDSs report slow metadata IOs; Reduced data availability: 1153 pgs inactive, 1101 pgs peering; 26 slow ops, oldest one blocked for 673 sec, daemons [osd.10,osd.13,osd.14,osd.15,osd.16,osd.18,osd.20,osd.21,osd.24,osd.25]... have slow ops.
[WRN] FS_DEGRADED: 4 filesystems are degraded
fs dc_ovirt is degraded
fs dc_iso is degraded
fs dc_sas is degraded
fs pool_tester is degraded
[WRN] MDS_SLOW_METADATA_IO: 4 MDSs report slow metadata IOs
mds.dc_sas.dcs1.wbyuik(mds.0): 4 slow metadata IOs are blocked > 30 secs, oldest blocked for 1063 secs
mds.dc_ovirt.dcs1.lpcazs(mds.0): 4 slow metadata IOs are blocked > 30 secs, oldest blocked for 1058 secs
mds.pool_tester.dcs1.ixkkfs(mds.0): 4 slow metadata IOs are blocked > 30 secs, oldest blocked for 1058 secs
mds.dc_iso.dcs1.jxqqjd(mds.0): 4 slow metadata IOs are blocked > 30 secs, oldest blocked for 1058 secs
[WRN] PG_AVAILABILITY: Reduced data availability: 1153 pgs inactive, 1101 pgs peering
pg 6.c3 is stuck inactive for 50m, current state peering, last acting [30,15,11]
pg 6.c4 is stuck peering for 10h, current state peering, last acting [12,0,26]
pg 6.c5 is stuck peering for 10h, current state peering, last acting [12,32,6]
pg 6.c6 is stuck peering for 11h, current state peering, last acting [30,4,22]
pg 6.c7 is stuck peering for 10h, current state peering, last acting [4,14,26]
pg 6.c8 is stuck peering for 10h, current state peering, last acting [0,22,32]
pg 6.c9 is stuck peering for 11h, current state peering, last acting [32,20,0]
pg 6.ca is stuck peering for 11h, current state peering, last acting [31,0,23]
pg 6.cb is stuck peering for 10h, current state peering, last acting [8,35,16]
pg 6.cc is stuck peering for 10h, current state peering, last acting [8,24,13]
pg 6.cd is stuck peering for 10h, current state peering, last acting [15,25,1]
pg 6.ce is stuck peering for 11h, current state peering, last acting [27,23,4]
pg 6.cf is stuck peering for 11h, current state peering, last acting [25,4,20]
pg 7.c4 is stuck peering for 11m, current state remapped+peering, last acting [19,8]
pg 7.c5 is stuck peering for 10h, current state peering, last acting [6,14,32]
pg 7.c6 is stuck peering for 10h, current state peering, last acting [14,35,5]
pg 7.c7 is stuck peering for 10h, current state remapped+peering, last acting [11,14]
pg 7.c8 is stuck peering for 10h, current state peering, last acting [21,9,28]
pg 7.c9 is stuck peering for 10h, current state peering, last acting [0,30,15]
pg 7.ca is stuck peering for 10h, current state peering, last acting [23,2,26]
pg 7.cb is stuck peering for 10h, current state peering, last acting [23,9,24]
pg 7.cc is stuck peering for 10h, current state peering, last acting [23,27,0]
pg 7.cd is stuck peering for 11m, current state remapped+peering, last acting [13,6]
pg 7.ce is stuck peering for 10h, current state peering, last acting [16,1,25]
pg 7.cf is stuck peering for 11h, current state peering, last acting [24,16,8]
pg 9.c0 is stuck peering for 10h, current state peering, last acting [21,28]
pg 9.c1 is stuck peering for 10h, current state peering, last acting [12,31]
pg 9.c2 is stuck peering for 10h, current state peering, last acting [6,27]
pg 9.c3 is stuck peering for 10h, current state peering, last acting [9,27]
pg 9.c4 is stuck peering for 50m, current state peering, last acting [17,34]
pg 9.c5 is stuck peering for 11h, current state peering, last acting [31,8]
pg 9.c6 is stuck peering for 10h, current state peering, last acting [1,29]
pg 9.c7 is stuck peering for 10h, current state peering, last acting [12,30]
pg 9.c8 is stuck peering for 11h, current state peering, last acting [26,3]
pg 9.c9 is stuck peering for 11h, current state peering, last acting [29,13]
pg 9.ca is stuck peering for 11h, current state peering, last acting [25,6]
pg 9.cb is stuck peering for 10h, current state peering, last acting [16,9]
pg 9.cc is stuck peering for 4h, current state peering, last acting [4,29]
pg 10.c0 is stuck peering for 11h, current state peering, last acting [32,19]
pg 10.c1 is stuck peering for 10h, current state peering, last acting [23,6]
pg 10.c2 is stuck peering for 11h, current state peering, last acting [24,7]
pg 10.c3 is stuck peering for 38m, current state peering, last acting [5,20]
pg 10.c4 is stuck peering for 10h, current state peering, last acting [21,4]
pg 10.c5 is stuck peering for 10h, current state peering, last acting [12,8]
pg 10.c6 is stuck peering for 11h, current state peering, last acting [34,7]
pg 10.c7 is stuck peering for 10h, current state peering, last acting [17,30]
pg 10.c8 is stuck peering for 11h, current state peering, last acting [24,19]
pg 10.c9 is stuck inactive for 54m, current state activating, last acting [13,3]
pg 10.ca is stuck peering for 10h, current state peering, last acting [16,6]
pg 10.cb is stuck peering for 11h, current state peering, last acting [26,13]
pg 10.cf is stuck peering for 50m, current state peering, last acting [21,24]
[WRN] SLOW_OPS: 26 slow ops, oldest one blocked for 673 sec, daemons [osd.10,osd.13,osd.14,osd.15,osd.16,osd.18,osd.20,osd.21,osd.24,osd.25]... have slow ops.
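
Since all 36 OSDs report up/in but nothing finishes peering, my working assumption is that the OSDs are having trouble talking to each other rather than a data problem, so I also plan to look at the blocked ops themselves and at the cluster network between the hosts. The 8972-byte ping below assumes an MTU of 9000 on the OSD network; adjust the size if we are on 1500.

ceph daemon osd.10 dump_ops_in_flight   # run on dcs1, where osd.10 lives; shows what the slow ops are stuck on
ping -c 3 -M do -s 8972 dcs2            # from dcs1; fails if jumbo frames are configured but not passed end to end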