Hi Ceph Admins,
Last night all pools in our Ceph cluster became 100% full. This happened after
osd.56 (95% used) reached the OSD_FULL state.
Ceph version: 12.2.2
Logs
2018-03-03 17:15:22.560710 mon.cephnode01 mon.0 10.212.32.18:6789/0 5224452 : cluster [ERR] overall HEALTH_ERR noscrub,nodeep-scrub flag(s) set; 1 backfillfull osd(s); 5 nearfull osd(s); 21 pool(s) backfillfull; 638551/287271738 objects misplaced (0.222%); Degraded data redundancy: 253066/287271738 objects degraded (0.088%), 25 pgs unclean; Degraded data redundancy (low space): 25 pgs backfill_toofull
2018-03-03 17:15:42.513194 mon.cephnode01 mon.0 10.212.32.18:6789/0 5224515 : cluster [WRN] Health check update: 638576/287284518 objects misplaced (0.222%) (OBJECT_MISPLACED)
2018-03-03 17:15:42.513256 mon.cephnode01 mon.0 10.212.32.18:6789/0 5224516 : cluster [WRN] Health check update: Degraded data redundancy: 253266/287284518 objects degraded (0.088%), 25 pgs unclean (PG_DEGRADED)
2018-03-03 17:15:44.684928 mon.cephnode01 mon.0 10.212.32.18:6789/0 5224524 : cluster [ERR] Health check failed: 1 full osd(s) (OSD_FULL)
2018-03-03 17:15:44.684969 mon.cephnode01 mon.0 10.212.32.18:6789/0 5224525 : cluster [WRN] Health check failed: 21 pool(s) full (POOL_FULL)
2018-03-03 17:15:44.684987 mon.cephnode01 mon.0 10.212.32.18:6789/0 5224526 : cluster [INF] Health check cleared: OSD_BACKFILLFULL (was: 1 backfillfull osd(s))
2018-03-03 17:15:44.685013 mon.cephnode01 mon.0 10.212.32.18:6789/0 5224527 : cluster [INF] Health check cleared: POOL_BACKFILLFULL (was: 21 pool(s) backfillfull)
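The health checks in the log above follow the per-OSD fullness thresholds (nearfull, backfillfull, full). A minimal sketch of that classification, assuming the Luminous default ratios of 0.85 / 0.90 / 0.95; the values actually in effect are stored in the OSD map and may differ on this cluster, and the exact boundary comparison (>= versus >) is also an assumption:

```python
# Assumed Luminous defaults, NOT read from this cluster.
# The ratios in effect are shown by `ceph osd dump | grep ratio`.
NEARFULL_RATIO = 0.85
BACKFILLFULL_RATIO = 0.90
FULL_RATIO = 0.95


def classify(used_fraction):
    """Return the fullness state of one OSD for a used fraction in 0..1."""
    if used_fraction >= FULL_RATIO:
        return "full"
    if used_fraction >= BACKFILLFULL_RATIO:
        return "backfillfull"
    if used_fraction >= NEARFULL_RATIO:
        return "nearfull"
    return "ok"


# osd.56 at the time of the incident (95%) and after the reweight (89.89%).
print(classify(0.95))
print(classify(0.8989))
```

Under these assumed ratios, osd.56 moves from the full state back to nearfull once its utilization drops below 90%.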
# ceph df detail from the time of the crash
GLOBAL:
    SIZE AVAIL RAW USED %RAW USED OBJECTS
    381T  102T     278T     73.05  38035k
POOLS:
    NAME                        ID QUOTA OBJECTS QUOTA BYTES    USED  %USED MAX AVAIL  OBJECTS  DIRTY   READ  WRITE RAW USED
    rbd                          0           N/A         N/A       0      0         0        0      0      1   134k        0
    vms                          1           N/A         N/A       0      0         0        0      0      0      0        0
    images                       2           N/A         N/A   7659M 100.00         0     1022   1022   110k   5668   22977M
    volumes                      3           N/A         N/A  40991G 100.00         0 10514980 10268k  3404M  4087M     120T
    .rgw.root                    4           N/A         N/A    1588 100.00         0        4      4   402k      4     4764
    default.rgw.control          5           N/A         N/A       0      0         0        8      8      0      0        0
    default.rgw.data.root        6           N/A         N/A   94942 100.00         0      339    339   257k   6422     278k
    default.rgw.gc               7           N/A         N/A       0      0         0       32     32  3125M  7410k        0
    default.rgw.log              8           N/A         N/A       0      0         0      186    186 27222k 18146k        0
    default.rgw.users.uid        9           N/A         N/A    4252 100.00         0       17     17   262k  64561    12756
    default.rgw.usage           10           N/A         N/A       0      0         0        8      8   332k   665k        0
    default.rgw.users.email     11           N/A         N/A      87 100.00         0        4      4      0      4      261
    default.rgw.users.keys      12           N/A         N/A     206 100.00         0       11     11    459     23      618
    default.rgw.users.swift     13           N/A         N/A      40 100.00         0        3      3      0      3      120
    default.rgw.buckets.index   14           N/A         N/A       0      0         0      210    210   321M 41709k        0
    default.rgw.buckets.non-ec  16           N/A         N/A       0      0         0      114    114  18006  12055        0
    default.rgw.buckets.extra   17           N/A         N/A       0      0         0        0      0      0      0        0
    .rgw.buckets.extra          18           N/A         N/A       0      0         0        0      0      0      0        0
    default.rgw.buckets.data    20           N/A         N/A    104T 100.00         0 28334451 27670k   160M   156M     156T
    benchmark_replicated        21           N/A         N/A  87136M 100.00         0    21792  21792  1450k  4497k     255G
    benchmark_erasure_coded     22           N/A         N/A    292G 100.00         0    74779  74779  61288   680k     439G
#
What we did to reclaim some space:
- deleted the two benchmark pools
- reweighted the full osd.56 from 1.0 to 0.85
- added a new node, cephnode10 (the cluster has grown from 9 to 10 nodes, but I
had to crush reweight the new OSDs down to 0 because a lot of slow requests
(3000+) occurred and customer IOPS dropped sharply. I am now adding one OSD at
a time)
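As an illustration of the last step, here is a minimal sketch that only prints `ceph osd crush reweight` commands for raising a new OSD's CRUSH weight in increments instead of in one jump. This is a variant of the one-OSD-at-a-time approach described above, not what was actually run; the helper name `reweight_steps` and the step size are hypothetical, and each command would presumably be issued only after backfill from the previous step has settled.

```python
def reweight_steps(osd_id, target_weight, step=0.5):
    """Return `ceph osd crush reweight` commands that raise an OSD's CRUSH
    weight from 0 to target_weight in increments of `step` (hypothetical helper)."""
    commands = []
    weight = 0.0
    while weight < target_weight:
        # Never overshoot the final weight on the last step.
        weight = min(round(weight + step, 5), target_weight)
        commands.append(f"ceph osd crush reweight osd.{osd_id} {weight:.5f}")
    return commands


# osd.109 on cephnode10, final weight as used for osd.108 in the tree below.
for cmd in reweight_steps(109, 3.63640, step=1.0):
    print(cmd)
```

The script does not touch the cluster; it just generates the command lines so they can be reviewed and run by hand.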
Current status
# ceph -s
  cluster:
    id:     1023c49f-3a10-42de-9f62-9b122db32f1f
    health: HEALTH_ERR
            noscrub,nodeep-scrub flag(s) set
            5 nearfull osd(s)
            19 pool(s) nearfull
            16151257/286563963 objects misplaced (5.636%)
            Degraded data redundancy: 20949/286563963 objects degraded (0.007%), 431 pgs unclean, 28 pgs degraded, 1 pg undersized
            Degraded data redundancy (low space): 15 pgs backfill_toofull
  services:
    mon: 3 daemons, quorum cephnode01,cephnode02,cephnode03
    mgr: cephnode02(active), standbys: cephnode03, cephnode01
    osd: 120 osds: 117 up, 117 in; 405 remapped pgs
         flags noscrub,nodeep-scrub
    rgw: 3 daemons active
  data:
    pools:   19 pools, 3760 pgs
    objects: 37941k objects, 144 TB
    usage:   278 TB used, 146 TB / 425 TB avail
    pgs:     20949/286563963 objects degraded (0.007%)
             16151257/286563963 objects misplaced (5.636%)
             3329 active+clean
             370  active+remapped+backfill_wait
             26   active+recovery_wait+degraded
             18   active+remapped+backfilling
             15   active+remapped+backfill_wait+backfill_toofull
             1    active+recovery_wait+degraded+remapped
             1    active+undersized+degraded+remapped+backfilling
  io:
    client:   18337 B/s rd, 29269 kB/s wr, 1 op/s rd, 234 op/s wr
    recovery: 946 MB/s, 243 objects/s
#
# ceph df detail
GLOBAL:
    SIZE AVAIL RAW USED %RAW USED OBJECTS
    425T  146T     278T     65.50  37941k
POOLS:
    NAME                        ID QUOTA OBJECTS QUOTA BYTES    USED  %USED MAX AVAIL  OBJECTS  DIRTY   READ  WRITE RAW USED
    rbd                          0           N/A         N/A       0      0     7415G        0      0      1   134k        0
    vms                          1           N/A         N/A       0      0     7415G        0      0      0      0        0
    images                       2           N/A         N/A   7659M   0.10     7415G     1022   1022   110k   5668   22445M
    volumes                      3           N/A         N/A  40992G  84.68     7415G 10515231 10268k  3416M  4090M     120T
    .rgw.root                    4           N/A         N/A    1588      0     7415G        4      4   141k      4     4764
    default.rgw.control          5           N/A         N/A       0      0     7415G        8      8      0      0        0
    default.rgw.data.root        6           N/A         N/A   94942      0     7415G      339    339   257k   6422     278k
    default.rgw.gc               7           N/A         N/A       0      0     7415G       32     32  3125M  7430k        0
    default.rgw.log              8           N/A         N/A       0      0     7415G      186    186 27249k 18164k        0
    default.rgw.users.uid        9           N/A         N/A    4252      0     7415G       17     17   263k  64577    12756
    default.rgw.usage           10           N/A         N/A       0      0     7415G        8      8   332k   665k        0
    default.rgw.users.email     11           N/A         N/A      87      0     7415G        4      4      0      4      261
    default.rgw.users.keys      12           N/A         N/A     206      0     7415G       11     11    483     23      580
    default.rgw.users.swift     13           N/A         N/A      40      0     7415G        3      3      0      3      120
    default.rgw.buckets.index   14           N/A         N/A       0      0     7415G      210    210   321M 41709k        0
    default.rgw.buckets.non-ec  16           N/A         N/A       0      0     7415G      114    114  18006  12055        0
    default.rgw.buckets.extra   17           N/A         N/A       0      0     7415G        0      0      0      0        0
    .rgw.buckets.extra          18           N/A         N/A       0      0     7415G        0      0      0      0        0
    default.rgw.buckets.data    20           N/A         N/A    104T  87.85    14831G 28334711 27670k   160M   156M     157T
#
Most utilized pools are: volumes (replicated pool) and
default.rgw.buckets.data (EC pool, k=6,m=3)
pool 3 'volumes' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 1024 pgp_num 1024 last_change 10047 flags hashpspool,backfillfull stripe_width 0 application rbd
        removed_snaps [1~3]
pool 20 'default.rgw.buckets.data' erasure size 9 min_size 6 crush_rule 1 object_hash rjenkins pg_num 1024 pgp_num 1024 last_change 10047 flags hashpspool,backfillfull stripe_width 4224 application rgw
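The raw-space overhead of the two pools can be checked against the RAW USED column of `ceph df detail`. A small sketch, assuming the usual accounting (a replicated pool stores `size` full copies, an erasure-coded pool stores (k+m)/k times the logical data); the function names are only illustrative:

```python
def raw_multiplier_replicated(size):
    """Raw bytes stored per logical byte in a replicated pool."""
    return float(size)


def raw_multiplier_ec(k, m):
    """Raw bytes stored per logical byte in an erasure-coded pool."""
    return (k + m) / k


volumes_used_g = 40992   # USED of 'volumes' (G), from the current ceph df detail
rgw_data_used_t = 104    # USED of 'default.rgw.buckets.data' (T), rounded by ceph df

volumes_raw_t = volumes_used_g * raw_multiplier_replicated(3) / 1024
rgw_raw_t = rgw_data_used_t * raw_multiplier_ec(6, 3)

print(f"volumes raw:  {volumes_raw_t:.1f} T")   # ceph df reports 120T
print(f"rgw data raw: {rgw_raw_t:.1f} T")       # ceph df reports 157T (USED is rounded)
```

So the replicated pool costs 3x and the k=6, m=3 pool costs 1.5x its logical size in raw space, which roughly matches the reported RAW USED values.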
Crush rules for above pools:
# rules
rule replicated_ruleset {
        id 0
        type replicated
        min_size 1
        max_size 10
        step take default
        step chooseleaf firstn 0 type rack   # !!! rack as failure domain
        step emit
}
rule ec_rule_k6_m3 {
        id 1
        type erasure
        min_size 3
        max_size 9
        step set_chooseleaf_tries 5
        step set_choose_tries 100
        step take default
        step chooseleaf indep 0 type host   # !!! host as failure domain
        step emit
}
And finally cluster topology
# ceph osd df tree
 ID CLASS    WEIGHT REWEIGHT   SIZE    USE  AVAIL  %USE  VAR PGS TYPE NAME
 -1       392.72797        -   425T   278T   146T 65.51 1.00   - root default
 -6       392.72797        -   425T   278T   146T 65.51 1.00   -     region region01
 -5       392.72797        -   425T   278T   146T 65.51 1.00   -         datacenter dc01
 -4       392.72797        -   425T   278T   146T 65.51 1.00   -             room room01
 -8        43.63699        - 44684G 31703G 12980G 70.95 1.08   -                 rack rack01
 -7        43.63699        - 44684G 31703G 12980G 70.95 1.08   -                     host cephnode01
  0 hdd     3.63599  1.00000  3723G  2957G   765G 79.43 1.21 178                         osd.0
  2 hdd     3.63599  1.00000  3723G  2407G  1315G 64.66 0.99 157                         osd.2
  4 hdd     3.63599  1.00000  3723G  2980G   742G 80.05 1.22 184                         osd.4
  6 hdd     3.63599  1.00000  3723G  2768G   955G 74.34 1.13 170                         osd.6
  8 hdd     3.63599  1.00000  3723G  2704G  1019G 72.62 1.11 172                         osd.8
 11 hdd     3.63599  1.00000  3723G  2899G   824G 77.87 1.19 181                         osd.11
 12 hdd     3.63599  1.00000  3723G  2788G   935G 74.89 1.14 183                         osd.12
 14 hdd     3.63599  1.00000  3723G  2139G  1584G 57.44 0.88 139                         osd.14
 16 hdd     3.63599  1.00000  3723G  2672G  1050G 71.78 1.10 174                         osd.16
 18 hdd     3.63599  1.00000  3723G  2575G  1148G 69.17 1.06 166                         osd.18
 20 hdd     3.63599  1.00000  3723G  2395G  1328G 64.33 0.98 149                         osd.20
 22 hdd     3.63599  1.00000  3723G  2414G  1309G 64.83 0.99 161                         osd.22
 -3        43.63699        - 44684G 32329G 12354G 72.35 1.10   -                 rack rack02
 -2        43.63699        - 44684G 32329G 12354G 72.35 1.10   -                     host cephnode02
  1 hdd     3.63599  1.00000  3723G  2874G   848G 77.21 1.18 172                         osd.1
  3 hdd     3.63599  1.00000  3723G  3287G   436G 88.27 1.35 190                         osd.3
  5 hdd     3.63599  1.00000  3723G  2588G  1135G 69.50 1.06 151                         osd.5
  7 hdd     3.63599  1.00000  3723G  2566G  1156G 68.94 1.05 156                         osd.7
  9 hdd     3.63599  1.00000  3723G  2481G  1242G 66.65 1.02 164                         osd.9
 10 hdd     3.63599  1.00000  3723G  2622G  1101G 70.43 1.08 156                         osd.10
 13 hdd     3.63599  1.00000  3723G  2498G  1225G 67.08 1.02 150                         osd.13
 15 hdd     3.63599  1.00000  3723G  2664G  1058G 71.56 1.09 167                         osd.15
 17 hdd     3.63599  1.00000  3723G  2510G  1213G 67.42 1.03 163                         osd.17
 19 hdd     3.63599  1.00000  3723G  2562G  1161G 68.82 1.05 162                         osd.19
 21 hdd     3.63599  1.00000  3723G  2683G  1040G 72.05 1.10 169                         osd.21
 23 hdd     3.63599  1.00000  3723G  2989G   734G 80.28 1.23 169                         osd.23
-10        43.63699        - 44684G 32556G 12128G 72.86 1.11   -                 rack rack03
 -9        43.63699        - 44684G 32556G 12128G 72.86 1.11   -                     host cephnode03
 24 hdd     3.63599  1.00000  3723G  2757G   966G 74.05 1.13 155                         osd.24
 25 hdd     3.63599  1.00000  3723G  3003G   720G 80.66 1.23 186                         osd.25
 26 hdd     3.63599  1.00000  3723G  2494G  1229G 66.98 1.02 168                         osd.26
 28 hdd     3.63599  1.00000  3723G  3021G   701G 81.15 1.24 180                         osd.28
 30 hdd     3.63599  1.00000  3723G  2554G  1169G 68.60 1.05 164                         osd.30
 32 hdd     3.63599  1.00000  3723G  2060G  1662G 55.34 0.84 147                         osd.32
 34 hdd     3.63599  1.00000  3723G  3131G   592G 84.08 1.28 181                         osd.34
 36 hdd     3.63599  1.00000  3723G  2512G  1211G 67.47 1.03 162                         osd.36
 38 hdd     3.63599  1.00000  3723G  2408G  1315G 64.68 0.99 157                         osd.38
 40 hdd     3.63599  1.00000  3723G  2997G   726G 80.49 1.23 194                         osd.40
 42 hdd     3.63599  1.00000  3723G  2645G  1078G 71.05 1.08 161                         osd.42
 44 hdd     3.63599  1.00000  3723G  2969G   754G 79.74 1.22 173                         osd.44
-12        43.63699        - 44684G 32504G 12179G 72.74 1.11   -                 rack rack04
-11        43.63699        - 44684G 32504G 12179G 72.74 1.11   -                     host cephnode04
 27 hdd     3.63599  1.00000  3723G  2947G   775G 79.16 1.21 186                         osd.27
 29 hdd     3.63599  1.00000  3723G  3095G   628G 83.13 1.27 175                         osd.29
 31 hdd     3.63599  1.00000  3723G  2514G  1209G 67.52 1.03 163                         osd.31
 33 hdd     3.63599  1.00000  3723G  2557G  1166G 68.68 1.05 160                         osd.33
 35 hdd     3.63599  1.00000  3723G  3215G   508G 86.35 1.32 183                         osd.35
 37 hdd     3.63599  1.00000  3723G  2455G  1268G 65.93 1.01 151                         osd.37
 39 hdd     3.63599  1.00000  3723G  2335G  1387G 62.73 0.96 155                         osd.39
 41 hdd     3.63599  1.00000  3723G  2774G   949G 74.51 1.14 165                         osd.41
 43 hdd     3.63599  1.00000  3723G  2764G   959G 74.24 1.13 169                         osd.43
 45 hdd     3.63599  1.00000  3723G  2553G  1169G 68.59 1.05 163                         osd.45
 46 hdd     3.63599  1.00000  3723G  2645G  1077G 71.06 1.08 167                         osd.46
 47 hdd     3.63599  1.00000  3723G  2644G  1079G 71.02 1.08 156                         osd.47
-14        39.99585        - 33513G 27770G  5742G 82.86 1.26   -                 rack rack05
-13        39.99585        - 33513G 27770G  5742G 82.86 1.26   -                     host cephnode05
 48 hdd     3.63599  0.90002  3723G  3310G   413G 88.89 1.36 211                         osd.48
 49 hdd     3.63599  0.80005  3723G  3029G   694G 81.36 1.24 182                         osd.49
 50 hdd     3.63599  0.85004  3723G  2918G   804G 78.38 1.20 167                         osd.50
 51 hdd     3.63599  0.85004  3723G  3103G   620G 83.33 1.27 186                         osd.51
 52 hdd           0        0      0      0      0     0    0   0                         osd.52
 53 hdd     3.63599        0      0      0      0     0    0   0                         osd.53
 54 hdd     3.63599        0      0      0      0     0    0   0                         osd.54
 55 hdd     3.63599  0.85004  3723G  3003G   720G 80.65 1.23 178                         osd.55
 56 hdd     3.63599  0.84999  3723G  3347G   376G 89.89 1.37 189                         osd.56
 57 hdd     3.63599  0.75006  3723G  2707G  1016G 72.71 1.11 161                         osd.57
 58 hdd     3.63599  0.80005  3723G  3228G   495G 86.71 1.32 186                         osd.58
 59 hdd     3.63599  0.80005  3723G  3122G   601G 83.85 1.28 194                         osd.59
-16        43.63699        - 44684G 33402G 11281G 74.75 1.14   -                 rack rack06
-15        43.63699        - 44684G 33402G 11281G 74.75 1.14   -                     host cephnode06
 60 hdd     3.63599  1.00000  3723G  2317G  1406G 62.22 0.95 149                         osd.60
 61 hdd     3.63599  1.00000  3723G  3039G   684G 81.62 1.25 183                         osd.61
 62 hdd     3.63599  1.00000  3723G  2945G   778G 79.09 1.21 189                         osd.62
 63 hdd     3.63599  1.00000  3723G  2923G   800G 78.50 1.20 166                         osd.63
 64 hdd     3.63599  1.00000  3723G  3057G   665G 82.11 1.25 180                         osd.64
 65 hdd     3.63599  1.00000  3723G  2989G   733G 80.30 1.23 170                         osd.65
 66 hdd     3.63599  1.00000  3723G  2764G   959G 74.25 1.13 166                         osd.66
 67 hdd     3.63599  1.00000  3723G  2811G   912G 75.50 1.15 175                         osd.67
 68 hdd     3.63599  1.00000  3723G  1785G  1938G 47.95 0.73 139                         osd.68
 69 hdd     3.63599  1.00000  3723G  2744G   979G 73.69 1.12 159                         osd.69
 70 hdd     3.63599  1.00000  3723G  3068G   655G 82.40 1.26 178                         osd.70
 71 hdd     3.63599  1.00000  3723G  2956G   767G 79.40 1.21 174                         osd.71
-18        43.63699        - 44684G 33524G 11159G 75.03 1.15   -                 rack rack07
-17        43.63699        - 44684G 33524G 11159G 75.03 1.15   -                     host cephnode07
 72 hdd     3.63599  1.00000  3723G  2901G   822G 77.91 1.19 178                         osd.72
 73 hdd     3.63599  1.00000  3723G  2612G  1110G 70.16 1.07 168                         osd.73
 74 hdd     3.63599  1.00000  3723G  2870G   853G 77.09 1.18 172                         osd.74
 75 hdd     3.63599  1.00000  3723G  2813G   910G 75.56 1.15 169                         osd.75
 76 hdd     3.63599  1.00000  3723G  2861G   862G 76.85 1.17 170                         osd.76
 77 hdd     3.63599  1.00000  3723G  2807G   916G 75.39 1.15 168                         osd.77
 78 hdd     3.63599  1.00000  3723G  2678G  1045G 71.92 1.10 156                         osd.78
 79 hdd     3.63599  1.00000  3723G  2556G  1166G 68.67 1.05 160                         osd.79
 80 hdd     3.63599  1.00000  3723G  3082G   640G 82.79 1.26 190                         osd.80
 81 hdd     3.63599  1.00000  3723G  2418G  1305G 64.94 0.99 144                         osd.81
 82 hdd     3.63599  1.00000  3723G  2881G   841G 77.39 1.18 161                         osd.82
 83 hdd     3.63599  1.00000  3723G  3039G   683G 81.64 1.25 175                         osd.83
-20        90.91017        -   130T 61630G 72421G 45.98 0.70   -                 rack rack08
-19        43.63699        - 44684G 30861G 13823G 69.06 1.05   -                     host cephnode08
 84 hdd     3.63599  1.00000  3723G  2532G  1190G 68.02 1.04 157                         osd.84
 85 hdd     3.63599  1.00000  3723G  2518G  1205G 67.64 1.03 166                         osd.85
 86 hdd     3.63599  1.00000  3723G  2504G  1219G 67.25 1.03 151                         osd.86
 87 hdd     3.63599  1.00000  3723G  2698G  1024G 72.47 1.11 161                         osd.87
 88 hdd     3.63599  1.00000  3723G  2527G  1196G 67.87 1.04 147                         osd.88
 89 hdd     3.63599  1.00000  3723G  2508G  1215G 67.36 1.03 142                         osd.89
 90 hdd     3.63599  1.00000  3723G  2317G  1406G 62.24 0.95 142                         osd.90
 91 hdd     3.63599  1.00000  3723G  2582G  1140G 69.36 1.06 147                         osd.91
 92 hdd     3.63599  1.00000  3723G  2656G  1066G 71.35 1.09 144                         osd.92
 93 hdd     3.63599  1.00000  3723G  2448G  1275G 65.74 1.00 154                         osd.93
 94 hdd     3.63599  1.00000  3723G  2783G   939G 74.76 1.14 163                         osd.94
 95 hdd     3.63599  1.00000  3723G  2782G   941G 74.73 1.14 152                         osd.95
-21        43.63678        - 44684G 30331G 14353G 67.88 1.04   -                     host cephnode09
 96 hdd     3.63640  1.00000  3723G  3003G   719G 80.67 1.23 161                         osd.96
 97 hdd     3.63640  1.00000  3723G  2581G  1142G 69.32 1.06 151                         osd.97
 98 hdd     3.63640  1.00000  3723G  2118G  1605G 56.88 0.87 140                         osd.98
 99 hdd     3.63640  1.00000  3723G  2926G   796G 78.60 1.20 165                         osd.99
100 hdd     3.63640  1.00000  3723G  2492G  1231G 66.92 1.02 149                         osd.100
101 hdd     3.63640  1.00000  3723G  2605G  1117G 69.98 1.07 165                         osd.101
102 hdd     3.63640  1.00000  3723G  2159G  1563G 58.01 0.89 141                         osd.102
103 hdd     3.63640  1.00000  3723G  2328G  1395G 62.53 0.95 146                         osd.103
104 hdd     3.63640  1.00000  3723G  2624G  1099G 70.48 1.08 163                         osd.104
105 hdd     3.63640  1.00000  3723G  2582G  1141G 69.34 1.06 142                         osd.105
106 hdd     3.63640  1.00000  3723G  2401G  1322G 64.48 0.98 161                         osd.106
107 hdd     3.63640  1.00000  3723G  2507G  1216G 67.33 1.03 159                         osd.107
-43         3.63640        - 44684G   438G 44245G  0.98 0.01   -                     host cephnode10 ## Added after cluster pools got full
108 hdd     3.63640  1.00000  3723G 51915M  3672G  1.36 0.02  36                         osd.108
109 hdd           0  1.00000  3723G 72735M  3652G  1.91 0.03   4                         osd.109
110 hdd           0  1.00000  3723G 36948M  3687G  0.97 0.01   2                         osd.110
111 hdd           0  1.00000  3723G 37043M  3687G  0.97 0.01   2                         osd.111
112 hdd           0  1.00000  3723G 72382M  3652G  1.90 0.03   4                         osd.112
113 hdd           0  1.00000  3723G 54850M  3670G  1.44 0.02   3                         osd.113
114 hdd           0  1.00000  3723G 36664M  3687G  0.96 0.01   2                         osd.114
115 hdd           0  1.00000  3723G 36087M  3688G  0.95 0.01   2                         osd.115
116 hdd           0  1.00000  3723G 12066M  3711G  0.32 0.00   0                         osd.116
117 hdd           0  1.00000  3723G 36793M  3687G  0.96 0.01   2                         osd.117
118 hdd           0  1.00000  3723G   775M  3722G  0.02    0   0                         osd.118
119 hdd           0  1.00000  3723G   760M  3722G  0.02    0   0                         osd.119
                       TOTAL   425T   278T   146T 65.51
MIN/MAX VAR: 0/1.37 STDDEV: 23.07
#
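To make the imbalance on cephnode05 easier to read, here is a small sketch that recomputes the VAR column for that host, taking VAR as an OSD's %USE divided by the cluster-wide average %USE (the TOTAL line). The values are copied from the output above; the out OSDs 52 to 54 are skipped.

```python
# %USE values copied from the `ceph osd df tree` output (cephnode05, in OSDs only).
use_pct = {
    "osd.48": 88.89, "osd.49": 81.36, "osd.50": 78.38, "osd.51": 83.33,
    "osd.55": 80.65, "osd.56": 89.89, "osd.57": 72.71, "osd.58": 86.71,
    "osd.59": 83.85,
}
cluster_avg_pct = 65.51  # %USE on the TOTAL line


def var(pct, avg=cluster_avg_pct):
    """Ratio of one OSD's utilization to the cluster average, 2 decimals."""
    return round(pct / avg, 2)


# Print the host's OSDs from most to least utilized.
for name, pct in sorted(use_pct.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {pct:.2f}% used, VAR {var(pct):.2f}")
```

Every in OSD on this host sits above the cluster average, despite the reduced REWEIGHT values.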
I'm wondering why one full OSD made all cluster pools full. Does the OSD_FULL
state stop write operations to all OSDs on the node where the full OSD resides,
or just to the OSD concerned?
Should pg_num/pgp_num be increased to get better data balancing across all
OSDs?
Why is MAX AVAIL only 7415G for the volumes pool and 14831G for the
default.rgw.buckets.data pool while cluster %RAW USED is only 65.50? Is
it somehow related to the bad-looking node cephnode05 (highly utilized OSDs)
and the fact that K+M of the EC pool was equal to the number of nodes in the
cluster?
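A rough check on the two MAX AVAIL figures, as a sketch only: converting each back to raw space with the pool's overhead (3 copies for the replicated pool, (k+m)/k for the EC pool) gives nearly the same number. That the two pools are limited by the same raw headroom is an inference from this arithmetic, not something the output states.

```python
max_avail_volumes_g = 7415      # 'volumes', replicated, size 3
max_avail_rgw_data_g = 14831    # 'default.rgw.buckets.data', EC k=6, m=3

# Raw space each pool would consume if MAX AVAIL were fully written.
raw_for_volumes = max_avail_volumes_g * 3
raw_for_rgw_data = max_avail_rgw_data_g * (6 + 3) / 6

print(raw_for_volumes, raw_for_rgw_data)
```

Both values come out at roughly 22T of raw space, far below the 146T reported as AVAIL in the GLOBAL section.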
Best Regards
Jakub