Hi!
Right now, after adding an OSD:
# ceph health detail
HEALTH_ERR 74197563/199392333 objects misplaced (37.212%); Degraded data redundancy (low space): 1 pg backfill_toofull
OBJECT_MISPLACED 74197563/199392333 objects misplaced (37.212%)
PG_DEGRADED_FULL Degraded data redundancy (low space): 1 pg backfill_toofull
pg 6.eb is active+remapped+backfill_wait+backfill_toofull, acting [21,0,47]
# ceph pg ls-by-pool iscsi backfill_toofull
PG   OBJECTS DEGRADED MISPLACED UNFOUND BYTES      LOG  STATE                                          STATE_STAMP                VERSION   REPORTED   UP         ACTING       SCRUB_STAMP                DEEP_SCRUB_STAMP
6.eb 645     0        1290      0       1645654016 3067 active+remapped+backfill_wait+backfill_toofull 2019-02-02 00:20:32.975300 7208'6567 9790:16214 [5,1,21]p5 [21,0,47]p21 2019-01-18 04:13:54.280495 2019-01-18 04:13:54.280495
All OSDs are below 40% use:
ID CLASS WEIGHT REWEIGHT SIZE USE AVAIL %USE VAR PGS
0 hdd 9.56149 1.00000 9.6 TiB 3.2 TiB 6.3 TiB 33.64 1.31 313
1 hdd 9.56149 1.00000 9.6 TiB 3.3 TiB 6.3 TiB 34.13 1.33 295
5 hdd 9.56149 1.00000 9.6 TiB 756 GiB 8.8 TiB 7.72 0.30 103
47 hdd 9.32390 1.00000 9.3 TiB 3.1 TiB 6.2 TiB 33.75 1.31 306
(all other OSDs are also below 40% use)
ceph version 13.2.4 (b10be4d44915a4d78a8e06aa31919e74927b142e) mimic (stable)
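For context: Ceph flags a PG as backfill_toofull when the projected utilization of a backfill target OSD would cross the backfillfull ratio (0.90 by default). A rough sketch of that projection, using the numbers above (OSD 5 as the backfill target, pg 6.eb's 1645654016 bytes incoming), shows the target is nowhere near the threshold. The function and the simplified accounting below are my assumptions for illustration, not Ceph's actual OSD-side logic:

```python
# Hedged sketch of the backfillfull projection: would the target OSD
# cross the backfillfull ratio once the PG's data lands on it?
# (Ceph's real accounting is more involved; this is the basic idea.)

TIB = 1024 ** 4
GIB = 1024 ** 3

def would_be_toofull(used_bytes, total_bytes, incoming_bytes,
                     backfillfull_ratio=0.90):
    """Return True if projected utilization reaches the backfillfull ratio."""
    projected = (used_bytes + incoming_bytes) / total_bytes
    return projected >= backfillfull_ratio

# OSD 5 from the `ceph osd df` output above: 756 GiB used of 9.6 TiB,
# with pg 6.eb bringing in 1645654016 bytes (~1.5 GiB).
print(would_be_toofull(756 * GIB, int(9.6 * TIB), 1645654016))  # False
```

By this back-of-the-envelope check the projected use is around 8%, far below 0.90, which matches the point above: with every OSD under 40% use, a genuine low-space condition seems unlikely here.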
Perhaps the developers will take notice of this message and comment?
----- Original Message -----
From: "Fyodor Ustinov" <[email protected]>
To: "Caspar Smit" <[email protected]>
Cc: "Jan Kasprzak" <[email protected]>, "ceph-users" <[email protected]>
Sent: Thursday, 31 January, 2019 16:50:24
Subject: Re: [ceph-users] backfill_toofull after adding new OSDs
Hi!
I have seen the same thing several times when adding a new OSD to the cluster: one or two PGs end up in the "backfill_toofull" state. This has happened on all versions of Mimic.
----- Original Message -----
From: "Caspar Smit" <[email protected]>
To: "Jan Kasprzak" <[email protected]>
Cc: "ceph-users" <[email protected]>
Sent: Thursday, 31 January, 2019 15:43:07
Subject: Re: [ceph-users] backfill_toofull after adding new OSDs
Hi Jan,
You might be hitting the same issue as Wido here:
https://www.spinics.net/lists/ceph-users/msg50603.html
Kind regards,
Caspar
On Thu, 31 Jan 2019 at 14:36, Jan Kasprzak <[email protected]> wrote:
Hello, ceph users,
I see the following HEALTH_ERR during cluster rebalance:
Degraded data redundancy (low space): 8 pgs backfill_toofull
Detailed description:
I have upgraded my cluster to Mimic and added 16 new BlueStore OSDs
on 4 hosts. The hosts are in a separate region in my CRUSH map, and the CRUSH
rules prevented data from being moved onto the new OSDs. Now I want to move
all data to the new OSDs (and possibly decommission the old FileStore OSDs).
I have created the following rule:
# ceph osd crush rule create-replicated on-newhosts newhostsroot host
after this, I am slowly moving the pools one-by-one to this new rule:
# ceph osd pool set test-hdd-pool crush_rule on-newhosts
When I do this, I get the above error. This is misleading, because
ceph osd df does not suggest the OSDs are getting full (the most-full
OSD is about 41% full). After rebalancing is done, the HEALTH_ERR
disappears. Why am I getting this error?
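The pool-by-pool migration described above can also be scripted. A minimal sketch, assuming the `on-newhosts` rule created earlier; the function name is mine, and with `dry_run=True` it only builds the command lines rather than invoking `ceph` (which requires a reachable cluster):

```python
import subprocess

def move_pools_to_rule(pools, rule, dry_run=True):
    """Build (and optionally run) `ceph osd pool set <pool> crush_rule <rule>`
    for each pool, mirroring the manual one-by-one migration."""
    commands = []
    for pool in pools:
        cmd = ["ceph", "osd", "pool", "set", pool, "crush_rule", rule]
        commands.append(cmd)
        if not dry_run:
            # Requires admin access to a live cluster; raises on failure.
            subprocess.run(cmd, check=True)
    return commands

# Pool names are illustrative; dry_run=True only builds the commands.
cmds = move_pools_to_rule(["test-hdd-pool"], "on-newhosts")
```

Running it pool by pool (rather than all at once) keeps the rebalance load bounded, which is presumably why the migration is being done one pool at a time.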
# ceph -s
cluster:
id: ...my UUID...
health: HEALTH_ERR
1271/3803223 objects misplaced (0.033%)
Degraded data redundancy: 40124/3803223 objects degraded (1.055%), 65 pgs degraded, 67 pgs undersized
Degraded data redundancy (low space): 8 pgs backfill_toofull
services:
mon: 3 daemons, quorum mon1,mon2,mon3
mgr: mon2(active), standbys: mon1, mon3
osd: 80 osds: 80 up, 80 in; 90 remapped pgs
rgw: 1 daemon active
data:
pools: 13 pools, 5056 pgs
objects: 1.27 M objects, 4.8 TiB
usage: 15 TiB used, 208 TiB / 224 TiB avail
pgs: 40124/3803223 objects degraded (1.055%)
1271/3803223 objects misplaced (0.033%)
4963 active+clean
41 active+recovery_wait+undersized+degraded+remapped
21 active+recovery_wait+undersized+degraded
17 active+remapped+backfill_wait
5 active+remapped+backfill_wait+backfill_toofull
3 active+remapped+backfill_toofull
2 active+recovering+undersized+remapped
2 active+recovering+undersized+degraded+remapped
1 active+clean+remapped
1 active+recovering+undersized+degraded
io:
client: 6.6 MiB/s rd, 2.7 MiB/s wr, 75 op/s rd, 89 op/s wr
recovery: 2.0 MiB/s, 92 objects/s
Thanks for any hint,
-Yenya
--
| Jan "Yenya" Kasprzak <kas at {fi.muni.cz - work | yenya.net - private}> |
| http://www.fi.muni.cz/~kas/ GPG: 4096R/A45477D5 |
This is the world we live in: the way to deal with computers is to google
the symptoms, and hope that you don't have to watch a video. --P. Zaitcev
_______________________________________________
ceph-users mailing list
[ mailto:[email protected] | [email protected] ]
[ http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com |
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ]