Fyodor Ustinov wrote:
: Hi!
:
: I have seen the same thing several times when I added a new OSD to the
: cluster: one or two PGs in the "backfill_toofull" state.
:
: In all versions of mimic.
Yep. In my case it is not (only) after adding the new OSDs.
An hour or so ago my cluster reached the HEALTH_OK state, so I moved
another pool to the new hosts by setting its crush_rule to on-newhosts.
The result was an immediate backfill_toofull on two PGs for about five
minutes, after which the cluster reached HEALTH_OK again.
So the PGs are not stuck in that state forever; they are there
only during the data reshuffle.
13.2.4 on CentOS 7.
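For anyone who wants to watch the transition while the data is moving,
something along these lines should do (any similar query works):

# watch 'ceph health detail | grep backfill_toofull'
# ceph pg dump pgs_brief | grep backfill_toofull
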
-Yenya
:
: ----- Original Message -----
: From: "Caspar Smit" <[email protected]>
: To: "Jan Kasprzak" <[email protected]>
: Cc: "ceph-users" <[email protected]>
: Sent: Thursday, 31 January, 2019 15:43:07
: Subject: Re: [ceph-users] backfill_toofull after adding new OSDs
:
: Hi Jan,
:
: You might be hitting the same issue as Wido here:
:
: https://www.spinics.net/lists/ceph-users/msg50603.html
:
: Kind regards,
: Caspar
:
: On Thu, 31 Jan 2019 at 14:36, Jan Kasprzak <[email protected]> wrote:
:
:
: Hello, ceph users,
:
: I see the following HEALTH_ERR during cluster rebalance:
:
: Degraded data redundancy (low space): 8 pgs backfill_toofull
:
: Detailed description:
: I have upgraded my cluster to mimic and added 16 new bluestore OSDs
: on 4 hosts. The hosts are in a separate region in my crush map, and the
: existing crush rules prevented data from being moved onto the new OSDs. Now
: I want to move all data to the new OSDs (and possibly decommission the old
: filestore OSDs).
: I have created the following rule:
:
: # ceph osd crush rule create-replicated on-newhosts newhostsroot host
:
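: The new rule can be inspected before any pool is switched over (to verify
: it really selects only the new hosts), for example with:
: 
: # ceph osd crush rule ls
: # ceph osd crush rule dump on-newhosts
: 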
: after this, I am slowly moving the pools one-by-one to this new rule:
:
: # ceph osd pool set test-hdd-pool crush_rule on-newhosts
:
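: Which crush rule a given pool currently uses can be checked, for example,
: with:
: 
: # ceph osd pool get test-hdd-pool crush_rule
: # ceph osd pool ls detail
: 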
: When I do this, I get the above error. This is misleading, because
: ceph osd df does not suggest the OSDs are getting full (the fullest
: OSD is only about 41% full). After rebalancing is done, the HEALTH_ERR
: disappears. Why am I getting this error?
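: 
: The full/backfillfull/nearfull ratios these checks compare against, and
: the offending PGs, can be listed for example with:
: 
: # ceph osd dump | grep ratio
: # ceph health detail | grep backfill_toofull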
:
: # ceph -s
:   cluster:
:     id:     ...my UUID...
:     health: HEALTH_ERR
:             1271/3803223 objects misplaced (0.033%)
:             Degraded data redundancy: 40124/3803223 objects degraded (1.055%), 65 pgs degraded, 67 pgs undersized
:             Degraded data redundancy (low space): 8 pgs backfill_toofull
: 
:   services:
:     mon: 3 daemons, quorum mon1,mon2,mon3
:     mgr: mon2(active), standbys: mon1, mon3
:     osd: 80 osds: 80 up, 80 in; 90 remapped pgs
:     rgw: 1 daemon active
: 
:   data:
:     pools:   13 pools, 5056 pgs
:     objects: 1.27 M objects, 4.8 TiB
:     usage:   15 TiB used, 208 TiB / 224 TiB avail
:     pgs:     40124/3803223 objects degraded (1.055%)
:              1271/3803223 objects misplaced (0.033%)
:              4963 active+clean
:              41   active+recovery_wait+undersized+degraded+remapped
:              21   active+recovery_wait+undersized+degraded
:              17   active+remapped+backfill_wait
:              5    active+remapped+backfill_wait+backfill_toofull
:              3    active+remapped+backfill_toofull
:              2    active+recovering+undersized+remapped
:              2    active+recovering+undersized+degraded+remapped
:              1    active+clean+remapped
:              1    active+recovering+undersized+degraded
: 
:   io:
:     client:   6.6 MiB/s rd, 2.7 MiB/s wr, 75 op/s rd, 89 op/s wr
:     recovery: 2.0 MiB/s, 92 objects/s
:
: Thanks for any hint,
:
: -Yenya
:
--
| Jan "Yenya" Kasprzak <kas at {fi.muni.cz - work | yenya.net - private}> |
| http://www.fi.muni.cz/~kas/ GPG: 4096R/A45477D5 |
This is the world we live in: the way to deal with computers is to google
the symptoms, and hope that you don't have to watch a video. --P. Zaitcev
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com