Ceph currently isn't very smart about ordering its balancing operations. It
can fill a disk before moving data off of it, so if an OSD is already close
to the toofull line, rebalancing can push it over. I believe there is a
blueprint being worked on for Hammer to help with this.
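
If you want to see how close each OSD actually is, these should be enough
(the exact wording of the health output varies a bit between releases):

  # overall and per-pool utilization
  ceph df

  # once an OSD crosses the nearfull/full thresholds, health detail names it
  ceph health detail | grep -i full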

You have a couple of options. You can try bumping up the full limit to see
if that unblocks things and lets the stuck PGs move off, then drop it back
down; don't go above 98%, though. You could also try reducing the size of
one or more pools and then, after the cluster settles, increasing it back
to the original value. You could also try deleting some of the PG
directories manually from the OSD file system to get it back under the full
line (I'm not sure of the exact steps for this, but they have been
discussed on the mailing list in the last couple of months).
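
Roughly, for the first two options (double-check the option names against
your release, I'm going from memory here, and the 0.92/0.97 values and
<poolname> below are just placeholders):

  # raise the per-OSD backfill threshold (default 0.85) so the
  # backfill_toofull PGs can proceed
  ceph tell osd.* injectargs '--osd-backfill-full-ratio 0.92'

  # temporarily raise the cluster full ratio (stay below 0.98)
  ceph pg set_full_ratio 0.97

  # drop the replica count on a pool, then put it back once things settle
  ceph osd pool set <poolname> size 2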

Good Luck!

On Mon, Jan 5, 2015 at 12:41 PM, ivan babrou <ibob...@gmail.com> wrote:

> Rebalancing is almost finished, but things got even worse:
> http://i.imgur.com/0HOPZil.png
>
> Moreover, one pg is in active+remapped+wait_backfill+backfill_toofull
> state:
>
> 2015-01-05 19:39:31.995665 mon.0 [INF] pgmap v3979616: 5832 pgs: 23
> active+remapped+wait_backfill, 1
> active+remapped+wait_backfill+backfill_toofull, 2
> active+remapped+backfilling, 5805 active+clean, 1
> active+remapped+backfill_toofull; 11210 GB data, 26174 GB used, 18360 GB /
> 46906 GB avail; 65246/10590590 objects degraded (0.616%)
>
> So at 55.8% disk space utilization Ceph is full. That doesn't look very
> good.
>
> On 5 January 2015 at 15:39, ivan babrou <ibob...@gmail.com> wrote:
>
>>
>>
>> On 5 January 2015 at 14:20, Christian Balzer <ch...@gol.com> wrote:
>>
>>> On Mon, 5 Jan 2015 14:04:28 +0400 ivan babrou wrote:
>>>
>>> > Hi!
>>> >
>>> > I have a cluster with 106 OSDs and disk usage varies from 166 GB to
>>> > 316 GB. Disk usage is highly correlated with the number of PGs per OSD
>>> > (no surprise here). Is there a reason for Ceph to allocate more PGs on
>>> > some nodes?
>>> >
>>> In essence what Wido said, you're a bit low on PGs.
>>>
>>> Also given your current utilization, pool 14 is totally oversized with
>>> 1024 PGs. You might want to re-create it with a smaller size, and double
>>> pool 0 to 512 PGs and pool 10 to 4096.
>>> I assume you did raise the PGPs as well when changing the PGs, right?
>>>
>>
>> Yep, pg = pgp for all pools. Pool 14 is just for testing purposes; it
>> might get large eventually.
>>
>> I followed your advice in doubling pools 0 and 10. It is rebalancing at
>> 30% degraded now, but so far the big OSDs are getting bigger and the small
>> ones smaller: http://i.imgur.com/hJcX9Us.png. I hope that trend changes
>> before rebalancing is complete.
>>
>>
>>> And yeah, Ceph isn't particularly good at balancing stuff by itself, but
>>> with sufficient PGs you ought to get the variance below/around 30%.
>>>
>>
>> Is this going to change in future releases?
>>
>>
>>> Christian
>>>
>>> > The biggest OSDs are 30, 42 and 69 (300 GB+ each) and the smallest are
>>> > 87, 33 and 55 (170 GB each). The biggest pool has 2048 PGs, while pools
>>> > with very little data have only 8 PGs. The PG size in the biggest pool
>>> > is ~6 GB (5.1..6.3 actually).
>>> >
>>> > Lack of balanced disk usage prevents me from using all the disk space.
>>> > When the biggest OSD is full, the cluster does not accept writes anymore.
>>> >
>>> > Here's gist with info about my cluster:
>>> > https://gist.github.com/bobrik/fb8ad1d7c38de0ff35ae
>>> >
>>>
>>>
>>> --
>>> Christian Balzer        Network/Systems Engineer
>>> ch...@gol.com           Global OnLine Japan/Fusion Communications
>>> http://www.gol.com/
>>>
>>
>>
>>
>> --
>> Regards, Ian Babrou
>> http://bobrik.name http://twitter.com/ibobrik skype:i.babrou
>>
>
>
>
> --
> Regards, Ian Babrou
> http://bobrik.name http://twitter.com/ibobrik skype:i.babrou
>
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
