Rebalancing is almost finished, but things got even worse: http://i.imgur.com/0HOPZil.png
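To put a number on the spread, a quick sketch like the one below (run against the json output of "ceph pg dump"; the "pg_stats"/"acting" field names are from memory and may differ between releases) counts PGs per OSD:

    #!/usr/bin/env python
    # Count PGs per OSD from "ceph pg dump -f json" piped in on stdin.
    # Usage: ceph pg dump -f json | python pg_per_osd.py
    # Field names ("pg_stats", "acting") may differ between ceph releases.
    import json
    import sys
    from collections import Counter

    per_osd = Counter()
    for pg in json.load(sys.stdin)["pg_stats"]:
        for osd in pg["acting"]:
            per_osd[osd] += 1

    counts = sorted(per_osd.values())
    print("%d osds, pgs per osd: min %d, max %d, spread %.0f%%"
          % (len(counts), counts[0], counts[-1],
             100.0 * (counts[-1] - counts[0]) / counts[0]))
    for osd, n in per_osd.most_common(5):
        print("osd.%d holds %d pgs" % (osd, n))

Counting acting sets rather than raw disk usage shows the imbalance ceph itself creates, independent of any skew in object sizes.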
Moreover, one pg is in active+remapped+wait_backfill+backfill_toofull state:

    2015-01-05 19:39:31.995665 mon.0 [INF] pgmap v3979616: 5832 pgs: 23 active+remapped+wait_backfill, 1 active+remapped+wait_backfill+backfill_toofull, 2 active+remapped+backfilling, 5805 active+clean, 1 active+remapped+backfill_toofull; 11210 GB data, 26174 GB used, 18360 GB / 46906 GB avail; 65246/10590590 objects degraded (0.616%)

So at 55.8% overall disk utilization ceph already reports backfill_toofull. That doesn't look good.

On 5 January 2015 at 15:39, ivan babrou <[email protected]> wrote:

> On 5 January 2015 at 14:20, Christian Balzer <[email protected]> wrote:
>
>> On Mon, 5 Jan 2015 14:04:28 +0400 ivan babrou wrote:
>>
>> > Hi!
>> >
>> > I have a cluster with 106 osds and disk usage varying from 166gb to
>> > 316gb. Disk usage is highly correlated to the number of pgs per osd
>> > (no surprise here). Is there a reason for ceph to allocate more pgs
>> > on some nodes?
>> >
>> In essence what Wido said, you're a bit low on PGs.
>>
>> Also given your current utilization, pool 14 is totally oversized with
>> 1024 PGs. You might want to re-create it with a smaller size, and double
>> pool 0 to 512 PGs and pool 10 to 4096.
>> I assume you did raise the PGPs as well when changing the PGs, right?
>>
>
> Yep, pg = pgp for all pools. Pool 14 is just for testing purposes; it
> might get large eventually.
>
> I followed your advice in doubling pools 0 and 10. It is rebalancing at
> 30% degraded now, but so far the big osds are getting bigger and the
> small ones smaller: http://i.imgur.com/hJcX9Us.png. I hope that trend
> changes before rebalancing is complete.
>
>> And yeah, CEPH isn't particularly good at balancing stuff by itself, but
>> with sufficient PGs you ought to get the variance below/around 30%.
>>
>
> Is this going to change in future releases?
>
>> Christian
>>
>> > The biggest osds are 30, 42 and 69 (300gb+ each) and the smallest are
>> > 87, 33 and 55 (170gb each). The biggest pool has 2048 pgs, pools with
>> > very little data have only 8 pgs. PG size in the biggest pool is ~6gb
>> > (5.1..6.3 actually).
>> >
>> > Lack of balanced disk usage prevents me from using all the disk space.
>> > When the biggest osd is full, the cluster does not accept writes
>> > anymore.
>> >
>> > Here's a gist with info about my cluster:
>> > https://gist.github.com/bobrik/fb8ad1d7c38de0ff35ae
>> >
>>
>> --
>> Christian Balzer        Network/Systems Engineer
>> [email protected]        Global OnLine Japan/Fusion Communications
>> http://www.gol.com/
>>
>
> --
> Regards, Ian Babrou
> http://bobrik.name  http://twitter.com/ibobrik  skype:i.babrou
>

--
Regards, Ian Babrou
http://bobrik.name  http://twitter.com/ibobrik  skype:i.babrou
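One more note on the backfill_toofull state and the spread: backfill_toofull is decided per OSD against osd_backfill_full_ratio (0.85 by default), so a handful of overfull OSDs can stall backfill even though the cluster as a whole is only ~56% used. The spread itself is largely a statistics problem: 2048 PGs x 3 copies over 106 OSDs is only ~58 PGs per OSD, and pseudo-random placement at that density produces big outliers. A toy model (uniform random placement, not CRUSH, just to show the trend) makes the point:

    #!/usr/bin/env python
    # Toy model: place pg_num * replicas PG copies uniformly at random
    # across num_osds OSDs and report the worst min-to-max spread seen.
    # Not CRUSH, only an illustration of why more PGs per OSD means a
    # flatter distribution.
    import random
    from collections import Counter

    def worst_spread(pg_num, replicas=3, num_osds=106, trials=20):
        worst = 0.0
        for _ in range(trials):
            per_osd = Counter(random.randrange(num_osds)
                              for _ in range(pg_num * replicas))
            counts = sorted(per_osd.values())
            worst = max(worst, float(counts[-1] - counts[0]) / counts[0])
        return worst

    for pg_num in (2048, 4096, 8192):
        print("pg_num=%d: worst spread %.0f%%"
              % (pg_num, worst_spread(pg_num) * 100))

Raising pg_num flattens the distribution over time, and for the OSDs that are already outliers, ceph osd reweight-by-utilization (or a manual ceph osd reweight on the fattest ones) can shave off the peaks without waiting for a full re-split.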
