Re: [ceph-users] Need help for PG problem

2016-03-23 Thread Matt Conner
Hi Zhang,

In a 2-copy pool, each placement group is stored on 2 OSDs, so every PG counts
against two OSDs' totals; that is why you see such a high number of placement
groups per OSD. There is a PG calculator at http://ceph.com/pgcalc/. Based on
your setup, it may be worth using a pg_num of 2048 instead of 4096.
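
To make the arithmetic concrete, here is a rough back-of-the-envelope sketch
(it assumes your single pool and the 20-21 in OSDs from this thread; 300 is the
default mon_pg_warn_max_per_osd threshold):

# Every PG is stored "size" times, and each copy counts toward some OSD's total.
pg_num, size = 4096, 2
for num_osds in (20, 21):
    print(num_osds, pg_num * size / float(num_osds))
# 20 OSDs -> ~410, 21 OSDs -> ~390: the same ballpark as the reported
# "382 > max 300". With pg_num 2048 the same math gives ~195-205,
# which stays under the 300 limit.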

As for the stuck/degraded PGs, most of them involve osd.0. Looking at your OSD
tree, you somehow have 21 OSDs reported, with 2 of them labeled osd.0, both up
and in. I'd recommend trying to get rid of the one listed under host 148_96 and
seeing if that clears the issues.
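
If it helps, a small script along these lines can confirm which id is
duplicated before you touch anything (just a sketch; it assumes the "nodes"
layout that "ceph osd tree -f json" prints, which can vary a little between
releases):

import json, subprocess, collections

# Parse the OSD tree and look for OSD names that appear more than once.
tree = json.loads(subprocess.check_output(["ceph", "osd", "tree", "-f", "json"]))
names = [n["name"] for n in tree["nodes"] if n.get("type") == "osd"]
dupes = [name for name, count in collections.Counter(names).items() if count > 1]
print(dupes)  # given the tree you posted, this should print ['osd.0']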



On Tue, Mar 22, 2016 at 6:28 AM, Zhang Qiang  wrote:

> Hi Reddy,
> It's over a thousand lines, I pasted it on gist:
> https://gist.github.com/dotSlashLu/22623b4cefa06a46e0d4
>
> On Tue, 22 Mar 2016 at 18:15 M Ranga Swami Reddy 
> wrote:
>
>> Hi,
>> Can you please share the "ceph health detail" output?
>>
>> Thanks
>> Swami
>>
>> On Tue, Mar 22, 2016 at 3:32 PM, Zhang Qiang 
>> wrote:
>> > Hi all,
>> >
>> > I have 20 OSDs and 1 pool, and, as recommended by the doc
>> > (http://docs.ceph.com/docs/master/rados/operations/placement-groups/),
>> > I configured pg_num and pgp_num to 4096, size 2, min size 1.
>> >
>> > But ceph -s shows:
>> >
>> > HEALTH_WARN
>> > 534 pgs degraded
>> > 551 pgs stuck unclean
>> > 534 pgs undersized
>> > too many PGs per OSD (382 > max 300)
>> >
>> > Why doesn't the recommended value of 4096 for 10 ~ 50 OSDs work? And
>> > what does "too many PGs per OSD (382 > max 300)" mean? If each OSD
>> > had 382 PGs, I would have 7640 PGs in total.
>> >
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Kernel RBD hang on OSD Failure

2015-12-10 Thread Matt Conner
 12 12 3.62999 1
osd.116 29 10 10 3.62999 1
osd.117 0 0 0 3.62999 1
osd.118 48 18 18 3.62999 1
osd.119 0 0 0 3.62999 1
osd.120 36 12 12 3.62999 1
osd.121 0 0 0 3.62999 1
osd.122 42 20 20 3.62999 1
osd.123 0 0 0 3.62999 1
osd.124 49 18 18 3.62999 1
osd.125 0 0 0 3.62999 1
osd.126 0 0 0 3.62999 1
osd.127 39 18 18 3.62999 1
osd.128 0 0 0 3.62999 1
osd.129 38 17 17 3.62999 1
osd.130 49 22 22 3.62999 1
osd.131 0 0 0 3.62999 1
osd.132 47 15 15 3.62999 1
osd.133 0 0 0 3.62999 1
osd.134 31 12 12 3.62999 1
osd.135 0 0 0 3.62999 1
osd.136 40 18 18 3.62999 1
osd.137 0 0 0 3.62999 1
osd.138 31 15 15 3.62999 1
osd.139 0 0 0 3.62999 1
osd.140 34 20 20 3.62999 1
osd.141 0 0 0 3.62999 1
osd.142 40 10 10 3.62999 1
osd.143 0 0 0 3.62999 1
osd.144 44 19 19 3.62999 1
osd.145 0 0 0 3.62999 1
osd.146 38 21 21 3.62999 1
osd.147 0 0 0 3.62999 1
osd.148 40 14 14 3.62999 1
osd.149 0 0 0 3.62999 1
osd.150 41 18 18 3.62999 1
osd.151 0 0 0 3.62999 1
 in 151
 avg 33 stddev 16.7417 (0.507324x) (expected 5.741 0.17397x))
 min osd.12 28
 max osd.86 61
size 0 0
size 1 0
size 2 1134
size 3 914


Matt Conner
Keeper Technology


On Tue, Dec 8, 2015 at 5:35 AM, Ilya Dryomov <idryo...@gmail.com> wrote:
>
> O

[ceph-users] Unbalanced cluster

2015-03-03 Thread Matt Conner
Hi All,

I have a cluster that I've been pushing data into in order to get an idea of
how full it can get before Ceph marks the cluster full. Unfortunately, each
time I fill the cluster I end up with one disk that hits the full ratio (0.95)
while all the other disks still have anywhere from 20-40% free space (my latest
attempt resulted in the cluster being marked full at 60% total usage). Any idea
why the OSDs would be so unbalanced?
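
To put rough numbers on why this hurts (back-of-the-envelope only; the per-OSD
PG figures below are hypothetical and assume usage scales with PG count):

# The cluster is marked full when the *fullest* OSD hits full_ratio, so the
# usable fraction of raw capacity is roughly full_ratio scaled by how much
# hotter the busiest OSD is than the average.
full_ratio = 0.95
avg_pgs_per_osd = 100     # hypothetical average
hottest_osd_pgs = 160     # hypothetical hot spot, ~60% over average
print(round(full_ratio * avg_pgs_per_osd / hottest_osd_pgs, 2))
# -> 0.59, i.e. the cluster marks full at roughly 60% total usage.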

Few notes on the cluster:

   - It has 6 storage hosts with 143 total OSDs (normally 144, but one
   failed disk has been removed from the cluster)
   - All OSDs are 4TB drives
   - All OSDs are set to the same weight
   - The cluster is using host rules
   - Using ceph version 0.80.7


In terms of pools, I have been varying the number of pools from run to run,
following the PG calculator at http://ceph.com/pgcalc/ to determine the number
of placement groups. I have also attempted a few runs with a higher PG count,
but that has only resulted in further imbalance.
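
For reference, a rough sketch for tallying PG copies per OSD (it assumes the
JSON layout of "ceph pg dump", which differs between releases, so the field
lookups may need adjusting):

import json, subprocess, collections

# Count how many PG copies (acting set members) land on each OSD.
dump = json.loads(subprocess.check_output(["ceph", "pg", "dump", "--format", "json"]))
pg_stats = dump.get("pg_stats") or dump.get("pg_map", {}).get("pg_stats", [])
per_osd = collections.Counter(osd for pg in pg_stats for osd in pg["acting"])
for osd, count in sorted(per_osd.items()):
    print("osd.%d %d" % (osd, count))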

Any thoughts?

Thanks,

Matt
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com