Re: [ceph-users] Need help for PG problem

2016-03-23 Thread Zhang Qiang
I adjusted the crush map, everything's OK now. Thanks for your help! On Wed, 23 Mar 2016 at 23:13 Matt Conner wrote: > Hi Zhang, > > In a 2 copy pool, each placement group is spread across 2 OSDs - that is > why you see such a high number of placement groups per OSD.

Re: [ceph-users] Need help for PG problem

2016-03-23 Thread Zhang Qiang
Goncalo > > -- > *From:* Zhang Qiang [dotslash...@gmail.com] > *Sent:* 23 March 2016 23:17 > *To:* Goncalo Borges > *Cc:* Oliver Dzombic; ceph-users > *Subject:* Re: [ceph-users] Need help for PG problem > > And here's the osd tree if it matters.

Re: [ceph-users] Need help for PG problem

2016-03-23 Thread 施柏安
It seems that you only have two hosts in your crush map, but the default ruleset separates replicas by host. If you set size 3 for pools, then there would be one replica that can't be placed, because you only have two hosts. 2016-03-23 20:17 GMT+08:00 Zhang Qiang : > And
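A minimal sketch of how to check the rule's failure domain (the file paths are placeholders, not taken from the thread):

    # Dump and decompile the CRUSH map to see the rule's failure domain
    ceph osd getcrushmap -o /tmp/crushmap
    crushtool -d /tmp/crushmap -o /tmp/crushmap.txt
    # A "step chooseleaf firstn 0 type host" line means each replica must
    # land on a different host, which size 3 cannot satisfy with only 2 hosts
    grep chooseleaf /tmp/crushmap.txt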

Re: [ceph-users] Need help for PG problem

2016-03-23 Thread Goncalo Borges
From: Zhang Qiang [dotslash...@gmail.com] Sent: 23 March 2016 23:17 To: Goncalo Borges Cc: Oliver Dzombic; ceph-users Subject: Re: [ceph-users] Need help for PG problem And here's the osd tree if it matters. ID WEIGHT TYPE NAME UP/DOWN REWEIGHT PRIMARY-AFFINITY -1

Re: [ceph-users] Need help for PG problem

2016-03-23 Thread Matt Conner
Hi Zhang, In a 2 copy pool, each placement group is spread across 2 OSDs - that is why you see such a high number of placement groups per OSD. There is a PG calculator at http://ceph.com/pgcalc/. Based on your setup, it may be worth using 2048 instead of 4096. As for stuck/degraded PGs, most are
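A rough back-of-the-envelope version of that calculation, assuming 20 OSDs, 2 replicas and a target of about 200 PGs per OSD (illustrative numbers, not pgcalc itself):

    osds=20; size=2; target=200
    pgs=$(( osds * target / size ))                      # 2000
    pow2=1; while [ "$pow2" -lt "$pgs" ]; do pow2=$(( pow2 * 2 )); done
    echo "$pow2"                                         # 2048

With a target of 100 PGs per OSD the same arithmetic gives 1024, which matches the formula quoted elsewhere in the thread.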

Re: [ceph-users] Need help for PG problem

2016-03-23 Thread koukou73gr
Are you running with the default failure domain of 'host'? If so, with a pool size of 3 and your 20 OSDs physically on only 2 hosts, Ceph is unable to find a 3rd host to map the 3rd replica to. Either add a host and move some OSDs there, or reduce the pool size to 2. -K. On 03/23/2016 02:17 PM, Zhang
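Reducing the pool size is a one-liner; a sketch, assuming the pool is named "data" (the real pool name is not given in the thread):

    # Match the replica count to the two available hosts
    ceph osd pool set data size 2
    ceph osd pool set data min_size 1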

Re: [ceph-users] Need help for PG problem

2016-03-23 Thread koukou73gr
You should have settled on the nearest power of 2, which for 666 is 512. Since you created the cluster and IIRC it is a testbed, you may as well recreate it; however, it will be less of a hassle to just increase the PGs to the next power of two: 1024. Your 20 OSDs appear to be equally sized in your
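Increasing the PG count in place might look like this, again assuming a pool named "data":

    # Raise pg_num first, then pgp_num so the new PGs are actually rebalanced
    ceph osd pool set data pg_num 1024
    ceph osd pool set data pgp_num 1024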

Re: [ceph-users] Need help for PG problem

2016-03-23 Thread Zhang Qiang
And here's the osd tree if it matters. ID WEIGHT TYPE NAME UP/DOWN REWEIGHT PRIMARY-AFFINITY -1 22.39984 root default -2 21.39984 host 10 0 1.06999 osd.0 up 1.0 1.0 1 1.06999 osd.1 up 1.0 1.0 2 1.06999

Re: [ceph-users] Need help for PG problem

2016-03-23 Thread Zhang Qiang
Oliver, Goncalo, Sorry to disturb again, but recreating the pool with a smaller pg_num didn't seem to work; now all 666 PGs are degraded + undersized. New status: cluster d2a69513-ad8e-4b25-8f10-69c4041d624d health HEALTH_WARN 666 pgs degraded 82 pgs stuck
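To see why the PGs stay undersized, something along these lines is usually enough (the pgid is a placeholder to be taken from the dump):

    ceph health detail | head -n 20
    ceph pg dump_stuck unclean | head -n 20
    # Ask one concrete PG why it cannot reach its target size
    ceph pg <pgid> query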

Re: [ceph-users] Need help for PG problem

2016-03-22 Thread Dotslash Lu
Hello Gonçalo, Thanks for the reminder. I was just setting up the cluster for testing, so don't worry, I can just remove the pool. And I've learnt that the replication count and the number of pools are related to pg_num, so I'll consider them carefully before deploying any data. > On Mar 23, 2016,

Re: [ceph-users] Need help for PG problem

2016-03-22 Thread David Wang
Hi Zhang, From the ceph health detail output, I suggest calibrating the NTP servers. Can you share your crush map output? 2016-03-22 18:28 GMT+08:00 Zhang Qiang : > Hi Reddy, > It's over a thousand lines, I pasted it on gist: >
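A quick sketch for confirming the skew and sharing the crush map:

    # The monitors flag skew in the health output; ntpq checks sync per host
    ceph health detail | grep -i "clock skew"
    ntpq -p
    # CRUSH map in readable (JSON) form, for sharing
    ceph osd crush dump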

Re: [ceph-users] Need help for PG problem

2016-03-22 Thread Goncalo Borges
Hi Zhang... If I can add some more info, changing the number of PGs is a heavy operation, and as far as I know, you should NEVER decrease PGs. From the notes in pgcalc (http://ceph.com/pgcalc/): "It's also important to know that the PG count can be increased, but NEVER decreased without destroying /
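Since this is a testbed cluster (as noted elsewhere in the thread), destroying and recreating the pool is also an option; a sketch, assuming a pool named "data" and a target of 512 PGs -- note that this deletes all data in the pool:

    ceph osd pool delete data data --yes-i-really-really-mean-it
    ceph osd pool create data 512 512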

Re: [ceph-users] Need help for PG problem

2016-03-22 Thread Zhang Qiang
I got it, the pg_num suggested is the total, so I need to divide it by the number of replicas. Thanks Oliver, your answer is very thorough and helpful! On 23 March 2016 at 02:19, Oliver Dzombic wrote: > Hi Zhang, > > yeah, I saw your answer already. > > First of all,

[ceph-users] Need help for PG problem

2016-03-22 Thread Oliver Dzombic
Hi Zhang, yeah, I saw your answer already. First of all, you should make sure that there is no clock skew. This can cause some side effects. According to http://docs.ceph.com/docs/master/rados/operations/placement-groups/ you have to use: Total PGs = (OSDs * 100) / pool size
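Worked out for this cluster's 20 OSDs (rounding to a power of two as suggested elsewhere in the thread):

    osds=20
    echo $(( osds * 100 / 2 ))   # 1000 -> round up to 1024 for a 2-copy pool
    echo $(( osds * 100 / 3 ))   # 666  -> round to 512 (or 1024) for 3 copies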

Re: [ceph-users] Need help for PG problem

2016-03-22 Thread Zhang Qiang
Hi Reddy, It's over a thousand lines, I pasted it on gist: https://gist.github.com/dotSlashLu/22623b4cefa06a46e0d4 On Tue, 22 Mar 2016 at 18:15 M Ranga Swami Reddy wrote: > Hi, > Can you please share the "ceph health detail" output? > > Thanks > Swami > > On Tue, Mar 22,

Re: [ceph-users] Need help for PG problem

2016-03-22 Thread Oliver Dzombic
Hi Zhang, are you sure that all your 20 OSDs are up and in? Please provide the complete output of ceph -s, or better, with the detail flag. Thank you :-) -- Mit freundlichen Gruessen / Best regards Oliver Dzombic IP-Interactive mailto:i...@ip-interactive.de Anschrift: IP Interactive UG (
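The sort of output being asked for can be gathered with:

    ceph osd stat          # e.g. "20 osds: 20 up, 20 in"
    ceph osd tree          # shows which host each OSD sits under
    ceph -s                # overall cluster status
    ceph health detail     # per-PG detail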

Re: [ceph-users] Need help for PG problem

2016-03-22 Thread M Ranga Swami Reddy
Hi, Can you please share the "ceph health detail" output? Thanks Swami On Tue, Mar 22, 2016 at 3:32 PM, Zhang Qiang wrote: > Hi all, > > I have 20 OSDs and 1 pool, and, as recommended by the > doc(http://docs.ceph.com/docs/master/rados/operations/placement-groups/), I >

[ceph-users] Need help for PG problem

2016-03-22 Thread Zhang Qiang
Hi all, I have 20 OSDs and 1 pool, and, as recommended by the doc( http://docs.ceph.com/docs/master/rados/operations/placement-groups/), I configured pg_num and pgp_num to 4096, size 2, min size 1. But ceph -s shows: HEALTH_WARN 534 pgs degraded 551 pgs stuck unclean 534 pgs undersized too many
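For reference, the configuration described above maps onto roughly these commands (the pool name is assumed, it is not given in the message):

    ceph osd pool create data 4096 4096    # pg_num and pgp_num
    ceph osd pool set data size 2
    ceph osd pool set data min_size 1
    ceph -s                                # reports the HEALTH_WARN quoted above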