Yes, it was the CRUSH map. I updated it to distribute the 20 OSDs correctly across 2 hosts, and now all PGs are finally healthy.
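For anyone who hits the same symptom, here is a minimal Python sketch of the check that would have flagged the layout problem. It is illustrative only: the `osd_tree` dictionary is a hand-written copy of the osd tree quoted later in this thread, not something parsed from real `ceph osd tree` output, and the helper names are hypothetical.

```python
from collections import defaultdict

# Hypothetical, hand-written copy of the host -> OSD layout from the osd tree
# quoted in this thread (not parsed from real `ceph osd tree` output).
osd_tree = {
    "10": list(range(20)),   # osd.0 .. osd.19
    "148_96": [0],           # osd.0 listed a second time
}

def find_duplicate_osds(tree):
    """Return {osd_id: sorted hosts} for OSD ids listed under more than one host."""
    hosts_by_osd = defaultdict(set)
    for host, osds in tree.items():
        for osd in osds:
            hosts_by_osd[osd].add(host)
    return {osd: sorted(hosts) for osd, hosts in hosts_by_osd.items() if len(hosts) > 1}

def hosts_with_unique_osds(tree):
    """Count hosts that own at least one OSD not claimed by any other host."""
    dups = find_duplicate_osds(tree)
    return sum(1 for osds in tree.values() if any(o not in dups for o in osds))

duplicates = find_duplicate_osds(osd_tree)
usable_hosts = hosts_with_unique_osds(osd_tree)
print("duplicated OSD ids:", duplicates)
print("hosts with their own OSDs:", usable_hosts)
```

With the layout from the thread, this reports osd.0 under both hosts and only one host that really owns OSDs, which matches Goncalo's diagnosis that replicas could not be placed on separate hosts.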
Thanks guys, I really appreciate your help!

On Thu, 24 Mar 2016 at 07:25 Goncalo Borges <goncalo.bor...@sydney.edu.au> wrote:

> Hi Zhang...
>
> I think you are dealing with two different problems.
>
> The first problem refers to the number of PGs per OSD. That was already
> discussed, and now there are no more messages concerning it.
>
> The second problem you are experiencing seems to be that all your OSDs are
> under the same host. Besides that, osd.0 appears twice in two different
> hosts (I do not really know why that is happening). If you are using the
> default crush rules, ceph is not able to replicate objects (even with size
> 2) across two different hosts because all your OSDs are just in one host.
>
> Cheers
> Goncalo
>
> ------------------------------
> *From:* Zhang Qiang [dotslash...@gmail.com]
> *Sent:* 23 March 2016 23:17
> *To:* Goncalo Borges
> *Cc:* Oliver Dzombic; ceph-users
> *Subject:* Re: [ceph-users] Need help for PG problem
>
> And here's the osd tree if it matters.
>
> ID WEIGHT   TYPE NAME       UP/DOWN REWEIGHT PRIMARY-AFFINITY
> -1 22.39984 root default
> -2 21.39984     host 10
>  0  1.06999         osd.0        up  1.00000          1.00000
>  1  1.06999         osd.1        up  1.00000          1.00000
>  2  1.06999         osd.2        up  1.00000          1.00000
>  3  1.06999         osd.3        up  1.00000          1.00000
>  4  1.06999         osd.4        up  1.00000          1.00000
>  5  1.06999         osd.5        up  1.00000          1.00000
>  6  1.06999         osd.6        up  1.00000          1.00000
>  7  1.06999         osd.7        up  1.00000          1.00000
>  8  1.06999         osd.8        up  1.00000          1.00000
>  9  1.06999         osd.9        up  1.00000          1.00000
> 10  1.06999         osd.10       up  1.00000          1.00000
> 11  1.06999         osd.11       up  1.00000          1.00000
> 12  1.06999         osd.12       up  1.00000          1.00000
> 13  1.06999         osd.13       up  1.00000          1.00000
> 14  1.06999         osd.14       up  1.00000          1.00000
> 15  1.06999         osd.15       up  1.00000          1.00000
> 16  1.06999         osd.16       up  1.00000          1.00000
> 17  1.06999         osd.17       up  1.00000          1.00000
> 18  1.06999         osd.18       up  1.00000          1.00000
> 19  1.06999         osd.19       up  1.00000          1.00000
> -3  1.00000     host 148_96
>  0  1.00000         osd.0        up  1.00000          1.00000
>
> On Wed, 23 Mar 2016 at 19:10 Zhang Qiang <dotslash...@gmail.com> wrote:
>
>> Oliver, Goncalo,
>>
>> Sorry to disturb again, but recreating the pool with a smaller pg_num
>> didn't seem to work, now all 666 pgs are degraded + undersized.
>>
>> New status:
>>     cluster d2a69513-ad8e-4b25-8f10-69c4041d624d
>>      health HEALTH_WARN
>>             666 pgs degraded
>>             82 pgs stuck unclean
>>             666 pgs undersized
>>      monmap e5: 5 mons at {1=10.3.138.37:6789/0,2=10.3.138.39:6789/0,3=10.3.138.40:6789/0,4=10.3.138.59:6789/0,GGZ-YG-S0311-PLATFORM-138=10.3.138.36:6789/0}
>>             election epoch 28, quorum 0,1,2,3,4 GGZ-YG-S0311-PLATFORM-138,1,2,3,4
>>      osdmap e705: 20 osds: 20 up, 20 in
>>       pgmap v1961: 666 pgs, 1 pools, 0 bytes data, 0 objects
>>             13223 MB used, 20861 GB / 21991 GB avail
>>                  666 active+undersized+degraded
>>
>> Only one pool and its size is 3. So I think according to the algorithm,
>> (20 * 100) / 3 = 666 pgs is reasonable.
>>
>> I updated health detail and also attached a pg query result on gist
>> (https://gist.github.com/dotSlashLu/22623b4cefa06a46e0d4).
>>
>> On Wed, 23 Mar 2016 at 09:01 Dotslash Lu <dotslash...@gmail.com> wrote:
>>
>>> Hello Gonçalo,
>>>
>>> Thanks for the reminder. I was just setting up the cluster for testing,
>>> so don't worry, I can just remove the pool. And I learnt that since the
>>> replication number and pool number are related to pg_num, I'll consider
>>> them carefully before deploying any data.
>>>
>>> On Mar 23, 2016, at 6:58 AM, Goncalo Borges
>>> <goncalo.bor...@sydney.edu.au> wrote:
>>>
>>> Hi Zhang...
>>>
>>> If I can add some more info, the change of PGs is a heavy operation, and
>>> as far as I know, you should NEVER decrease PGs. From the notes in pgcalc
>>> (http://ceph.com/pgcalc/):
>>>
>>> "It's also important to know that the PG count can be increased, but
>>> NEVER decreased without destroying / recreating the pool. However,
>>> increasing the PG Count of a pool is one of the most impactful events in a
>>> Ceph Cluster, and should be avoided for production clusters if possible."
>>>
>>> So, in your case, I would consider adding more OSDs.
>>>
>>> Cheers
>>> Goncalo
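
The pg_num arithmetic discussed in the thread above, (20 * 100) / 3, can be sketched in a few lines of Python. This is only an illustration of the rule of thumb as it was quoted, not the pgcalc tool itself: the function names are hypothetical and the target of 100 PGs per OSD is the figure used in the thread. Because Ceph's placement-group guidance generally recommends a power of two for pg_num, the sketch also prints the two neighbouring powers of two without claiming which one pgcalc would pick.

```python
def raw_pg_count(num_osds, pool_size, target_pgs_per_osd=100):
    """Rule-of-thumb total PG count: (OSDs * target PGs per OSD) / replica size."""
    return (num_osds * target_pgs_per_osd) // pool_size

def neighbouring_powers_of_two(n):
    """Return the largest power of two <= n and the smallest power of two >= n (n >= 1)."""
    lower = 1 << (n.bit_length() - 1)
    upper = lower if lower == n else lower << 1
    return lower, upper

# Values from the thread: 20 OSDs, one pool with size 3.
raw = raw_pg_count(num_osds=20, pool_size=3)
lower, upper = neighbouring_powers_of_two(raw)
print(f"raw estimate: {raw}, neighbouring powers of two: {lower} and {upper}")
```

Note that the choice of pg_num did not cause the undersized + degraded state reported here; as the newer messages in the thread explain, that came from the OSD-to-host layout in the CRUSH map.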
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com