Re: [ceph-users] Need help for PG problem

Goncalo Borges Wed, 23 Mar 2016 16:27:09 -0700

Hi Zhang...

I think you are dealing with two different problems.


The first problem concerns the number of PGs per OSD. That was already 
discussed, and there are no more messages about it now.

The second problem you are experiencing seems to be that all your OSDs are 
under the same host. Besides that, osd.0 appears twice, under two different 
hosts (I do not really know why that is happening). If you are using the 
default CRUSH rules, which place each replica on a different host, Ceph is not 
able to replicate objects (even with size 2) across two different hosts, 
because all your OSDs are in just one host.

Cheers
Goncalo

________________________________
From: Zhang Qiang [[email protected]]
Sent: 23 March 2016 23:17
To: Goncalo Borges
Cc: Oliver Dzombic; ceph-users
Subject: Re: [ceph-users] Need help for PG problem

And here's the osd tree if it matters.

ID WEIGHT   TYPE NAME       UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 22.39984 root default
-2 21.39984     host 10
 0  1.06999         osd.0        up  1.00000          1.00000
 1  1.06999         osd.1        up  1.00000          1.00000
 2  1.06999         osd.2        up  1.00000          1.00000
 3  1.06999         osd.3        up  1.00000          1.00000
 4  1.06999         osd.4        up  1.00000          1.00000
 5  1.06999         osd.5        up  1.00000          1.00000
 6  1.06999         osd.6        up  1.00000          1.00000
 7  1.06999         osd.7        up  1.00000          1.00000
 8  1.06999         osd.8        up  1.00000          1.00000
 9  1.06999         osd.9        up  1.00000          1.00000
10  1.06999         osd.10       up  1.00000          1.00000
11  1.06999         osd.11       up  1.00000          1.00000
12  1.06999         osd.12       up  1.00000          1.00000
13  1.06999         osd.13       up  1.00000          1.00000
14  1.06999         osd.14       up  1.00000          1.00000
15  1.06999         osd.15       up  1.00000          1.00000
16  1.06999         osd.16       up  1.00000          1.00000
17  1.06999         osd.17       up  1.00000          1.00000
18  1.06999         osd.18       up  1.00000          1.00000
19  1.06999         osd.19       up  1.00000          1.00000
-3  1.00000     host 148_96
 0  1.00000         osd.0        up  1.00000          1.00000
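
The layout in the tree above can also be checked mechanically. The following is a minimal Python sketch, not an official Ceph tool: it assumes the plain-text `ceph osd tree` layout shown in this thread (ID, WEIGHT, then TYPE NAME for buckets or the OSD name for leaves), and the sample text is an abbreviated excerpt of that tree. The function names `osds_by_host` and `duplicated_osds` are hypothetical helpers introduced only for illustration.

```python
import re
from collections import defaultdict

# Abbreviated excerpt of the osd tree quoted in this thread (assumption:
# the real output has the same column layout for every row).
OSD_TREE = """\
ID WEIGHT   TYPE NAME       UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 22.39984 root default
-2 21.39984     host 10
 0  1.06999         osd.0        up  1.00000          1.00000
 1  1.06999         osd.1        up  1.00000          1.00000
-3  1.00000     host 148_96
 0  1.00000         osd.0        up  1.00000          1.00000
"""


def osds_by_host(tree_text):
    """Group OSD names under the most recent 'host' bucket line."""
    hosts = defaultdict(list)
    current = None
    for line in tree_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[2] == "host":
            current = fields[3]
        elif (
            current is not None
            and len(fields) >= 3
            and re.fullmatch(r"osd\.\d+", fields[2])
        ):
            hosts[current].append(fields[2])
    return dict(hosts)


def duplicated_osds(hosts):
    """Return OSD names that are listed under more than one host."""
    seen = defaultdict(set)
    for host, osds in hosts.items():
        for osd in osds:
            seen[osd].add(host)
    return {osd: sorted(names) for osd, names in seen.items() if len(names) > 1}


hosts = osds_by_host(OSD_TREE)
print(hosts)
print(duplicated_osds(hosts))
```

On the full tree this kind of check would show 20 OSDs under host 10 and osd.0 listed a second time under host 148_96, which is the situation described in the reply at the top of this message.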

On Wed, 23 Mar 2016 at 19:10 Zhang Qiang <[email protected]> wrote:
Oliver, Goncalo,

Sorry to disturb you again, but recreating the pool with a smaller pg_num 
didn't seem to work. Now all 666 PGs are degraded + undersized.

New status:
    cluster d2a69513-ad8e-4b25-8f10-69c4041d624d
     health HEALTH_WARN
            666 pgs degraded
            82 pgs stuck unclean
            666 pgs undersized
     monmap e5: 5 mons at {1=10.3.138.37:6789/0,2=10.3.138.39:6789/0,3=10.3.138.40:6789/0,4=10.3.138.59:6789/0,GGZ-YG-S0311-PLATFORM-138=10.3.138.36:6789/0}
            election epoch 28, quorum 0,1,2,3,4 GGZ-YG-S0311-PLATFORM-138,1,2,3,4
     osdmap e705: 20 osds: 20 up, 20 in
      pgmap v1961: 666 pgs, 1 pools, 0 bytes data, 0 objects
            13223 MB used, 20861 GB / 21991 GB avail
                 666 active+undersized+degraded

There is only one pool and its size is 3, so I think that, according to the 
algorithm, (20 * 100) / 3 = 666 PGs is reasonable.
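
The arithmetic above can be sketched in Python as follows. This is only an illustration of the rule of thumb quoted in this message, (OSDs * 100) / pool size; the function name `suggested_pg_counts` and the default of 100 PGs per OSD are assumptions taken from that formula, not a Ceph API. Because Ceph's placement-group guidance generally recommends a power-of-two pg_num, the sketch also reports the neighboring powers of two; which one to choose is left as a judgment call.

```python
def suggested_pg_counts(num_osds, pool_size, target_pgs_per_osd=100):
    """Return (raw, lower_pow2, upper_pow2) for a single-pool cluster.

    raw is the integer result of (num_osds * target_pgs_per_osd) / pool_size;
    the other two values are the powers of two just below and just above it.
    Assumes the raw value is at least 1.
    """
    raw = (num_osds * target_pgs_per_osd) // pool_size
    lower = 1 << (raw.bit_length() - 1)
    upper = lower if lower == raw else lower * 2
    return raw, lower, upper


# Values from this thread: 20 OSDs, one pool with size 3.
print(suggested_pg_counts(20, 3))
```

For the values in this thread the raw result is 666, which falls between 512 and 1024.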

I updated the health detail and also attached a pg query result in a gist 
(https://gist.github.com/dotSlashLu/22623b4cefa06a46e0d4).

On Wed, 23 Mar 2016 at 09:01 Dotslash Lu <[email protected]> wrote:
Hello Gonçalo,

Thanks for the reminder. I was just setting up the cluster for testing, so 
don't worry, I can just remove the pool. I have also learnt that the 
replication number and the number of pools are related to pg_num, so I'll 
consider them carefully before deploying any data.

On Mar 23, 2016, at 6:58 AM, Goncalo Borges <[email protected]> wrote:

Hi Zhang...

If I can add some more info: changing the number of PGs is a heavy operation, 
and as far as I know, you should NEVER decrease PGs. From the notes in pgcalc 
(http://ceph.com/pgcalc/):

"It's also important to know that the PG count can be increased, but NEVER 
decreased without destroying / recreating the pool. However, increasing the PG 
Count of a pool is one of the most impactful events in a Ceph Cluster, and 
should be avoided for production clusters if possible."

So, in your case, I would consider adding more OSDs.

Cheers
Goncalo

_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
