Re: [ceph-users] [EXTERNAL] Re: Increase PG number

2016-09-19 Thread David Turner
We regrettably have to increase PGs in a ceph cluster this way more often than 
anyone should ever need to.  As such, we have scripted it out.  A basic version 
of the script that should work for you is below.

First, create a function that checks for any pg states you don't want to 
continue past while any pgs are in them (better than duplicating code).  Second, 
set the flags so your cluster doesn't die while you do this.  Third, set your 
current and destination PG counts for the for loop.  The loop will skip any 
number not divisible by 256.  As you've found, increasing by 256 at a time is 
a good number.  More than that and you'll run into issues of your cluster 
curling into a fetal position and crying.  The loop increases your pg_num, 
waits until everything is settled, then increases your pgp_num.  The seemingly 
excessive sleeps are there to help the cluster resolve the blocked requests 
that will still happen during this.  Lastly, unset the flags to let the cluster 
start moving the data around.

One thing to note: in a cluster with 800-1000 HDD OSDs with SSD journals, going 
from 16k to 32k PGs, we set osd_max_backfills to 1 during busy times and 2 during 
idle times.  A max backfills value above 2 has not been beneficial for us when 
increasing our PG count; we have tested values of 2 and 5, and both took the 
entire weekend to add 4k PGs.  We also do not add all of the PGs at once.  We do 
4k each weekend and 2k during the week, waiting for the cluster to finish each 
time so our mon stores have a chance to compact before we continue.
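
For reference, we flip that backfill setting at runtime with something like the 
following.  This is only a sketch: it assumes injectargs is acceptable in your 
environment, and the values are illustrative.

#Throttle backfill while clients are busy
ceph tell osd.* injectargs '--osd-max-backfills 1'
#Allow a little more parallelism during idle hours
ceph tell osd.* injectargs '--osd-max-backfills 2'

The script itself follows: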



check_health(){
    #If this finds any of the strings in the grep then it will return 0, otherwise it will return 1 (or whatever the grep return code is)
    ceph health | grep 'peering\|stale\|activating\|creating\|down' > /dev/null
    return $?
}

for flag in nobackfill norecover noout nodown
do
    ceph osd set $flag
done

#Set your current and destination pg counts here.
for num in {2048..16384}
do
    [ $(( num % 256 )) -eq 0 ] || continue
    while sleep 10
    do
        check_health
        if [ $? -ne 0 ]
        then
            #This assumes your pool is named rbd
            ceph osd pool set rbd pg_num $num
            break
        fi
    done
    sleep 60
    while sleep 10
    do
        check_health
        if [ $? -ne 0 ]
        then
            #This assumes your pool is named rbd
            ceph osd pool set rbd pgp_num $num
            break
        fi
    done
    sleep 60
done

for flag in nobackfill norecover noout nodown
do
    ceph osd unset $flag
done
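
Once the flags are unset, a quick way to keep an eye on things (plain ceph CLI, 
nothing specific to our environment; it assumes the same rbd pool name as the 
script):

#Watch overall health and recovery/backfill progress
watch -n 10 'ceph -s'
#Confirm the pool ended up at the pg/pgp counts you expect
ceph osd pool get rbd pg_num
ceph osd pool get rbd pgp_num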







David Turner | Cloud Operations Engineer | StorageCraft Technology Corporation <https://storagecraft.com>
380 Data Drive Suite 300 | Draper | Utah | 84020
Office: 801.871.2760 | Mobile: 385.224.2943







From: ceph-users [ceph-users-boun...@lists.ceph.com] on behalf of Matteo 
Dacrema [mdacr...@enter.eu]
Sent: Monday, September 19, 2016 2:51 AM
To: Will.Boege; ceph-users@lists.ceph.com
Subject: Re: [ceph-users] [EXTERNAL] Re: Increase PG number

Hi,

I’ve 3 different cluster.
The first I’ve been able to upgrade from 1024 to 2048 pgs with 10 minutes of 
"io freeze”.
The second I’ve been able to upgrade from 368 to 512 in a sec without any 
performance issue, but from 512 to 1024 it take over 20 minutes to create pgs.
The third I’ve to upgrade is now 2048 pgs and I’ve to take it to 16384. So what 
I’m wondering is how to do it with minimum performance impact.

Maybe the best way is to upgrade by 256 to 256 pg and pgp num each time letting 
the cluster to rebalance every time.

Thanks
Matteo



Re: [ceph-users] [EXTERNAL] Re: Increase PG number

2016-09-19 Thread Matteo Dacrema
Hi,

I’ve 3 different cluster.
The first I’ve been able to upgrade from 1024 to 2048 pgs with 10 minutes of 
"io freeze”.
The second I’ve been able to upgrade from 368 to 512 in a sec without any 
performance issue, but from 512 to 1024 it take over 20 minutes to create pgs.
The third I’ve to upgrade is now 2048 pgs and I’ve to take it to 16384. So what 
I’m wondering is how to do it with minimum performance impact.

Maybe the best way is to upgrade by 256 to 256 pg and pgp num each time letting 
the cluster to rebalance every time.
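
For example, a single step of that kind would look roughly like this (a sketch 
only, assuming a pool named rbd currently at 2048 PGs; wait for the new PGs to 
be created and peered, and for the cluster to settle, before the next step):

#Bump placement groups by one 256 step
ceph osd pool set rbd pg_num 2304
#Once the new PGs are created and peered:
ceph osd pool set rbd pgp_num 2304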

Thanks
Matteo


> On 19 Sep 2016, at 05:22, Will.Boege wrote:
> 
> How many PGs do you have - and how many are you increasing it to?
> 
> Increasing PG counts can be disruptive if you are increasing by a large 
> proportion of the initial count, because of all the PG peering involved.  If you 
> are doubling the amount of PGs it might be good to do it in stages to 
> minimize peering.  For example, if you are going from 1024 to 2048, consider 
> 4 increases of 256, allowing the cluster to stabilize in between, rather than 
> one event that doubles the number of PGs.
> 
> If you expect this cluster to grow, overshoot the recommended PG count by 50% 
> or so.  This will allow you to minimize the PG increase events, and thus the 
> impact to your users.
> 
> From: ceph-users on behalf of Matteo Dacrema <mdacr...@enter.eu>
> Date: Sunday, September 18, 2016 at 3:29 PM
> To: Goncalo Borges <goncalo.bor...@sydney.edu.au>, "ceph-users@lists.ceph.com" 
> <ceph-users@lists.ceph.com>
> Subject: [EXTERNAL] Re: [ceph-users] Increase PG number
> 
> Hi, thanks for your reply.
> 
> Yes, I don't have any near full OSDs.
> 
> The problem is not the rebalancing process but the process of creating the 
> new pgs.
> 
> I have only 2 hosts running the Ceph Firefly version, each with 3 SSDs for journaling.
> During the creation of the new pgs all the attached volumes stop reading and 
> writing, showing high iowait.
> ceph -s tells me that there are thousands of slow requests.
> 
> When all the pgs are created, the slow requests begin to decrease and the 
> cluster starts the rebalancing process.
> 
> Matteo
> 
> 
>> On 18 Sep 2016, at 13:08, Goncalo Borges <goncalo.bor...@sydney.edu.au> wrote:
>> 
>> Hi
>> I am assuming that you do not have any near full OSDs (either before or 
>> during the pg splitting process) and that your cluster is healthy. 
>> 
>> To minimize the impact on the clients during recovery or operations like pg 
>> splitting, it is good to set the following configs. Obviously the whole 
>> operation will take longer, but the impact on clients will be minimized.
>> 
>> #  ceph daemon mon.rccephmon1 config show | egrep 
>> "(osd_max_backfills|osd_recovery_threads|osd_recovery_op_priority|osd_client_op_priority|osd_recovery_max_active)"
>>"osd_max_backfills": "1",
>>"osd_recovery_threads": "1",
>>"osd_recovery_max_active": "1"
>>"osd_client_op_priority": "63",
>>"osd_recovery_op_priority": "1"
>> 
>> Cheers
>> G.
>> 

Re: [ceph-users] [EXTERNAL] Re: Increase PG number

2016-09-18 Thread Will . Boege
How many PGs do you have - and how many are you increasing it to?

Increasing PG counts can be disruptive if you are increasing by a large 
proportion of the initial count, because of all the PG peering involved.  If you 
are doubling the amount of PGs it might be good to do it in stages to minimize 
peering.  For example, if you are going from 1024 to 2048, consider 4 increases 
of 256, allowing the cluster to stabilize in between, rather than one event 
that doubles the number of PGs.
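
Concretely, those stages would look something like the sketch below (assuming 
a pool named rbd; in practice wait for the cluster to return to HEALTH_OK 
between commands rather than relying on the bare loop):

for n in 1280 1536 1792 2048
do
    ceph osd pool set rbd pg_num $n
    #Wait for the new PGs to be created and peered, then:
    ceph osd pool set rbd pgp_num $n
done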

If you expect this cluster to grow, overshoot the recommended PG count by 50% 
or so.  This will allow you to minimize the PG increase events, and thus the 
impact to your users.
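
For a rough sense of what "recommended" means here, the usual rule of thumb is 
about (number of OSDs * 100) / replica size, rounded up to the next power of 
two; the figures below are purely illustrative:

#e.g. 600 OSDs and size 3 pools: 600 * 100 / 3 = 20000, round up to 32768
echo $(( 600 * 100 / 3 ))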

From: ceph-users <ceph-users-boun...@lists.ceph.com> on behalf of Matteo 
Dacrema <mdacr...@enter.eu>
Date: Sunday, September 18, 2016 at 3:29 PM
To: Goncalo Borges <goncalo.bor...@sydney.edu.au>, "ceph-users@lists.ceph.com" 
<ceph-users@lists.ceph.com>
Subject: [EXTERNAL] Re: [ceph-users] Increase PG number

Hi, thanks for your reply.

Yes, I don't have any near full OSDs.

The problem is not the rebalancing process but the process of creating the new 
pgs.

I have only 2 hosts running the Ceph Firefly version, each with 3 SSDs for journaling.
During the creation of the new pgs all the attached volumes stop reading and 
writing, showing high iowait.
ceph -s tells me that there are thousands of slow requests.

When all the pgs are created, the slow requests begin to decrease and the 
cluster starts the rebalancing process.

Matteo


On 18 Sep 2016, at 13:08, Goncalo Borges <goncalo.bor...@sydney.edu.au> wrote:

Hi
I am assuming that you do not have any near full OSDs (either before or during 
the pg splitting process) and that your cluster is healthy.

To minimize the impact on the clients during recovery or operations like pg 
splitting, it is good to set the following configs. Obviously the whole 
operation will take longer, but the impact on clients will be minimized.

#  ceph daemon mon.rccephmon1 config show | egrep 
"(osd_max_backfills|osd_recovery_threads|osd_recovery_op_priority|osd_client_op_priority|osd_recovery_max_active)"
   "osd_max_backfills": "1",
   "osd_recovery_threads": "1",
   "osd_recovery_max_active": "1"
   "osd_client_op_priority": "63",
   "osd_recovery_op_priority": "1"

Cheers
G.

From: ceph-users 
[ceph-users-boun...@lists.ceph.com] 
on behalf of Matteo Dacrema [mdacr...@enter.eu]
Sent: 18 September 2016 03:42
To: ceph-users@lists.ceph.com
Subject: [ceph-users] Increase PG number

Hi All,

I need to expand my ceph cluster and I also need to increase the pg number.
In a test environment I see that during pg creation all read and write 
operations are stopped.

Is that normal behavior?

Thanks
Matteo





___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com