Re: [ceph-users] PG Scaling

McNamara, Bradley Wed, 12 Mar 2014 16:02:21 -0700

Most things will cause data movement...

If you are going to have different failure zones within your crush map, I would 
edit your crush map and define those failure zones/buckets first.  Data will 
start moving as soon as you inject the new crush map into the cluster.
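
For reference, a crush map edit usually goes through an export, decompile, 
edit, recompile, inject cycle.  A minimal sketch, assuming placeholder file 
names (crushmap.bin, crushmap.txt, crushmap.new are not from this thread) and 
a hypothetical dry-run wrapper that only prints the commands unless DRY_RUN=0:

```shell
#!/bin/sh
# Dry-run by default: print each command instead of executing it.
# Set DRY_RUN=0 to actually run against a cluster.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Step 1: pull the compiled crush map and decompile it to editable text.
export_crush_map() {
    run ceph osd getcrushmap -o crushmap.bin
    run crushtool -d crushmap.bin -o crushmap.txt
}

# Step 2: recompile the edited text and inject it.
# Injecting is the step that triggers data movement.
inject_crush_map() {
    run crushtool -c crushmap.txt -o crushmap.new
    run ceph osd setcrushmap -i crushmap.new
}

export_crush_map
# Edit crushmap.txt by hand (define the failure zones/buckets), then:
# inject_crush_map
```

The inject step is left commented out so that nothing is pushed to the cluster 
before the text file has been edited and reviewed.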


Once the data movement from the new crush map is done, then I would change the 
number of placement groups.  This will immediately cause data movement, too.

If you have a cluster network defined and in use, the data movement shouldn't 
materially affect the running cluster.  Response times may be longer than 
usual, but the cluster will be completely functional.

Brad

From: Karol Kozubal [mailto:[email protected]]
Sent: Wednesday, March 12, 2014 1:52 PM
To: McNamara, Bradley; [email protected]
Subject: Re: PG Scaling

Thank you for your response.

The number of replicas is already set to 3. So if I simply increase the number 
of PGs, will data also start to move, or is that only triggered by size 
changes? I suppose that since this will generate traffic on the cluster 
network, it is best to do this operation while the cluster isn't as busy.

Karol


From: McNamara, Bradley <[email protected]>
Date: Wednesday, March 12, 2014 at 1:54 PM
To: Karol Kozubal <[email protected]>, [email protected]
Subject: RE: PG Scaling

Round your pg_num and pgp_num up to the next power of 2, which is 2048.
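
The arithmetic behind that number, as a small sketch.  The function names are 
made up for illustration; the formula is the (OSDs * 100) / replicas rule of 
thumb from the original message quoted further down:

```shell
#!/bin/sh
# Rule of thumb from this thread: (number of OSDs * 100) / replica count.
pg_target() {
    osds=$1
    replicas=$2
    echo $(( osds * 100 / replicas ))
}

# Smallest power of 2 that is greater than or equal to the argument.
next_pow2() {
    n=$1
    p=1
    while [ "$p" -lt "$n" ]; do
        p=$(( p * 2 ))
    done
    echo "$p"
}

# 20 nodes * 3 OSDs per node = 60 OSDs, 3 replicas.
raw=$(pg_target 60 3)
rounded=$(next_pow2 "$raw")
echo "raw=$raw rounded=$rounded"   # prints: raw=2000 rounded=2048
```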

Ceph will start moving data as soon as you implement the new 'size 3', so I 
would increase the pg_num and pgp_num first, then increase the size.  It will 
start creating the new PGs immediately.  You can see all this going on using 
'ceph -w'.
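
That ordering could be sketched as follows.  The pool names and the 2048 
target come from this thread; the `run` wrapper and `resize_pool` function are 
hypothetical helpers, and the script only prints the commands unless DRY_RUN=0. 
On a live cluster each step may need to wait for the previous one to settle 
(watch 'ceph -w') before the next is accepted:

```shell
#!/bin/sh
# Dry-run by default: print each command instead of executing it.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Raise pg_num, then pgp_num, and only then change the replica count.
resize_pool() {
    pool=$1
    pgs=$2
    size=$3
    run ceph osd pool set "$pool" pg_num "$pgs"
    run ceph osd pool set "$pool" pgp_num "$pgs"
    run ceph osd pool set "$pool" size "$size"
}

for pool in volumes images compute; do
    resize_pool "$pool" 2048 3
done
```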

Once the data is finished moving, you may need to run 'ceph osd crush tunables 
optimal'.  This should take care of any unclean PGs that may be hanging around.

It is NOT possible to decrease the number of PGs in a pool.  One would need to 
delete the pool and recreate it.

Brad

From: [email protected] [mailto:[email protected]] On Behalf Of Karol Kozubal
Sent: Wednesday, March 12, 2014 9:08 AM
To: [email protected]
Subject: Re: [ceph-users] PG Scaling

Correction: Sorry, min_size is at 1 everywhere.


Thank you.

Karol Kozubal

From: Karol Kozubal <[email protected]>
Date: Wednesday, March 12, 2014 at 12:06 PM
To: [email protected]
Subject: PG Scaling

Hi Everyone,

I am deploying OpenStack with Fuel 4.1 and have a 20-node Ceph deployment of 
c6220s with 3 OSDs and 1 journaling disk per node. When first deployed, each 
storage pool is configured with the correct size and min_size attributes; 
however, Fuel doesn't seem to apply the correct number of PGs to the pools 
based on the number of OSDs that we actually have.

I make the adjustments using the following:

(20 nodes * 3 OSDs)*100 / 3 replicas = 2000

ceph osd pool set volumes size 3
ceph osd pool set volumes min_size 3
ceph osd pool set volumes pg_num 2000
ceph osd pool set volumes pgp_num 2000

ceph osd pool set images size 3
ceph osd pool set images min_size 3
ceph osd pool set images pg_num 2000
ceph osd pool set images pgp_num 2000

ceph osd pool set compute size 3
ceph osd pool set compute min_size 3
ceph osd pool set compute pg_num 2000
ceph osd pool set compute pgp_num 2000

Here are the questions I am left with concerning these changes:

 1.  How long does it take for Ceph to apply the changes and recalculate the 
     PGs?
 2.  When is it safe to do this type of operation? Before any data is written 
     to the pools, or is doing this while the pools are in use acceptable?
 3.  Is it possible to scale down the number of PGs?

Thank you for your input.

Karol Kozubal

_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
