Re: [ceph-users] PGs per OSD guidance

2017-07-19 Thread David Turner
Here are a few thoughts. The more PGs you have, the higher the memory
requirement for the OSD process. If scrubs are interfering with customer I/O,
check the I/O priority settings that received a big overhaul in Jewel and
again in 10.2.9. The more PGs you have, the smaller each one will be, so each
individual scrub will finish faster... but you'll have that many more scrubs
to do, so scrubbing everything will still take about the same amount of time
because the total amount of data is unchanged.
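
As a rough starting point, these are the sorts of knobs I would look at for
scrub throttling (a sketch for a Jewel-era cluster; treat the values as
placeholders and check the option names against your exact release):

    # ceph.conf, [osd] section
    osd max scrubs = 1                   # concurrent scrubs per OSD (default is already 1)
    osd scrub sleep = 0.1                # pause between scrub chunks to yield to client I/O
    osd scrub begin hour = 22            # confine scheduled scrubs to off-peak hours
    osd scrub end hour = 6
    osd disk thread ioprio class = idle  # only takes effect with the CFQ disk scheduler
    osd disk thread ioprio priority = 7

    # or apply at runtime without restarting the OSDs:
    ceph tell osd.* injectargs '--osd_scrub_sleep 0.1'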

Generally, increasing PG counts is aimed at improving the distribution of
data between OSDs, maintaining a desired PG size as data grows (not a common
concern), or maintaining a desired number of PGs per OSD when you add OSDs to
your cluster. Outside of those reasons, I don't know of any benefits to
increasing PG counts.
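
If you want to see how even (or uneven) the current distribution is, the CLI
can show it per OSD; something along these lines (the rule of thumb in the
comment is the usual pgcalc-style guidance, not a hard rule):

    ceph osd df tree   # the PGS column shows placement groups per OSD
    ceph pg dump osds  # per-OSD stats if you want more detail

    # common sizing rule of thumb for a pool:
    #   total PGs ~= (OSDs serving the pool * 100) / replica count,
    #   rounded up to the next power of two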

On Wed, Jul 19, 2017, 9:57 PM Adrian Saul <adrian.s...@tpgtelecom.com.au>
wrote:

>
> Anyone able to offer any advice on this?
>
> Cheers,
>  Adrian


Re: [ceph-users] PGs per OSD guidance

2017-07-19 Thread Adrian Saul

Anyone able to offer any advice on this?

Cheers,
 Adrian


[ceph-users] PGs per OSD guidance

2017-07-14 Thread Adrian Saul
Hi All,
   I have been reviewing the sizing of our PGs in light of some intermittent
performance issues.  When we have scrubs running, even when only a few are, we
can sometimes see severe impacts on the performance of RBD images, enough to
start causing VMs to appear stalled or unresponsive.  When some of these
scrubs are running I can see very high latency on some disks, which I suspect
is what is impacting the performance.  We currently have around 70 PGs per
SATA OSD, and 140 PGs per SSD OSD.  These numbers are probably not very
representative, as most of the data sits in only about half of the pools, so
some PGs will be fairly heavy while others are practically empty.  From what I
have read we should be able to go significantly higher, though.  We are
running 10.2.1, if that matters in this context.
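
For context, the correlation I am describing is between the number of PGs
scrubbing and the per-OSD latency at the time, roughly as reported by:

    ceph -s                                 # how many PGs are scrubbing / deep-scrubbing
    ceph pg dump pgs_brief | grep -c scrub  # count of PGs currently in a scrub state
    ceph osd perf                           # apply/commit latency per OSD while scrubs run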

 My question is: if we increase the number of PGs, is that likely to help
reduce the scrub impact, or just spread it wider?  For example, does the mere
act of scrubbing one PG mean the underlying disk is going to be hammered, so
that we impact more PGs with that load, or would having more PGs mean the time
to scrub each PG is reduced, so that the impact is more dispersed?
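
If the answer is that more PGs would help, I assume we would grow pg_num in
modest steps and then pgp_num to actually rebalance, something like the
following (pool name and target are just placeholders):

    ceph osd pool get rbd pg_num        # current value
    ceph osd pool set rbd pg_num 1024   # raise in increments to limit the peering impact
    ceph osd pool set rbd pgp_num 1024  # then raise pgp_num to trigger the actual data movement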

I am also curious, from a performance standpoint, whether we are better off
with more PGs to reduce PG lock contention, etc.

Cheers,
 Adrian


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com