Re: [ceph-users] deep-scrubbing has large impact on performance

2016-11-23 Thread Nick Fisk
Actually, this might suggest that caution should be taken before enabling these settings at the moment:

http://tracker.ceph.com/issues/15774


> -Original Message-
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Nick 
> Fisk
> Sent: 23 November 2016 11:17
> To: 'Robert LeBlanc' <rob...@leblancnet.us>; 'Eugen Block' <ebl...@nde.ag>
> Cc: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] deep-scrubbing has large impact on performance
> 
> Thanks for the tip Robert, much appreciated.
> 
> > -Original Message-
> > From: Robert LeBlanc [mailto:rob...@leblancnet.us]
> > Sent: 23 November 2016 00:54
> > To: Eugen Block <ebl...@nde.ag>
> > Cc: Nick Fisk <n...@fisk.me.uk>; ceph-users@lists.ceph.com
> > Subject: Re: [ceph-users] deep-scrubbing has large impact on
> > performance
> >
> > If you use wpq, I recommend also setting "osd_op_queue_cut_off = high";
> > otherwise replication OPs are not weighted, which really reduces the
> > benefit of wpq.
> > 
> > Robert LeBlanc
> > PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1
> >
> >
> > On Tue, Nov 22, 2016 at 5:34 AM, Eugen Block <ebl...@nde.ag> wrote:
> > > Thank you!
> > >
> > >
> > > Zitat von Nick Fisk <n...@fisk.me.uk>:
> > >
> > >>> -Original Message-----
> > >>> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On
> > >>> Behalf Of Eugen Block
> > >>> Sent: 22 November 2016 10:11
> > >>> To: Nick Fisk <n...@fisk.me.uk>
> > >>> Cc: ceph-users@lists.ceph.com
> > >>> Subject: Re: [ceph-users] deep-scrubbing has large impact on
> > >>> performance
> > >>>
> > >>> Thanks for the very quick answer!
> > >>>
> > >>> > If you are using Jewel
> > >>>
> > >>> We are still using Hammer (0.94.7). We wanted to upgrade to Jewel
> > >>> in a couple of weeks; would you recommend doing it now?
> > >>
> > >>
> > >> It's been fairly solid for me, but you might want to wait for the
> > >> scrubbing hang bug to be fixed before upgrading. I think this might
> > >> be fixed in the upcoming 10.2.4 release.
> > >>
> > >>>
> > >>>
> > >>> Zitat von Nick Fisk <n...@fisk.me.uk>:
> > >>>
> > >>> >> -Original Message-
> > >>> >> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On
> > >>> >> Behalf Of Eugen Block
> > >>> >> Sent: 22 November 2016 09:55
> > >>> >> To: ceph-users@lists.ceph.com
> > >>> >> Subject: [ceph-users] deep-scrubbing has large impact on
> > >>> >> performance
> > >>> >>
> > >>> >> Hi list,
> > >>> >>
> > >>> >> I've been searching the mail archive and the web for some help.
> > >>> >> I tried the things I found, but I can't see the effects. We use
> > >>> > Ceph for
> > >>> >> our Openstack environment.
> > >>> >>
> > >>> >> When our cluster (2 pools, each 4092 PGs, in 20 OSDs on 4 nodes, 3
> > >>> >> MONs) starts deep-scrubbing, it's impossible to work with the VMs.
> > >>> >> Currently, the deep-scrubs happen to start on Monday, which is
> > >>> >> unfortunate. I already plan to start the next deep-scrub on
> > >>> > Saturday,
> > >>> >> so it has no impact on our work days. But if I imagine we had a
> > >>> >> large multi-datacenter setup, such performance breaks are not
> > >>> > reasonable. So
> > >>> >> I'm wondering how do you guys manage that?
> > >>> >>
> > >>> >> What I've tried so far:
> > >>> >>
> > >>> >> ceph tell osd.* injectargs '--osd_scrub_sleep 0.1'
> > >>> >> ceph tell osd.* injectargs '--osd_disk_thread_ioprio_priority 7'
> > >>> >> ceph tell osd.* injectargs '--osd_disk_thread_ioprio_class idle'
> > >>> >> ceph tell osd.* injectargs '--osd_scrub_begin_hour 0'
> > >>> >> ceph tell osd.* injectargs '--osd_scrub_end_hour 7'

Re: [ceph-users] deep-scrubbing has large impact on performance

2016-11-23 Thread Nick Fisk
Thanks for the tip Robert, much appreciated.

> -Original Message-
> From: Robert LeBlanc [mailto:rob...@leblancnet.us]
> Sent: 23 November 2016 00:54
> To: Eugen Block <ebl...@nde.ag>
> Cc: Nick Fisk <n...@fisk.me.uk>; ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] deep-scrubbing has large impact on performance
> 
> If you use wpq, I recommend also setting "osd_op_queue_cut_off = high";
> otherwise replication OPs are not weighted, which really reduces the
> benefit of wpq.
> 
> Robert LeBlanc
> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1
> 
> 
> On Tue, Nov 22, 2016 at 5:34 AM, Eugen Block <ebl...@nde.ag> wrote:
> > Thank you!
> >
> >
> > Zitat von Nick Fisk <n...@fisk.me.uk>:
> >
> >>> -Original Message-
> >>> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On
> >>> Behalf Of Eugen Block
> >>> Sent: 22 November 2016 10:11
> >>> To: Nick Fisk <n...@fisk.me.uk>
> >>> Cc: ceph-users@lists.ceph.com
> >>> Subject: Re: [ceph-users] deep-scrubbing has large impact on
> >>> performance
> >>>
> >>> Thanks for the very quick answer!
> >>>
> >>> > If you are using Jewel
> >>>
> >>> We are still using Hammer (0.94.7). We wanted to upgrade to Jewel in
> >>> a couple of weeks; would you recommend doing it now?
> >>
> >>
> >> It's been fairly solid for me, but you might want to wait for the
> >> scrubbing hang bug to be fixed before upgrading. I think this might
> >> be fixed in the upcoming 10.2.4 release.
> >>
> >>>
> >>>
> >>> Zitat von Nick Fisk <n...@fisk.me.uk>:
> >>>
> >>> >> -Original Message-
> >>> >> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On
> >>> >> Behalf Of Eugen Block
> >>> >> Sent: 22 November 2016 09:55
> >>> >> To: ceph-users@lists.ceph.com
> >>> >> Subject: [ceph-users] deep-scrubbing has large impact on
> >>> >> performance
> >>> >>
> >>> >> Hi list,
> >>> >>
> >>> >> I've been searching the mail archive and the web for some help. I
> >>> >> tried the things I found, but I can't see the effects. We use
> >>> > Ceph for
> >>> >> our Openstack environment.
> >>> >>
> >>> >> When our cluster (2 pools, each 4092 PGs, in 20 OSDs on 4 nodes, 3
> >>> >> MONs) starts deep-scrubbing, it's impossible to work with the VMs.
> >>> >> Currently, the deep-scrubs happen to start on Monday, which is
> >>> >> unfortunate. I already plan to start the next deep-scrub on
> >>> > Saturday,
> >>> >> so it has no impact on our work days. But if I imagine we had a
> >>> >> large multi-datacenter setup, such performance breaks are not
> >>> > reasonable. So
> >>> >> I'm wondering how do you guys manage that?
> >>> >>
> >>> >> What I've tried so far:
> >>> >>
> >>> >> ceph tell osd.* injectargs '--osd_scrub_sleep 0.1'
> >>> >> ceph tell osd.* injectargs '--osd_disk_thread_ioprio_priority 7'
> >>> >> ceph tell osd.* injectargs '--osd_disk_thread_ioprio_class idle'
> >>> >> ceph tell osd.* injectargs '--osd_scrub_begin_hour 0'
> >>> >> ceph tell osd.* injectargs '--osd_scrub_end_hour 7'
> >>> >>
> >>> >> And I also added these options to the ceph.conf.
> >>> >> To be able to work again, I had to set the nodeep-scrub option
> >>> >> and unset it when I left the office. Today, I see the cluster
> >>> >> deep-scrubbing again, but only one PG at a time, it seems that
> >>> >> the default for osd_max_scrubs is working now and I don't see
> >>> >> major impacts yet.
> >>> >>
> >>> >> But is there something else I can do to reduce the performance impact?
> >>> >
> >>> > If you are using Jewel, the scrubbing is now done in the client IO
> >>> > thread, so those disk thread options won't do anything. Instead
> >>> > there is a new priority setting, which seems to work for me, along
> >>> > with a few other settings.

Re: [ceph-users] deep-scrubbing has large impact on performance

2016-11-22 Thread Robert LeBlanc
If you use wpq, I recommend also setting "osd_op_queue_cut_off = high";
otherwise replication OPs are not weighted, which really reduces
the benefit of wpq.
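
For reference, a minimal ceph.conf sketch combining the two settings (the
[osd] placement and the need to restart the OSDs are assumed here; these
queue options are generally only read at OSD start, so injectargs alone may
not be enough):

[osd]
# use the weighted priority queue for the OSD op queue
osd_op_queue = wpq
# weight replication ops the same way as client ops
osd_op_queue_cut_off = high

You can confirm what a running OSD actually uses with
"ceph daemon osd.<id> config get osd_op_queue" on that OSD's host.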

Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1


On Tue, Nov 22, 2016 at 5:34 AM, Eugen Block <ebl...@nde.ag> wrote:
> Thank you!
>
>
> Zitat von Nick Fisk <n...@fisk.me.uk>:
>
>>> -Original Message-
>>> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
>>> Eugen Block
>>> Sent: 22 November 2016 10:11
>>> To: Nick Fisk <n...@fisk.me.uk>
>>> Cc: ceph-users@lists.ceph.com
>>> Subject: Re: [ceph-users] deep-scrubbing has large impact on performance
>>>
>>> Thanks for the very quick answer!
>>>
>>> > If you are using Jewel
>>>
>>> We are still using Hammer (0.94.7). We wanted to upgrade to Jewel in a
>>> couple of weeks; would you recommend doing it now?
>>
>>
>> It's been fairly solid for me, but you might want to wait for the
>> scrubbing hang bug to be fixed before upgrading. I think this
>> might be fixed in the upcoming 10.2.4 release.
>>
>>>
>>>
>>> Zitat von Nick Fisk <n...@fisk.me.uk>:
>>>
>>> >> -Original Message-
>>> >> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf
>>> >> Of Eugen Block
>>> >> Sent: 22 November 2016 09:55
>>> >> To: ceph-users@lists.ceph.com
>>> >> Subject: [ceph-users] deep-scrubbing has large impact on performance
>>> >>
>>> >> Hi list,
>>> >>
>>> >> I've been searching the mail archive and the web for some help. I
>>> >> tried the things I found, but I can't see the effects. We use
>>> > Ceph for
>>> >> our Openstack environment.
>>> >>
>>> >> When our cluster (2 pools, each 4092 PGs, in 20 OSDs on 4 nodes, 3
>>> >> MONs) starts deep-scrubbing, it's impossible to work with the VMs.
>>> >> Currently, the deep-scrubs happen to start on Monday, which is
>>> >> unfortunate. I already plan to start the next deep-scrub on
>>> > Saturday,
>>> >> so it has no impact on our work days. But if I imagine we had a large
>>> >> multi-datacenter setup, such performance breaks are not
>>> > reasonable. So
>>> >> I'm wondering how do you guys manage that?
>>> >>
>>> >> What I've tried so far:
>>> >>
>>> >> ceph tell osd.* injectargs '--osd_scrub_sleep 0.1'
>>> >> ceph tell osd.* injectargs '--osd_disk_thread_ioprio_priority 7'
>>> >> ceph tell osd.* injectargs '--osd_disk_thread_ioprio_class idle'
>>> >> ceph tell osd.* injectargs '--osd_scrub_begin_hour 0'
>>> >> ceph tell osd.* injectargs '--osd_scrub_end_hour 7'
>>> >>
>>> >> And I also added these options to the ceph.conf.
>>> >> To be able to work again, I had to set the nodeep-scrub option and
>>> >> unset it when I left the office. Today, I see the cluster deep-
>>> >> scrubbing again, but only one PG at a time, it seems that the
>>> >> default for osd_max_scrubs is working now and I don't see major
>>> >> impacts yet.
>>> >>
>>> >> But is there something else I can do to reduce the performance impact?
>>> >
>>> > If you are using Jewel, the scrubbing is now done in the client IO
>>> > thread, so those disk thread options won't do anything. Instead there
>>> > is a new priority setting, which seems to work for me, along with a
>>> > few other settings.
>>> >
>>> > osd_scrub_priority = 1
>>> > osd_scrub_sleep = .1
>>> > osd_scrub_chunk_min = 1
>>> > osd_scrub_chunk_max = 5
>>> > osd_scrub_load_threshold = 5
>>> >
>>> > Also enabling the weighted priority queue can assist the new priority
>>> > options
>>> >
>>> > osd_op_queue = wpq
>>> >
>>> >
>>> >> I just found [1] and will have a look into it.
>>> >>
>>> >> [1] http://prob6.com/en/ceph-pg-deep-scrub-cron/
>>> >>
>>> >> Thanks!
>>> >> Eugen
>>> >>
>>> >> --
>>> >> Eugen Block voice   : +49-40-559 51 75

Re: [ceph-users] deep-scrubbing has large impact on performance

2016-11-22 Thread Eugen Block

Thank you!


Zitat von Nick Fisk <n...@fisk.me.uk>:


-Original Message-
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On  
Behalf Of Eugen Block

Sent: 22 November 2016 10:11
To: Nick Fisk <n...@fisk.me.uk>
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] deep-scrubbing has large impact on performance

Thanks for the very quick answer!

> If you are using Jewel

We are still using Hammer (0.94.7). We wanted to upgrade to Jewel
in a couple of weeks; would you recommend doing it now?


It's been fairly solid for me, but you might want to wait for the  
scrubbing hang bug to be fixed before upgrading. I think this

might be fixed in the upcoming 10.2.4 release.




Zitat von Nick Fisk <n...@fisk.me.uk>:

>> -Original Message-
>> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf
>> Of Eugen Block
>> Sent: 22 November 2016 09:55
>> To: ceph-users@lists.ceph.com
>> Subject: [ceph-users] deep-scrubbing has large impact on performance
>>
>> Hi list,
>>
>> I've been searching the mail archive and the web for some help. I
>> tried the things I found, but I can't see the effects. We use
> Ceph for
>> our Openstack environment.
>>
>> When our cluster (2 pools, each 4092 PGs, in 20 OSDs on 4 nodes, 3
>> MONs) starts deep-scrubbing, it's impossible to work with the VMs.
>> Currently, the deep-scrubs happen to start on Monday, which is
>> unfortunate. I already plan to start the next deep-scrub on
> Saturday,
>> so it has no impact on our work days. But if I imagine we had a large
>> multi-datacenter setup, such performance breaks are not
> reasonable. So
>> I'm wondering how do you guys manage that?
>>
>> What I've tried so far:
>>
>> ceph tell osd.* injectargs '--osd_scrub_sleep 0.1'
>> ceph tell osd.* injectargs '--osd_disk_thread_ioprio_priority 7'
>> ceph tell osd.* injectargs '--osd_disk_thread_ioprio_class idle'
>> ceph tell osd.* injectargs '--osd_scrub_begin_hour 0'
>> ceph tell osd.* injectargs '--osd_scrub_end_hour 7'
>>
>> And I also added these options to the ceph.conf.
>> To be able to work again, I had to set the nodeep-scrub option and
>> unset it when I left the office. Today, I see the cluster deep-
>> scrubbing again, but only one PG at a time, it seems that the
>> default for osd_max_scrubs is working now and I don't see major
>> impacts yet.
>>
>> But is there something else I can do to reduce the performance impact?
>
> If you are using Jewel, the scrubbing is now done in the client IO
> thread, so those disk thread options won't do anything. Instead there
> is a new priority setting, which seems to work for me, along with a
> few other settings.
>
> osd_scrub_priority = 1
> osd_scrub_sleep = .1
> osd_scrub_chunk_min = 1
> osd_scrub_chunk_max = 5
> osd_scrub_load_threshold = 5
>
> Also enabling the weighted priority queue can assist the new priority
> options
>
> osd_op_queue = wpq
>
>
>> I just found [1] and will have a look into it.
>>
>> [1] http://prob6.com/en/ceph-pg-deep-scrub-cron/
>>
>> Thanks!
>> Eugen
>>
>> --
>> Eugen Block voice   : +49-40-559 51 75
>> NDE Netzdesign und -entwicklung AG  fax : +49-40-559 51 77
>> Postfach 61 03 15
>> D-22423 Hamburg e-mail  : ebl...@nde.ag
>>
>>  Vorsitzende des Aufsichtsrates: Angelika Mozdzen
>>Sitz und Registergericht: Hamburg, HRB 90934
>>Vorstand: Jens-U. Mozdzen
>> USt-IdNr. DE 814 013 983
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



--
Eugen Block voice   : +49-40-559 51 75
NDE Netzdesign und -entwicklung AG  fax : +49-40-559 51 77
Postfach 61 03 15
D-22423 Hamburg e-mail  : ebl...@nde.ag

 Vorsitzende des Aufsichtsrates: Angelika Mozdzen
   Sitz und Registergericht: Hamburg, HRB 90934
   Vorstand: Jens-U. Mozdzen
USt-IdNr. DE 814 013 983

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




--
Eugen Block voice   : +49-40-559 51 75
NDE Netzdesign und -entwicklung AG  fax : +49-40-559 51 77
Postfach 61 03 15
D-22423 Hamburg e-mail  : ebl...@nde.ag

Vorsitzende des Aufsichtsrates: Angelika Mozdzen
  Sitz und Registergericht: Hamburg, HRB 90934
  Vorstand: Jens-U. Mozdzen
   USt-IdNr. DE 814 013 983

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] deep-scrubbing has large impact on performance

2016-11-22 Thread Nick Fisk
> -Original Message-
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of 
> Eugen Block
> Sent: 22 November 2016 10:11
> To: Nick Fisk <n...@fisk.me.uk>
> Cc: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] deep-scrubbing has large impact on performance
> 
> Thanks for the very quick answer!
> 
> > If you are using Jewel
> 
> We are still using Hammer (0.94.7). We wanted to upgrade to Jewel in a couple
> of weeks; would you recommend doing it now?

It's been fairly solid for me, but you might want to wait for the scrubbing 
hang bug to be fixed before upgrading. I think this
might be fixed in the upcoming 10.2.4 release.

> 
> 
> Zitat von Nick Fisk <n...@fisk.me.uk>:
> 
> >> -Original Message-
> >> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf
> >> Of Eugen Block
> >> Sent: 22 November 2016 09:55
> >> To: ceph-users@lists.ceph.com
> >> Subject: [ceph-users] deep-scrubbing has large impact on performance
> >>
> >> Hi list,
> >>
> >> I've been searching the mail archive and the web for some help. I
> >> tried the things I found, but I can't see the effects. We use
> > Ceph for
> >> our Openstack environment.
> >>
> >> When our cluster (2 pools, each 4092 PGs, in 20 OSDs on 4 nodes, 3
> >> MONs) starts deep-scrubbing, it's impossible to work with the VMs.
> >> Currently, the deep-scrubs happen to start on Monday, which is
> >> unfortunate. I already plan to start the next deep-scrub on
> > Saturday,
> >> so it has no impact on our work days. But if I imagine we had a large
> >> multi-datacenter setup, such performance breaks are not
> > reasonable. So
> >> I'm wondering how do you guys manage that?
> >>
> >> What I've tried so far:
> >>
> >> ceph tell osd.* injectargs '--osd_scrub_sleep 0.1'
> >> ceph tell osd.* injectargs '--osd_disk_thread_ioprio_priority 7'
> >> ceph tell osd.* injectargs '--osd_disk_thread_ioprio_class idle'
> >> ceph tell osd.* injectargs '--osd_scrub_begin_hour 0'
> >> ceph tell osd.* injectargs '--osd_scrub_end_hour 7'
> >>
> >> And I also added these options to the ceph.conf.
> >> To be able to work again, I had to set the nodeep-scrub option and
> >> unset it when I left the office. Today, I see the cluster deep-
> >> scrubbing again, but only one PG at a time, it seems that the
> >> default for osd_max_scrubs is working now and I don't see major
> >> impacts yet.
> >>
> >> But is there something else I can do to reduce the performance impact?
> >
> > If you are using Jewel, the scrubbing is now done in the client IO
> > thread, so those disk thread options won't do anything. Instead there
> > is a new priority setting, which seems to work for me, along with a
> > few other settings.
> >
> > osd_scrub_priority = 1
> > osd_scrub_sleep = .1
> > osd_scrub_chunk_min = 1
> > osd_scrub_chunk_max = 5
> > osd_scrub_load_threshold = 5
> >
> > Also enabling the weighted priority queue can assist the new priority
> > options
> >
> > osd_op_queue = wpq
> >
> >
> >> I just found [1] and will have a look into it.
> >>
> >> [1] http://prob6.com/en/ceph-pg-deep-scrub-cron/
> >>
> >> Thanks!
> >> Eugen
> >>
> >> --
> >> Eugen Block voice   : +49-40-559 51 75
> >> NDE Netzdesign und -entwicklung AG  fax : +49-40-559 51 77
> >> Postfach 61 03 15
> >> D-22423 Hamburg e-mail  : ebl...@nde.ag
> >>
> >>  Vorsitzende des Aufsichtsrates: Angelika Mozdzen
> >>Sitz und Registergericht: Hamburg, HRB 90934
> >>Vorstand: Jens-U. Mozdzen
> >> USt-IdNr. DE 814 013 983
> >>
> >> ___
> >> ceph-users mailing list
> >> ceph-users@lists.ceph.com
> >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
> 
> 
> --
> Eugen Block voice   : +49-40-559 51 75
> NDE Netzdesign und -entwicklung AG  fax : +49-40-559 51 77
> Postfach 61 03 15
> D-22423 Hamburg e-mail  : ebl...@nde.ag
> 
>  Vorsitzende des Aufsichtsrates: Angelika Mozdzen
>Sitz und Registergericht: Hamburg, HRB 90934
>Vorstand: Jens-U. Mozdzen
> USt-IdNr. DE 814 013 983
> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] deep-scrubbing has large impact on performance

2016-11-22 Thread Eugen Block

Thanks for the very quick answer!


If you are using Jewel


We are still using Hammer (0.94.7). We wanted to upgrade to Jewel in a
couple of weeks; would you recommend doing it now?



Zitat von Nick Fisk :


-Original Message-
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On  
Behalf Of Eugen Block

Sent: 22 November 2016 09:55
To: ceph-users@lists.ceph.com
Subject: [ceph-users] deep-scrubbing has large impact on performance

Hi list,

I've been searching the mail archive and the web for some help. I  
tried the things I found, but I can't see the effects. We use

Ceph for

our Openstack environment.

When our cluster (2 pools, each 4092 PGs, in 20 OSDs on 4 nodes, 3
MONs) starts deep-scrubbing, it's impossible to work with the VMs.
Currently, the deep-scrubs happen to start on Monday, which is  
unfortunate. I already plan to start the next deep-scrub on

Saturday,
so it has no impact on our work days. But if I imagine we had a  
large multi-datacenter setup, such performance breaks are not

reasonable. So

I'm wondering how do you guys manage that?

What I've tried so far:

ceph tell osd.* injectargs '--osd_scrub_sleep 0.1'
ceph tell osd.* injectargs '--osd_disk_thread_ioprio_priority 7'
ceph tell osd.* injectargs '--osd_disk_thread_ioprio_class idle'
ceph tell osd.* injectargs '--osd_scrub_begin_hour 0'
ceph tell osd.* injectargs '--osd_scrub_end_hour 7'

And I also added these options to the ceph.conf.
To be able to work again, I had to set the nodeep-scrub option and  
unset it when I left the office. Today, I see the cluster deep-
scrubbing again, but only one PG at a time, it seems that the
default for osd_max_scrubs is working now and I don't see major

impacts yet.

But is there something else I can do to reduce the performance impact?


If you are using Jewel, the scrubbing is now done in the client IO
thread, so those disk thread options won't do anything. Instead
there is a new priority setting, which seems to work for me, along  
with a few other settings.


osd_scrub_priority = 1
osd_scrub_sleep = .1
osd_scrub_chunk_min = 1
osd_scrub_chunk_max = 5
osd_scrub_load_threshold = 5

Also enabling the weighted priority queue can assist the new priority options

osd_op_queue = wpq



I just found [1] and will have a look into it.

[1] http://prob6.com/en/ceph-pg-deep-scrub-cron/
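
For completeness, a minimal sketch of that kind of manual scheduling (the
crontab times, the use of the nodeep-scrub flag and the placeholder PG id
are illustrative assumptions, not taken from the linked post):

# keep automatic deep scrubbing off during the working week
ceph osd set nodeep-scrub

# crontab: only let the cluster catch up over the weekend
# 0 1 * * 6   ceph osd unset nodeep-scrub
# 0 6 * * 1   ceph osd set nodeep-scrub

# or, alternatively, trigger specific placement groups in a quiet window
ceph pg deep-scrub <pgid>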

Thanks!
Eugen

--
Eugen Block voice   : +49-40-559 51 75
NDE Netzdesign und -entwicklung AG  fax : +49-40-559 51 77
Postfach 61 03 15
D-22423 Hamburg e-mail  : ebl...@nde.ag

 Vorsitzende des Aufsichtsrates: Angelika Mozdzen
   Sitz und Registergericht: Hamburg, HRB 90934
   Vorstand: Jens-U. Mozdzen
USt-IdNr. DE 814 013 983

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




--
Eugen Block voice   : +49-40-559 51 75
NDE Netzdesign und -entwicklung AG  fax : +49-40-559 51 77
Postfach 61 03 15
D-22423 Hamburg e-mail  : ebl...@nde.ag

Vorsitzende des Aufsichtsrates: Angelika Mozdzen
  Sitz und Registergericht: Hamburg, HRB 90934
  Vorstand: Jens-U. Mozdzen
   USt-IdNr. DE 814 013 983

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] deep-scrubbing has large impact on performance

2016-11-22 Thread Nick Fisk
> -Original Message-
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of 
> Eugen Block
> Sent: 22 November 2016 09:55
> To: ceph-users@lists.ceph.com
> Subject: [ceph-users] deep-scrubbing has large impact on performance
> 
> Hi list,
> 
> I've been searching the mail archive and the web for some help. I tried the 
> things I found, but I can't see the effects. We use
Ceph for
> our Openstack environment.
> 
> When our cluster (2 pools, each 4092 PGs, in 20 OSDs on 4 nodes, 3
> MONs) starts deep-scrubbing, it's impossible to work with the VMs.
> Currently, the deep-scrubs happen to start on Monday, which is unfortunate. I 
> already plan to start the next deep-scrub on
Saturday,
> so it has no impact on our work days. But if I imagine we had a large 
> multi-datacenter setup, such performance breaks are not
reasonable. So
> I'm wondering how do you guys manage that?
> 
> What I've tried so far:
> 
> ceph tell osd.* injectargs '--osd_scrub_sleep 0.1'
> ceph tell osd.* injectargs '--osd_disk_thread_ioprio_priority 7'
> ceph tell osd.* injectargs '--osd_disk_thread_ioprio_class idle'
> ceph tell osd.* injectargs '--osd_scrub_begin_hour 0'
> ceph tell osd.* injectargs '--osd_scrub_end_hour 7'
> 
> And I also added these options to the ceph.conf.
> To be able to work again, I had to set the nodeep-scrub option and unset it 
> when I left the office. Today, I see the cluster deep-
> scrubbing again, but only one PG at a time, it seems that the default for 
> osd_max_scrubs is working now and I don't see major
> impacts yet.
> 
> But is there something else I can do to reduce the performance impact?

If you are using Jewel, the scrubbing is now done in the client IO thread, so 
those disk thread options won't do anything. Instead
there is a new priority setting, which seems to work for me, along with a few 
other settings.

osd_scrub_priority = 1
osd_scrub_sleep = .1
osd_scrub_chunk_min = 1
osd_scrub_chunk_max = 5
osd_scrub_load_threshold = 5

Also enabling the weighted priority queue can assist the new priority options

osd_op_queue = wpq
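
A hedged sketch of how these might be applied: the scrub tunables can be
injected at runtime (the same mechanism used above for the other options),
while osd_op_queue itself is normally only read when the OSD starts, so it
belongs in ceph.conf followed by a rolling restart. The osd.0 below is just
an illustrative example.

# inject the scrub tunables into all running OSDs
ceph tell osd.* injectargs '--osd_scrub_priority 1 --osd_scrub_sleep 0.1 --osd_scrub_chunk_min 1 --osd_scrub_chunk_max 5 --osd_scrub_load_threshold 5'

# verify on one OSD via its admin socket (run on the host that carries osd.0)
ceph daemon osd.0 config get osd_scrub_sleep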


> I just found [1] and will have a look into it.
> 
> [1] http://prob6.com/en/ceph-pg-deep-scrub-cron/
> 
> Thanks!
> Eugen
> 
> --
> Eugen Block voice   : +49-40-559 51 75
> NDE Netzdesign und -entwicklung AG  fax : +49-40-559 51 77
> Postfach 61 03 15
> D-22423 Hamburg e-mail  : ebl...@nde.ag
> 
>  Vorsitzende des Aufsichtsrates: Angelika Mozdzen
>Sitz und Registergericht: Hamburg, HRB 90934
>Vorstand: Jens-U. Mozdzen
> USt-IdNr. DE 814 013 983
> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com