Re: [ceph-users] deep-scrubbing has large impact on performance
Actually this might suggest that caution should be taken before enabling this at the moment: http://tracker.ceph.com/issues/15774
Re: [ceph-users] deep-scrubbing has large impact on performance
Thanks for the tip Robert, much appreciated.
Re: [ceph-users] deep-scrubbing has large impact on performance
If you use wpq, I recommend also setting "osd_op_queue_cut_off = high"; otherwise replication ops are not weighted, which really reduces the benefit of wpq.

Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
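A minimal sketch of how the two queue settings mentioned above might be written down for ceph.conf. The helper name print_wpq_conf is hypothetical, and putting the options in the [osd] section is an assumption. As far as I know both options are read when an OSD starts, so expect to restart the OSDs before they take effect.

```shell
#!/bin/sh
# Hypothetical helper: print a ceph.conf fragment that enables the weighted
# priority queue together with the cut-off setting suggested in this thread.
# It only prints text; it does not touch any file or talk to the cluster.
print_wpq_conf() {
    cat <<'EOF'
[osd]
osd_op_queue = wpq
osd_op_queue_cut_off = high
EOF
}
```

Calling print_wpq_conf writes the fragment to stdout, so it can be reviewed and merged by hand into the existing [osd] section rather than appended blindly.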
Re: [ceph-users] deep-scrubbing has large impact on performance
Thank you!

--
Eugen Block                             voice   : +49-40-559 51 75
NDE Netzdesign und -entwicklung AG      fax     : +49-40-559 51 77
Postfach 61 03 15
D-22423 Hamburg                         e-mail  : ebl...@nde.ag

Chairwoman of the Supervisory Board: Angelika Mozdzen
Registered office and court of registration: Hamburg, HRB 90934
Executive Board: Jens-U. Mozdzen
VAT ID (USt-IdNr.): DE 814 013 983

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] deep-scrubbing has large impact on performance
> -----Original Message-----
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Eugen Block
> Sent: 22 November 2016 10:11
> To: Nick Fisk <n...@fisk.me.uk>
> Cc: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] deep-scrubbing has large impact on performance
>
> Thanks for the very quick answer!
>
> > If you are using Jewel
>
> We are still using Hammer (0.94.7). We wanted to upgrade to Jewel in a
> couple of weeks; would you recommend doing it now?

It's been fairly solid for me, but you might want to wait for the scrubbing hang bug to be fixed before upgrading. I think this might be fixed in the upcoming 10.2.4 release.
Re: [ceph-users] deep-scrubbing has large impact on performance
Thanks for the very quick answer!

> If you are using Jewel

We are still using Hammer (0.94.7). We wanted to upgrade to Jewel in a couple of weeks; would you recommend doing it now?
Re: [ceph-users] deep-scrubbing has large impact on performance
> -----Original Message-----
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Eugen Block
> Sent: 22 November 2016 09:55
> To: ceph-users@lists.ceph.com
> Subject: [ceph-users] deep-scrubbing has large impact on performance
>
> Hi list,
>
> I've been searching the mail archive and the web for some help. I tried the
> things I found, but I can't see any effect. We use Ceph for our OpenStack
> environment.
>
> When our cluster (2 pools, each 4092 PGs, in 20 OSDs on 4 nodes, 3 MONs)
> starts deep-scrubbing, it's impossible to work with the VMs. Currently, the
> deep-scrubs happen to start on Monday, which is unfortunate. I already plan
> to start the next deep-scrub on Saturday, so it has no impact on our work
> days. But if I imagine we had a large multi-datacenter setup, such
> performance drops would not be acceptable. So I'm wondering, how do you
> guys manage that?
>
> What I've tried so far:
>
> ceph tell osd.* injectargs '--osd_scrub_sleep 0.1'
> ceph tell osd.* injectargs '--osd_disk_thread_ioprio_priority 7'
> ceph tell osd.* injectargs '--osd_disk_thread_ioprio_class idle'
> ceph tell osd.* injectargs '--osd_scrub_begin_hour 0'
> ceph tell osd.* injectargs '--osd_scrub_end_hour 7'
>
> I also added these options to ceph.conf.
> To be able to work again, I had to set the nodeep-scrub flag and unset it
> when I left the office. Today I see the cluster deep-scrubbing again, but
> only one PG at a time; it seems the default for osd_max_scrubs is working
> now, and I don't see major impacts yet.
>
> But is there something else I can do to reduce the performance impact?

If you are using Jewel, scrubbing is now done in the client IO thread, so those disk thread options won't do anything. Instead there is a new priority setting, which seems to work for me, along with a few other settings:

osd_scrub_priority = 1
osd_scrub_sleep = .1
osd_scrub_chunk_min = 1
osd_scrub_chunk_max = 5
osd_scrub_load_threshold = 5

Enabling the weighted priority queue can also assist the new priority options:

osd_op_queue = wpq

> I just found [1] and will have a look into it.
>
> [1] http://prob6.com/en/ceph-pg-deep-scrub-cron/
>
> Thanks!
> Eugen
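For trying the scrub settings above on a running cluster before committing them to ceph.conf, here is a rough sketch that pushes them with the same "ceph tell osd.* injectargs" form used in the original post. The function name apply_scrub_throttle and the CEPH_CMD override are hypothetical, added only so the loop can be dry-run. Whether each option really takes effect at runtime on your release is an assumption to verify. osd_op_queue is left out on purpose, since as far as I know it needs an OSD restart.

```shell
#!/bin/sh
# CEPH_CMD is a hypothetical override; set it to "echo" for a dry run.
CEPH_CMD="${CEPH_CMD:-ceph}"

# Inject the scrub-related values quoted in this thread into all OSDs.
# Each input line is "<option> <value>"; one injectargs call per option.
apply_scrub_throttle() {
    while read -r opt val; do
        # </dev/null keeps the command from consuming the option list on stdin
        "$CEPH_CMD" tell 'osd.*' injectargs "--$opt $val" </dev/null
    done <<'EOF'
osd_scrub_priority 1
osd_scrub_sleep 0.1
osd_scrub_chunk_min 1
osd_scrub_chunk_max 5
osd_scrub_load_threshold 5
EOF
}
```

Running it as CEPH_CMD=echo apply_scrub_throttle only prints the five commands, which is a cheap way to check them before pointing the function at a real cluster. Injected values do not survive an OSD restart, so anything that helps still has to go into ceph.conf.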