Re: [ceph-users] Deep scrub versus osd scrub load threshold

Christian Balzer Mon, 23 Jun 2014 23:11:34 -0700


Hello,


On Mon, 23 Jun 2014 21:50:50 -0700 David Zafman wrote:

> 
> By default osd_scrub_max_interval and osd_deep_scrub_interval are 1 week
> (604800 seconds, 60*60*24*7) and osd_scrub_min_interval is 1 day (86400
> seconds, 60*60*24). As long as osd_scrub_max_interval <=
> osd_deep_scrub_interval, the load won’t impact when deep scrub
> occurs. I suggest that osd_scrub_min_interval <=
> osd_scrub_max_interval <= osd_deep_scrub_interval.
> 
> I’d like to know how you have those 3 values set, so I can confirm that
> this explains the issue.
> 
They are and were unsurprisingly set to the default values.
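
For reference, the ordering David suggests can be checked mechanically.
The following is only a sketch: the helper name intervals_ordered is made
up for illustration and is not part of Ceph, and the values are the
defaults quoted above.

```python
# Default values as quoted above, in seconds.
DAY = 60 * 60 * 24
WEEK = DAY * 7

defaults = {
    "osd_scrub_min_interval": DAY,
    "osd_scrub_max_interval": WEEK,
    "osd_deep_scrub_interval": WEEK,
}

def intervals_ordered(conf):
    """Hypothetical helper: True if min <= max <= deep holds."""
    return (conf["osd_scrub_min_interval"]
            <= conf["osd_scrub_max_interval"]
            <= conf["osd_deep_scrub_interval"])

print(intervals_ordered(defaults))
```

With the defaults the suggested ordering already holds, since the max and
deep scrub intervals are equal.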

Now to provide some more information: shortly after the inception of this
cluster I initiated a deep scrub on all OSDs at 00:30 on a Sunday
morning (the things we do for Ceph; a scheduler with a variety of rules
would be nice, but I digress).
This took until 05:30 despite the cluster being idle and with close to no
data in it. In retrospect it seems clear to me that this was already
influenced by the load threshold (a scrub I initiated with the new
threshold value of 1.5 finished in just 30 minutes last night).
Consequently all the normal scrubs happened in that same time frame, up to
and including the normal scrub this weekend on the 21st.
The deep scrub on the 22nd clearly ran into the load threshold.

So if I understand you correctly, setting osd_scrub_max_interval to 6 days
should make deep scrubs ignore the load threshold, as per the documentation?
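
To make the question concrete, here is a rough model of how I read the
documented behaviour: no scrub before osd_scrub_min_interval, scrubs only
under low load between min and max, and scrubs irrespective of load once
osd_scrub_max_interval has passed. This is an assumption about the
intended semantics, not Ceph's actual scheduling code, and the function
name may_scrub is hypothetical.

```python
DAY = 60 * 60 * 24

def may_scrub(elapsed, load_avg, min_interval=DAY, max_interval=6 * DAY,
              load_threshold=0.5):
    """Simplified model (assumption, not Ceph code) of the scrub gate.

    elapsed: seconds since the last scrub.
    """
    if elapsed < min_interval:
        return False                  # too soon, regardless of load
    if elapsed >= max_interval:
        return True                   # overdue: load is ignored
    return load_avg < load_threshold  # in between: only when load is low

print(6 * DAY)                           # 518400
print(may_scrub(2 * DAY, load_avg=0.8))  # False
print(may_scrub(6 * DAY, load_avg=0.8))  # True
```

If this model is right, a load average of 0.8 with the 0.5 threshold only
delays a scrub until the 6 day mark.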

Regards,

Christian

> 
> David Zafman
> Senior Developer
> http://www.inktank.com
> http://www.redhat.com
> 
> On Jun 23, 2014, at 7:01 PM, Christian Balzer <[email protected]> wrote:
> 
> > 
> > Hello,
> > 
> > On Mon, 23 Jun 2014 14:20:37 -0400 Gregory Farnum wrote:
> > 
> >> Looks like it's a doc error (at least on master), but it might have
> >> changed over time. If you're running Dumpling we should change the
> >> docs.
> > 
> > Nope, I'm running 0.80.1 currently.
> > 
> > Christian
> > 
> >> -Greg
> >> Software Engineer #42 @ http://inktank.com | http://ceph.com
> >> 
> >> 
> >> On Sun, Jun 22, 2014 at 10:18 PM, Christian Balzer <[email protected]>
> >> wrote:
> >>> 
> >>> Hello,
> >>> 
> >>> This weekend I noticed that the deep scrubbing took a lot longer than
> >>> usual (long periods without a scrub running/finishing), even though
> >>> the cluster wasn't all that busy.
> >>> It was however busier than in the past and the load average was above
> >>> 0.5 frequently.
> >>> 
> >>> Now according to the documentation "osd scrub load threshold" is
> >>> ignored when it comes to deep scrubs.
> >>> 
> >>> However after setting it to 1.5 and restarting the OSDs the
> >>> floodgates opened and all those deep scrubs are now running at full
> >>> speed.
> >>> 
> >>> Documentation error or did I "unstuck" something by the OSD restart?
> >>> 
> >>> Regards,
> >>> 
> >>> Christian
> >>> --
> >>> Christian Balzer        Network/Systems Engineer
> >>> [email protected]           Global OnLine Japan/Fusion Communications
> >>> http://www.gol.com/
> >>> _______________________________________________
> >>> ceph-users mailing list
> >>> [email protected]
> >>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >> 
> 


-- 
Christian Balzer        Network/Systems Engineer                
[email protected]           Global OnLine Japan/Fusion Communications
http://www.gol.com/
