On Tue, Jul 18, 2017 at 1:19 AM Dan van der Ster <[email protected]> wrote:

> On Fri, Jul 14, 2017 at 10:40 PM, Gregory Farnum <[email protected]>
> wrote:
> > On Fri, Jul 14, 2017 at 5:41 AM Dan van der Ster <[email protected]> wrote:
> >>
> >> Hi,
> >>
> >> Occasionally we want to change the scrub schedule for a pool or whole
> >> cluster, but we want to do this by injecting new settings without
> >> restarting every daemon.
> >>
> >> I've noticed that in jewel, changes to scrub_min/max_interval and
> >> deep_scrub_interval do not take immediate effect, presumably because
> >> the scrub schedules are calculated in advance for all the PGs on an
> >> OSD.
> >>
> >> Does anyone know how to list that scrub schedule for a given OSD?
> >
> >
> > I'm not aware of any "scrub schedule" as such, just the constraints
> > around when new scrubbing happens. What exactly were you doing
> > previously that isn't working now?
>
> Take this for example:
>
> 2017-07-18 10:03:30.600486 7f02f7a54700 20 osd.1 123582
> scrub_random_backoff lost coin flip, randomly backing off
> 2017-07-18 10:03:31.600558 7f02f7a54700 20 osd.1 123582
> can_inc_scrubs_pending0 -> 1 (max 1, active 0)
> 2017-07-18 10:03:31.600565 7f02f7a54700 20 osd.1 123582
> scrub_time_permit should run between 0 - 24 now 10 = yes
> 2017-07-18 10:03:31.600592 7f02f7a54700 20 osd.1 123582
> scrub_load_below_threshold loadavg 0.85 < max 5 = yes
> 2017-07-18 10:03:31.600603 7f02f7a54700 20 osd.1 123582 sched_scrub
> load_is_low=1
> 2017-07-18 10:03:31.600605 7f02f7a54700 30 osd.1 123582 sched_scrub
> examine 38.127 at 2017-07-18 10:08:01.148612
> 2017-07-18 10:03:31.600608 7f02f7a54700 10 osd.1 123582 sched_scrub
> 38.127 scheduled at 2017-07-18 10:08:01.148612 > 2017-07-18
> 10:03:31.600562
> 2017-07-18 10:03:31.600611 7f02f7a54700 20 osd.1 123582 sched_scrub done
>
> PG 38.127 is the next registered scrub on osd.1. AFAICT, "registered"
> means that there exists a ScrubJob for this PG, with a sched_time
> (time of the last scrub + a random interval) and a deadline (time of
> the last scrub + scrub max interval).
>
> (Question: how many scrubs are registered at a given time on an OSD?
> Just this one that is printed in the tick loop, or several?)
>
> Anyway, I decrease the min and max scrub intervals for that pool,
> hoping to make it scrub right away:
>
> # ceph osd pool set testing-images scrub_min_interval 60
> set pool 38 scrub_min_interval to 60
> # ceph osd pool set testing-images scrub_max_interval 86400
> set pool 38 scrub_max_interval to 86400
>
>
> But the registered ScrubJob doesn't change -- what I called the "scrub
> schedule" doesn't change:
>
> 2017-07-18 10:06:53.622286 7f02f7a54700 20 osd.1 123584
> scrub_random_backoff lost coin flip, randomly backing off
> 2017-07-18 10:06:54.622403 7f02f7a54700 20 osd.1 123584
> can_inc_scrubs_pending0 -> 1 (max 1, active 0)
> 2017-07-18 10:06:54.622409 7f02f7a54700 20 osd.1 123584
> scrub_time_permit should run between 0 - 24 now 10 = yes
> 2017-07-18 10:06:54.622436 7f02f7a54700 20 osd.1 123584
> scrub_load_below_threshold loadavg 1.16 < max 5 = yes
> 2017-07-18 10:06:54.622446 7f02f7a54700 20 osd.1 123584 sched_scrub
> load_is_low=1
> 2017-07-18 10:06:54.622449 7f02f7a54700 30 osd.1 123584 sched_scrub
> examine 38.127 at 2017-07-18 10:08:01.148612
> 2017-07-18 10:06:54.622452 7f02f7a54700 10 osd.1 123584 sched_scrub
> 38.127 scheduled at 2017-07-18 10:08:01.148612 > 2017-07-18
> 10:06:54.622408
> 2017-07-18 10:06:54.622455 7f02f7a54700 20 osd.1 123584 sched_scrub done
>
>
> I'm looking for a way to reset those registered scrubs, so that the
> new intervals can take effect (without restarting OSDs).
>
>
Unfortunately, I can't see a good way to manually reschedule scrubbing.
That would be a good ticket!

The OSD *does* unregister the existing ScrubJob when it starts peering the
PG, and it registers a new ScrubJob when the PG goes active. So if you've
got a good way to induce one of those, you don't technically need to
restart OSDs. Off-hand, though, I can't think of a way to do that which
isn't at least as disruptive as a polite OSD restart.
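
To make that unregister/register behavior concrete, here is a toy Python
model. It is not Ceph's code: ToyOSD, on_peering_start, on_active and the
uniform random delay are names and assumptions made up for illustration,
following Dan's description of sched_time (last scrub + a random interval)
and deadline (last scrub + scrub max interval).

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ScrubJob:
    pgid: str
    sched_time: float  # earliest time the scrub may start (seconds)
    deadline: float    # time by which the scrub is due (seconds)


class ToyOSD:
    """Hypothetical stand-in for an OSD's table of registered scrubs."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.jobs = {}

    def on_active(self, pgid, last_scrub, min_interval, max_interval):
        # Assumption: the random interval is uniform in [min, max].
        delay = self.rng.uniform(min_interval, max_interval)
        self.jobs[pgid] = ScrubJob(pgid, last_scrub + delay,
                                   last_scrub + max_interval)

    def on_peering_start(self, pgid):
        # The existing job is dropped when peering starts.
        self.jobs.pop(pgid, None)

    def next_scrub(self):
        # The job with the earliest sched_time, or None if none registered.
        return min(self.jobs.values(), key=lambda j: j.sched_time,
                   default=None)


osd = ToyOSD()
osd.on_active("38.127", last_scrub=0.0,
              min_interval=86400, max_interval=604800)
before = osd.jobs["38.127"]

# New pool settings exist, but nothing recomputes the registered job.
new_min, new_max = 60, 86400
unchanged = osd.jobs["38.127"]

# Only a peering cycle re-registers the job with the new intervals.
osd.on_peering_start("38.127")
osd.on_active("38.127", last_scrub=0.0,
              min_interval=new_min, max_interval=new_max)
after = osd.jobs["38.127"]
```

In this sketch the job keeps its old sched_time and deadline until
on_peering_start and on_active run again, which matches what the logs
above show after the pool settings were changed.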
-Greg



>
> Cheers, Dan
>
> >
> >>
> >>
> >> And better yet, does anyone know a way to reset that schedule, so that
> >> the OSD generates a new one with the new configuration?
> >>
> >> (I've noticed that by chance setting sortbitwise triggers many scrubs
> >> -- maybe a new peering interval resets the scrub schedules?) Any
> >> non-destructive way to trigger a new peering interval on demand?
> >>
> >> Cheers,
> >>
> >> Dan
> >> _______________________________________________
> >> ceph-users mailing list
> >> [email protected]
> >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>