On Tue, Jul 18, 2017 at 1:19 AM Dan van der Ster <[email protected]> wrote:
> On Fri, Jul 14, 2017 at 10:40 PM, Gregory Farnum <[email protected]> wrote:
> > On Fri, Jul 14, 2017 at 5:41 AM Dan van der Ster <[email protected]> wrote:
> >>
> >> Hi,
> >>
> >> Occasionally we want to change the scrub schedule for a pool or whole
> >> cluster, but we want to do this by injecting new settings without
> >> restarting every daemon.
> >>
> >> I've noticed that in jewel, changes to scrub_min/max_interval and
> >> deep_scrub_interval do not take immediate effect, presumably because
> >> the scrub schedules are calculated in advance for all the PGs on an
> >> OSD.
> >>
> >> Does anyone know how to list that scrub schedule for a given OSD?
> >
> >
> > I'm not aware of any "scrub schedule" as such, just the constraints around
> > when new scrubbing happens. What exactly were you doing previously that
> > isn't working now?
>
> Take this for example:
>
> 2017-07-18 10:03:30.600486 7f02f7a54700 20 osd.1 123582 scrub_random_backoff lost coin flip, randomly backing off
> 2017-07-18 10:03:31.600558 7f02f7a54700 20 osd.1 123582 can_inc_scrubs_pending0 -> 1 (max 1, active 0)
> 2017-07-18 10:03:31.600565 7f02f7a54700 20 osd.1 123582 scrub_time_permit should run between 0 - 24 now 10 = yes
> 2017-07-18 10:03:31.600592 7f02f7a54700 20 osd.1 123582 scrub_load_below_threshold loadavg 0.85 < max 5 = yes
> 2017-07-18 10:03:31.600603 7f02f7a54700 20 osd.1 123582 sched_scrub load_is_low=1
> 2017-07-18 10:03:31.600605 7f02f7a54700 30 osd.1 123582 sched_scrub examine 38.127 at 2017-07-18 10:08:01.148612
> 2017-07-18 10:03:31.600608 7f02f7a54700 10 osd.1 123582 sched_scrub 38.127 scheduled at 2017-07-18 10:08:01.148612 > 2017-07-18 10:03:31.600562
> 2017-07-18 10:03:31.600611 7f02f7a54700 20 osd.1 123582 sched_scrub done
>
> PG 38.127 is the next registered scrub on osd.1.
> AFAICT, "registered" means that there exists a ScrubJob for this PG, with
> a sched_time (time of the last scrub + a random interval) and a deadline
> (time of the last scrub + scrub max interval)
>
> (Question: how many scrubs are registered at a given time on an OSD?
> Just this one that is printed in the tick loop, or several?)
>
> Anyway, I decrease the min and max scrub intervals for that pool,
> hoping to make it scrub right away:
>
> # ceph osd pool set testing-images scrub_min_interval 60
> set pool 38 scrub_min_interval to 60
> # ceph osd pool set testing-images scrub_max_interval 86400
> set pool 38 scrub_max_interval to 86400
>
>
> But the registered ScrubJob doesn't change -- what I called the "scrub
> schedule" doesn't change:
>
> 2017-07-18 10:06:53.622286 7f02f7a54700 20 osd.1 123584 scrub_random_backoff lost coin flip, randomly backing off
> 2017-07-18 10:06:54.622403 7f02f7a54700 20 osd.1 123584 can_inc_scrubs_pending0 -> 1 (max 1, active 0)
> 2017-07-18 10:06:54.622409 7f02f7a54700 20 osd.1 123584 scrub_time_permit should run between 0 - 24 now 10 = yes
> 2017-07-18 10:06:54.622436 7f02f7a54700 20 osd.1 123584 scrub_load_below_threshold loadavg 1.16 < max 5 = yes
> 2017-07-18 10:06:54.622446 7f02f7a54700 20 osd.1 123584 sched_scrub load_is_low=1
> 2017-07-18 10:06:54.622449 7f02f7a54700 30 osd.1 123584 sched_scrub examine 38.127 at 2017-07-18 10:08:01.148612
> 2017-07-18 10:06:54.622452 7f02f7a54700 10 osd.1 123584 sched_scrub 38.127 scheduled at 2017-07-18 10:08:01.148612 > 2017-07-18 10:06:54.622408
> 2017-07-18 10:06:54.622455 7f02f7a54700 20 osd.1 123584 sched_scrub done
>
>
> I'm looking for a way to reset those registered scrubs, so that the
> new intervals can take effect (without restarting OSDs).
>

Unfortunately there's not a good way to manually reschedule scrubbing
that I can see. That would be a good ticket!
It *does* unregister the existing ScrubJob when it starts peering the PG,
and registers a new ScrubJob when the PG goes active. So if you've got a
good way to induce one of those you don't technically need to restart
OSDs. I can't off-hand think of a good way to do that without doing
something that's at least as disruptive as a polite OSD restart though.
-Greg

>
> Cheers, Dan
>
> >
> >>
> >>
> >> And better yet, does anyone know a way to reset that schedule, so that
> >> the OSD generates a new one with the new configuration?
> >>
> >> (I've noticed that by chance setting sortbitwise triggers many scrubs
> >> -- maybe a new peering interval resets the scrub schedules?) Any
> >> non-destructive way to trigger a new peering interval on demand?
> >>
> >> Cheers,
> >>
> >> Dan
> >> _______________________________________________
> >> ceph-users mailing list
> >> [email protected]
> >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
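
For anyone who wants to watch the registered sched_time from the logs quoted in this thread, here is a minimal Python sketch. It assumes the OSD log contains debug-level-10 lines of the form "sched_scrub <pgid> scheduled at <sched_time> > <now>", exactly as in the excerpts above. The function name next_scrubs and the regular expression are illustrative only and are not part of Ceph. Judging from the excerpts, each tick logs only the PG at the head of the queue, so this may not show every registered ScrubJob.

```python
import re
from datetime import datetime

# Timestamp layout used in the OSD log excerpts quoted in this thread.
TS = r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+"
TS_FORMAT = "%Y-%m-%d %H:%M:%S.%f"

# Assumed line shape (taken from the excerpts, not from Ceph documentation):
#   ... osd.1 123582 sched_scrub 38.127 scheduled at <sched_time> > <now>
SCHED_RE = re.compile(
    r"osd\.(?P<osd>\d+) \d+ sched_scrub (?P<pgid>\S+) scheduled at "
    r"(?P<sched>" + TS + r") > (?P<now>" + TS + r")"
)


def next_scrubs(log_lines):
    """Map (osd_id, pgid) -> (sched_time, seconds until sched_time).

    Later lines overwrite earlier ones, so the result reflects the most
    recent tick seen for each PG. Non-matching lines are ignored.
    """
    result = {}
    for line in log_lines:
        match = SCHED_RE.search(line)
        if match is None:
            continue
        sched = datetime.strptime(match.group("sched"), TS_FORMAT)
        now = datetime.strptime(match.group("now"), TS_FORMAT)
        key = (int(match.group("osd")), match.group("pgid"))
        result[key] = (sched, (sched - now).total_seconds())
    return result
```

Feeding it the two "scheduled at" lines quoted above gives the same sched_time (2017-07-18 10:08:01.148612) for PG 38.127 both before and after the pool settings were changed, which is the behaviour Dan describes: the already-registered ScrubJob keeps its old schedule.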
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
