+ceph-users. Does anybody have a similar experience with scrubbing / deep-scrubbing?
Thanks,
Guang

On Jan 29, 2014, at 10:35 AM, Guang <[email protected]> wrote:

> Glad to see there is some discussion around scrubbing / deep-scrubbing.
>
> We are experiencing the same thing: scrubbing can affect latency quite a bit. So
> far I have found two slow patterns (via dump_historic_ops): 1) waiting to be
> dispatched, and 2) waiting in the op working queue to be fetched by an available
> op thread. For the first slow pattern, it looks like there is a lock involved
> (the dispatcher stops working for 2 seconds and then resumes, and the same
> happens for the scrubber thread); that needs further investigation. For the
> second slow pattern, scrubbing brings in extra ops (for the scrub checks), which
> increases the op threads' workload (client ops have a lower priority). I think
> that could be improved by increasing the number of op threads; I will confirm
> this analysis by adding more op threads and turning on scrubbing on a per-OSD
> basis.
>
> Does the above observation and analysis make sense?
>
> Thanks,
> Guang
>
> On Jan 29, 2014, at 2:13 AM, Filippos Giannakos <[email protected]> wrote:
>
>> On Mon, Jan 27, 2014 at 10:45:48AM -0800, Sage Weil wrote:
>>> There is also
>>>
>>> ceph osd set noscrub
>>>
>>> and then later
>>>
>>> ceph osd unset noscrub
>>>
>>> I forget whether this pauses an in-progress PG scrub or just makes it stop
>>> when it gets to the next PG boundary.
>>>
>>> sage
>>
>> I bumped into those settings but I couldn't find any documentation about them.
>> When I first tried them, they didn't do anything immediately, so I thought
>> they weren't the answer. After your mention, I tried them again, and after a
>> while the deep-scrubbing stopped. So I'm guessing they stop scrubbing at the
>> next PG boundary.
>>
>> I see from this thread and earlier ones that some people think it is a spindle
>> issue. I'm not sure it is just that. Reproducing it on an idle cluster that
>> can do more than 250 MiB/s, with a single request pausing for 4-5 seconds,
>> sounds like an issue by itself. Maybe there is too much locking, or not enough
>> priority given to the actual I/O? Plus, the idea of throttling deep scrubbing
>> based on iops sounds appealing.
>>
>> Kind Regards,
>> --
>> Filippos
>> <[email protected]>
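
(For reference, a rough sketch of the commands behind the analysis above. The
OSD ids and numeric values are placeholders rather than recommendations, and
the option names are the ones from this era of Ceph, so check your version's
defaults before applying anything.)

    # Inspect the slowest recent ops on one OSD via the admin socket and see
    # which stage (dispatch vs. waiting for an op thread) is taking the time.
    ceph daemon osd.0 dump_historic_ops

    # Example only: raise the op thread count at runtime to test whether the
    # "waiting in the op queue" pattern improves (osd op threads defaulted to 2).
    ceph tell osd.* injectargs '--osd-op-threads 4'

    # Cluster-wide: stop scheduling new scrubs / deep-scrubs; PGs already being
    # scrubbed finish first.
    ceph osd set noscrub
    ceph osd set nodeep-scrub
    # ...and re-enable later:
    ceph osd unset noscrub
    ceph osd unset nodeep-scrub

    # Per-OSD example: keep at most one concurrent scrub on a single OSD.
    ceph tell osd.3 injectargs '--osd-max-scrubs 1'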
