Re: [ceph-users] stalls caused by scrub on jewel

2016-12-06 Thread Sage Weil
On Tue, 6 Dec 2016, Dan van der Ster wrote: > Hi Sage, > > Could you please clarify: do we need to set nodeep-scrub also, or does > this somehow only affect the (shallow) scrub? > > (Note that deep scrubs will start when the deep_scrub_interval has > passed, even with noscrub set). Hmm, I

Re: [ceph-users] stalls caused by scrub on jewel

2016-12-02 Thread Dan Jakubiec
> On Dec 2, 2016, at 10:48, Sage Weil wrote: > > On Fri, 2 Dec 2016, Dan Jakubiec wrote: >> For what it's worth... this sounds like the condition we hit when we >> re-enabled scrub on our 16 OSDs (after 6 to 8 weeks of noscrub). They >> flapped for about 30 minutes as most of

Re: [ceph-users] stalls caused by scrub on jewel

2016-12-02 Thread Sage Weil
On Fri, 2 Dec 2016, Dan Jakubiec wrote: > For what it's worth... this sounds like the condition we hit when we > re-enabled scrub on our 16 OSDs (after 6 to 8 weeks of noscrub). They > flapped for about 30 minutes as most of the OSDs randomly hit suicide > timeouts here and there. > > This settled

Re: [ceph-users] stalls caused by scrub on jewel

2016-12-01 Thread Frédéric Nass
Hi Yoann, Thank you for your input. I was just told by RH support that it’s gonna make it to RHCS 2.0 (10.2.3). Thank you guys for the fix! We thought about increasing the number of PGs just after changing the merge/split threshold values but this would have led to a _lot_ of data movements

Re: [ceph-users] stalls caused by scrub on jewel

2016-12-01 Thread Vasu Kulkarni
On Thu, Dec 1, 2016 at 7:24 AM, Frédéric Nass < frederic.n...@univ-lorraine.fr> wrote: > > Hi Sage, Sam, > > We're impacted by this bug (case 01725311). Our cluster is running RHCS > 2.0 and is no longer able to scrub or deep-scrub. > > [1] http://tracker.ceph.com/issues/17859 > [2]

Re: [ceph-users] stalls caused by scrub on jewel

2016-12-01 Thread Yoann Moulin
Hello, > We're impacted by this bug (case 01725311). Our cluster is running RHCS 2.0 > and is no longer able to scrub or deep-scrub. > > [1] http://tracker.ceph.com/issues/17859 > [2] https://bugzilla.redhat.com/show_bug.cgi?id=1394007 > [3] https://github.com/ceph/ceph/pull/11898 > >

Re: [ceph-users] stalls caused by scrub on jewel

2016-12-01 Thread Frédéric Nass
Hi Sage, Sam, We're impacted by this bug (case 01725311). Our cluster is running RHCS 2.0 and is no longer able to scrub or deep-scrub. [1] http://tracker.ceph.com/issues/17859 [2] https://bugzilla.redhat.com/show_bug.cgi?id=1394007 [3] https://github.com/ceph/ceph/pull/11898 I'm