Re: [ceph-users] stalls caused by scrub on jewel
On Tue, 6 Dec 2016, Dan van der Ster wrote:
> Hi Sage,
>
> Could you please clarify: do we need to set nodeep-scrub also, or does
> this somehow only affect the (shallow) scrub?
>
> (Note that deep scrubs will start when the deep_scrub_interval has
> passed, even with noscrub set.)

Hmm, I thought that 'noscrub' would also stop deep scrubs, but I just
looked at the code and I was wrong. So you should set nodeep-scrub too!

sage

> Cheers, Dan
>
> On Tue, Nov 15, 2016 at 11:35 PM, Sage Weil wrote:
> > Hi everyone,
> >
> > There was a regression in jewel that can trigger long OSD stalls during
> > scrub. How long the stalls are depends on how many objects are in your
> > PGs, how fast your storage device is, and what is cached, but in at least
> > one case they were long enough that the OSD internal heartbeat check
> > failed and it committed suicide (120 seconds).
> >
> > The workaround for now is to simply
> >
> >     ceph osd set noscrub
> >
> > as the bug is only triggered by scrub. A fix is being tested and will be
> > available shortly.
> >
> > If you've seen any kind of weird latencies or slow requests on jewel, I
> > suggest setting noscrub and seeing if they go away!
> >
> > The tracker bug is
> >
> >     http://tracker.ceph.com/issues/17859
> >
> > Big thanks to Yoann Moulin for helping track this down!
> >
> > sage

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
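Putting Sage's two answers together, the full workaround amounts to setting both flags. A minimal sketch (assumes a working admin keyring; both commands are standard `ceph` CLI):

```shell
# Disable both shallow and deep scrubs cluster-wide.
ceph osd set noscrub
ceph osd set nodeep-scrub

# Verify that both flags are now set on the cluster.
ceph osd dump | grep flags

# Once running packages with the fix, re-enable scrubbing:
ceph osd unset noscrub
ceph osd unset nodeep-scrub
```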
Re: [ceph-users] stalls caused by scrub on jewel
Hi Sage,

Could you please clarify: do we need to set nodeep-scrub also, or does
this somehow only affect the (shallow) scrub?

(Note that deep scrubs will start when the deep_scrub_interval has
passed, even with noscrub set.)

Cheers, Dan

On Tue, Nov 15, 2016 at 11:35 PM, Sage Weil wrote:
> Hi everyone,
>
> There was a regression in jewel that can trigger long OSD stalls during
> scrub. How long the stalls are depends on how many objects are in your
> PGs, how fast your storage device is, and what is cached, but in at least
> one case they were long enough that the OSD internal heartbeat check
> failed and it committed suicide (120 seconds).
>
> The workaround for now is to simply
>
>     ceph osd set noscrub
>
> as the bug is only triggered by scrub. A fix is being tested and will be
> available shortly.
>
> If you've seen any kind of weird latencies or slow requests on jewel, I
> suggest setting noscrub and seeing if they go away!
>
> The tracker bug is
>
>     http://tracker.ceph.com/issues/17859
>
> Big thanks to Yoann Moulin for helping track this down!
>
> sage
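Dan's parenthetical refers to the OSD scrub scheduling options, which control when scrubs are started regardless of the noscrub flag. As an illustrative ceph.conf fragment (option names are the standard OSD settings; the values shown are approximately the Jewel defaults, in seconds):

```ini
[osd]
# Shallow scrub scheduling bounds (default: once per day up to once per week).
osd scrub min interval = 86400
osd scrub max interval = 604800
# Deep scrub at most once per interval (default: 7 days).
osd deep scrub interval = 604800
```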
Re: [ceph-users] stalls caused by scrub on jewel
> On Dec 2, 2016, at 10:48, Sage Weil wrote:
>
> On Fri, 2 Dec 2016, Dan Jakubiec wrote:
>> For what it's worth... this sounds like the condition we hit when we
>> re-enabled scrub on our 16 OSDs (after 6 to 8 weeks of noscrub). They
>> flapped for about 30 minutes as most of the OSDs randomly hit suicide
>> timeouts here and there.
>>
>> This settled down after about an hour and the OSDs stopped dying. We
>> have since left scrub enabled for about 4 days and have only seen three
>> small spurts of OSD flapping since then (which quickly resolved
>> themselves).
>
> Yeah. I think what's happening is that with a cold cache it is slow
> enough to suicide, but with a warm cache it manages to complete (although
> I bet it's still stalling other client IO for perhaps multiple seconds).
> I would leave noscrub set for now.

Ah... thanks for the suggestion! We are indeed working through some jerky
performance issues. Perhaps this is a layer of that onion, thank you.

-- Dan

> sage
>
>> -- Dan
>>
>>> On Dec 1, 2016, at 14:38, Frédéric Nass wrote:
>>>
>>> Hi Yoann,
>>>
>>> Thank you for your input. I was just told by RH support that it's gonna
>>> make it to RHCS 2.0 (10.2.3). Thank you guys for the fix!
>>>
>>> We thought about increasing the number of PGs just after changing the
>>> merge/split threshold values, but this would have led to a _lot_ of data
>>> movement (1.2 billion XFS files) over weeks, without any possibility to
>>> scrub / deep-scrub to ensure data consistency. Still, as soon as we get
>>> the fix, we will increase the number of PGs.
>>>
>>> Regards,
>>>
>>> Frederic.
>>>
>>>> On Dec 1, 2016, at 16:47, Yoann Moulin wrote:
>>>>
>>>> Hello,
>>>>
>>>>> We're impacted by this bug (case 01725311). Our cluster is running
>>>>> RHCS 2.0 and is no longer able to scrub or deep-scrub.
>>>>>
>>>>> [1] http://tracker.ceph.com/issues/17859
>>>>> [2] https://bugzilla.redhat.com/show_bug.cgi?id=1394007
>>>>> [3] https://github.com/ceph/ceph/pull/11898
>>>>>
>>>>> I'm worried we'll have to live with a cluster that can't
>>>>> scrub/deep-scrub until March 2017 (ETA for RHCS 2.2 running Jewel
>>>>> 10.2.4).
>>>>>
>>>>> Can we have this fix any sooner?
>>>>
>>>> As far as I know, the bug appears if you have big PGs; a workaround
>>>> could be to increase the pg_num of the pool that has the biggest PGs.
>>>>
>>>> --
>>>> Yoann Moulin
>>>> EPFL IC-IT
Re: [ceph-users] stalls caused by scrub on jewel
On Fri, 2 Dec 2016, Dan Jakubiec wrote:
> For what it's worth... this sounds like the condition we hit when we
> re-enabled scrub on our 16 OSDs (after 6 to 8 weeks of noscrub). They
> flapped for about 30 minutes as most of the OSDs randomly hit suicide
> timeouts here and there.
>
> This settled down after about an hour and the OSDs stopped dying. We
> have since left scrub enabled for about 4 days and have only seen three
> small spurts of OSD flapping since then (which quickly resolved
> themselves).

Yeah. I think what's happening is that with a cold cache it is slow
enough to suicide, but with a warm cache it manages to complete (although
I bet it's still stalling other client IO for perhaps multiple seconds).
I would leave noscrub set for now.

sage

> -- Dan
>
>> On Dec 1, 2016, at 14:38, Frédéric Nass wrote:
>>
>> Hi Yoann,
>>
>> Thank you for your input. I was just told by RH support that it's gonna
>> make it to RHCS 2.0 (10.2.3). Thank you guys for the fix!
>>
>> We thought about increasing the number of PGs just after changing the
>> merge/split threshold values, but this would have led to a _lot_ of data
>> movement (1.2 billion XFS files) over weeks, without any possibility to
>> scrub / deep-scrub to ensure data consistency. Still, as soon as we get
>> the fix, we will increase the number of PGs.
>>
>> Regards,
>>
>> Frederic.
>>
>>> On Dec 1, 2016, at 16:47, Yoann Moulin wrote:
>>>
>>> Hello,
>>>
>>>> We're impacted by this bug (case 01725311). Our cluster is running
>>>> RHCS 2.0 and is no longer able to scrub or deep-scrub.
>>>>
>>>> [1] http://tracker.ceph.com/issues/17859
>>>> [2] https://bugzilla.redhat.com/show_bug.cgi?id=1394007
>>>> [3] https://github.com/ceph/ceph/pull/11898
>>>>
>>>> I'm worried we'll have to live with a cluster that can't
>>>> scrub/deep-scrub until March 2017 (ETA for RHCS 2.2 running Jewel
>>>> 10.2.4).
>>>>
>>>> Can we have this fix any sooner?
>>>
>>> As far as I know, the bug appears if you have big PGs; a workaround
>>> could be to increase the pg_num of the pool that has the biggest PGs.
>>>
>>> --
>>> Yoann Moulin
>>> EPFL IC-IT
Re: [ceph-users] stalls caused by scrub on jewel
For what it's worth... this sounds like the condition we hit when we
re-enabled scrub on our 16 OSDs (after 6 to 8 weeks of noscrub). They
flapped for about 30 minutes as most of the OSDs randomly hit suicide
timeouts here and there.

This settled down after about an hour and the OSDs stopped dying. We have
since left scrub enabled for about 4 days and have only seen three small
spurts of OSD flapping since then (which quickly resolved themselves).

-- Dan

> On Dec 1, 2016, at 14:38, Frédéric Nass wrote:
>
> Hi Yoann,
>
> Thank you for your input. I was just told by RH support that it's gonna
> make it to RHCS 2.0 (10.2.3). Thank you guys for the fix!
>
> We thought about increasing the number of PGs just after changing the
> merge/split threshold values, but this would have led to a _lot_ of data
> movement (1.2 billion XFS files) over weeks, without any possibility to
> scrub / deep-scrub to ensure data consistency. Still, as soon as we get
> the fix, we will increase the number of PGs.
>
> Regards,
>
> Frederic.
>
>> On Dec 1, 2016, at 16:47, Yoann Moulin wrote:
>>
>> Hello,
>>
>>> We're impacted by this bug (case 01725311). Our cluster is running RHCS
>>> 2.0 and is no longer able to scrub or deep-scrub.
>>>
>>> [1] http://tracker.ceph.com/issues/17859
>>> [2] https://bugzilla.redhat.com/show_bug.cgi?id=1394007
>>> [3] https://github.com/ceph/ceph/pull/11898
>>>
>>> I'm worried we'll have to live with a cluster that can't
>>> scrub/deep-scrub until March 2017 (ETA for RHCS 2.2 running Jewel
>>> 10.2.4).
>>>
>>> Can we have this fix any sooner?
>>
>> As far as I know, the bug appears if you have big PGs; a workaround
>> could be to increase the pg_num of the pool that has the biggest PGs.
>>
>> --
>> Yoann Moulin
>> EPFL IC-IT
Re: [ceph-users] stalls caused by scrub on jewel
Hi Yoann,

Thank you for your input. I was just told by RH support that it's gonna
make it to RHCS 2.0 (10.2.3). Thank you guys for the fix!

We thought about increasing the number of PGs just after changing the
merge/split threshold values, but this would have led to a _lot_ of data
movement (1.2 billion XFS files) over weeks, without any possibility to
scrub / deep-scrub to ensure data consistency. Still, as soon as we get
the fix, we will increase the number of PGs.

Regards,

Frederic.

> On Dec 1, 2016, at 16:47, Yoann Moulin wrote:
>
> Hello,
>
>> We're impacted by this bug (case 01725311). Our cluster is running RHCS
>> 2.0 and is no longer able to scrub or deep-scrub.
>>
>> [1] http://tracker.ceph.com/issues/17859
>> [2] https://bugzilla.redhat.com/show_bug.cgi?id=1394007
>> [3] https://github.com/ceph/ceph/pull/11898
>>
>> I'm worried we'll have to live with a cluster that can't
>> scrub/deep-scrub until March 2017 (ETA for RHCS 2.2 running Jewel
>> 10.2.4).
>>
>> Can we have this fix any sooner?
>
> As far as I know, the bug appears if you have big PGs; a workaround could
> be to increase the pg_num of the pool that has the biggest PGs.
>
> --
> Yoann Moulin
> EPFL IC-IT
Re: [ceph-users] stalls caused by scrub on jewel
On Thu, Dec 1, 2016 at 7:24 AM, Frédéric Nass
<frederic.n...@univ-lorraine.fr> wrote:
>
> Hi Sage, Sam,
>
> We're impacted by this bug (case 01725311). Our cluster is running RHCS
> 2.0 and is no longer able to scrub or deep-scrub.
>
> [1] http://tracker.ceph.com/issues/17859
> [2] https://bugzilla.redhat.com/show_bug.cgi?id=1394007
> [3] https://github.com/ceph/ceph/pull/11898
>
> I'm worried we'll have to live with a cluster that can't scrub/deep-scrub
> until March 2017 (ETA for RHCS 2.2 running Jewel 10.2.4).
>
> Can we have this fix any sooner?

Since this is merged in master and pending backport to jewel, I don't
think you will have to wait for 2.2 to get it; it should be in a hotfix
much sooner.

> Regards
>
> Frédéric.
>
> On 15/11/2016 at 23:35, Sage Weil wrote:
>> Hi everyone,
>>
>> There was a regression in jewel that can trigger long OSD stalls during
>> scrub. How long the stalls are depends on how many objects are in your
>> PGs, how fast your storage device is, and what is cached, but in at least
>> one case they were long enough that the OSD internal heartbeat check
>> failed and it committed suicide (120 seconds).
>>
>> The workaround for now is to simply
>>
>>     ceph osd set noscrub
>>
>> as the bug is only triggered by scrub. A fix is being tested and will be
>> available shortly.
>>
>> If you've seen any kind of weird latencies or slow requests on jewel, I
>> suggest setting noscrub and seeing if they go away!
>>
>> The tracker bug is
>>
>>     http://tracker.ceph.com/issues/17859
>>
>> Big thanks to Yoann Moulin for helping track this down!
>>
>> sage
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to majord...@vger.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [ceph-users] stalls caused by scrub on jewel
Hello,

> We're impacted by this bug (case 01725311). Our cluster is running RHCS
> 2.0 and is no longer able to scrub or deep-scrub.
>
> [1] http://tracker.ceph.com/issues/17859
> [2] https://bugzilla.redhat.com/show_bug.cgi?id=1394007
> [3] https://github.com/ceph/ceph/pull/11898
>
> I'm worried we'll have to live with a cluster that can't scrub/deep-scrub
> until March 2017 (ETA for RHCS 2.2 running Jewel 10.2.4).
>
> Can we have this fix any sooner?

As far as I know, the bug appears if you have big PGs; a workaround could
be to increase the pg_num of the pool that has the biggest PGs.

--
Yoann Moulin
EPFL IC-IT
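Yoann's suggested workaround can be sketched as the sequence below. This is illustrative, not from the thread: the pool name `data` and the target of 2048 PGs are placeholders, and pg_num increases trigger data movement, so they are normally done in small steps:

```shell
# Inspect pool sizes and current PG counts to find the pools
# with the biggest PGs (most data/objects per PG).
ceph df
ceph osd pool get data pg_num

# Raise the placement-group count for the affected pool.
# pgp_num must follow pg_num before the new PGs start being used
# for data placement.
ceph osd pool set data pg_num 2048
ceph osd pool set data pgp_num 2048
```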
Re: [ceph-users] stalls caused by scrub on jewel
Hi Sage, Sam,

We're impacted by this bug (case 01725311). Our cluster is running RHCS
2.0 and is no longer able to scrub or deep-scrub.

[1] http://tracker.ceph.com/issues/17859
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1394007
[3] https://github.com/ceph/ceph/pull/11898

I'm worried we'll have to live with a cluster that can't scrub/deep-scrub
until March 2017 (ETA for RHCS 2.2 running Jewel 10.2.4).

Can we have this fix any sooner?

Regards

Frédéric.

On 15/11/2016 at 23:35, Sage Weil wrote:
> Hi everyone,
>
> There was a regression in jewel that can trigger long OSD stalls during
> scrub. How long the stalls are depends on how many objects are in your
> PGs, how fast your storage device is, and what is cached, but in at least
> one case they were long enough that the OSD internal heartbeat check
> failed and it committed suicide (120 seconds).
>
> The workaround for now is to simply
>
>     ceph osd set noscrub
>
> as the bug is only triggered by scrub. A fix is being tested and will be
> available shortly.
>
> If you've seen any kind of weird latencies or slow requests on jewel, I
> suggest setting noscrub and seeing if they go away!
>
> The tracker bug is
>
>     http://tracker.ceph.com/issues/17859
>
> Big thanks to Yoann Moulin for helping track this down!
>
> sage
[ceph-users] stalls caused by scrub on jewel
Hi everyone,

There was a regression in jewel that can trigger long OSD stalls during
scrub. How long the stalls are depends on how many objects are in your
PGs, how fast your storage device is, and what is cached, but in at least
one case they were long enough that the OSD internal heartbeat check
failed and it committed suicide (120 seconds).

The workaround for now is to simply

    ceph osd set noscrub

as the bug is only triggered by scrub. A fix is being tested and will be
available shortly.

If you've seen any kind of weird latencies or slow requests on jewel, I
suggest setting noscrub and seeing if they go away!

The tracker bug is

    http://tracker.ceph.com/issues/17859

Big thanks to Yoann Moulin for helping track this down!

sage
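For anyone trying to tell whether they are hitting this, the symptoms above (heartbeat failures, suicide timeouts, slow requests during scrub) can be checked roughly as follows. This is a hedged sketch: the exact log wording and log paths vary by version and setup:

```shell
# Look for suicide-timeout aborts in the OSD logs
# (message wording is approximate and may differ between versions).
grep -l 'suicide timed out' /var/log/ceph/ceph-osd.*.log

# Check whether the cluster is reporting blocked/slow requests.
ceph health detail | grep -i 'slow request'

# Correlate with PGs that are currently scrubbing.
ceph pg dump pgs_brief | grep -i scrub
```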