I'm not sure, but your logs did show that you had >16 recovery ops in flight, so it's worth a try. If it doesn't help, collect the same set of logs and I'll look again. Also, there are a few other patches between 61.7 and the current cuttlefish branch which may help.
-Sam
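[Editor's note: for readers following along, a ceph.conf fragment like the one below is one way to cap recovery concurrency per OSD; the option names match the `osd_recovery_max_active` setting discussed in this thread, but the values shown are illustrative examples, not recommendations.]

```ini
[osd]
; Cap concurrent recovery operations per OSD. This is the setting the
; backported patches are meant to enforce; the value below is only an
; example for testing, not a tuned recommendation.
osd recovery max active = 1
; Throttling concurrent backfills per OSD may also help (example value).
osd max backfills = 1
```

The same options can usually be changed on a running daemon with injectargs (exact syntax may vary by release), e.g. `ceph tell osd.52 injectargs '--osd_recovery_max_active 1'`.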
On Tue, Aug 13, 2013 at 2:03 PM, Stefan Priebe - Profihost AG <[email protected]> wrote:
>
> On 13.08.2013 at 22:43, Samuel Just <[email protected]> wrote:
>
>> I just backported a couple of patches from next to fix a bug where we
>> weren't respecting the osd_recovery_max_active config in some cases
>> (1ea6b56170fc9e223e7c30635db02fa2ad8f4b4e). You can either try the
>> current cuttlefish branch or wait for a 61.8 release.
>
> Thanks! Are you sure that this is the issue? I don't believe it, but I'll
> give it a try. I already tested a branch from Sage where he fixed a race
> regarding max active some weeks ago. Active recovery was capped at 1, but
> the issue didn't go away.
>
> Stefan
>
>> -Sam
>>
>> On Mon, Aug 12, 2013 at 10:34 PM, Samuel Just <[email protected]> wrote:
>>> I got swamped today. I should be able to look tomorrow. Sorry!
>>> -Sam
>>>
>>> On Mon, Aug 12, 2013 at 9:39 PM, Stefan Priebe - Profihost AG
>>> <[email protected]> wrote:
>>>> Did you take a look?
>>>>
>>>> Stefan
>>>>
>>>> On 11.08.2013 at 05:50, Samuel Just <[email protected]> wrote:
>>>>
>>>>> Great! I'll take a look on Monday.
>>>>> -Sam
>>>>>
>>>>> On Sat, Aug 10, 2013 at 12:08 PM, Stefan Priebe <[email protected]>
>>>>> wrote:
>>>>>> Hi Samuel,
>>>>>>
>>>>>> On 09.08.2013 23:44, Samuel Just wrote:
>>>>>>
>>>>>>> I think Stefan's problem is probably distinct from Mike's.
>>>>>>>
>>>>>>> Stefan: Can you reproduce the problem with
>>>>>>>
>>>>>>> debug osd = 20
>>>>>>> debug filestore = 20
>>>>>>> debug ms = 1
>>>>>>> debug optracker = 20
>>>>>>>
>>>>>>> on a few osds (including the restarted osd), and upload those osd logs
>>>>>>> along with the ceph.log from before killing the osd until after the
>>>>>>> cluster becomes clean again?
>>>>>>
>>>>>> done - you'll find the logs in the cephdrop folder:
>>>>>> slow_requests_recovering_cuttlefish
>>>>>>
>>>>>> osd.52 was the one recovering
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>> Greets,
>>>>>> Stefan
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
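[Editor's note: the debug levels Sam requested can also be raised on a running OSD without a restart, which avoids disturbing the failure state being reproduced. A sketch using injectargs follows; the OSD id is the one from this thread, and the exact CLI syntax is assumed for cuttlefish-era releases, so check `ceph tell` help on your version.]

```shell
# Raise debug levels on a running OSD at runtime (osd.52 is the example
# from this thread; adjust the id for your cluster). Values mirror the
# ceph.conf settings requested above.
ceph tell osd.52 injectargs '--debug-osd 20 --debug-filestore 20 --debug-ms 1 --debug-optracker 20'
```

Remember to lower the levels again afterwards, since `debug osd = 20` logging is voluminous enough to fill disks quickly during recovery.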
