On Wed, 20 Aug 2014, Guang Yang wrote:
> Thanks Greg.
> On Aug 20, 2014, at 6:09 AM, Gregory Farnum <[email protected]> wrote:
>
> > On Mon, Aug 18, 2014 at 11:30 PM, Guang Yang <[email protected]> wrote:
> >> Hi ceph-devel,
> >> David (cc'ed) reported a bug (http://tracker.ceph.com/issues/9128) which
> >> we came across in our test cluster during failure testing. To
> >> reproduce it, leave one OSD daemon down and in for a day while
> >> continuing to send write traffic. When the OSD daemon was started
> >> again, it hit the suicide timeout and killed itself.
> >>
> >> After some analysis (details in the bug), David found that the op thread
> >> was busy searching for missing objects, and as the volume to search
> >> grows, the thread is expected to stay busy for correspondingly longer.
> >> Please refer to the bug for detailed logs.
> >
> > Can you talk a little more about what's going on here? At a quick
> > naive glance, I'm not seeing why leaving an OSD down and in should
> > require work based on the amount of write traffic. Perhaps if the rest
> > of the cluster was changing mappings??
> We increased the down-to-out interval from 5 minutes to 2 days to
> avoid migrating data back and forth, which could increase latency; the
> plan is to mark OSDs out manually. To that end, we are testing
> some boundary cases, such as leaving an OSD down and in for about 1 day.
> However, when we try to bring it up again, it always fails because it
> hits the suicide timeout.
Looking at the log snippet, I see the PG had the log range
5481'28667,5646'34066
which is ~5400 log events. The default max is 10k. search_for_missing is
basically going to iterate over this list and check whether each object is
present locally.
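
For reference, the event count can be read off the log range by subtracting
the version parts of the two endpoints. The following is only a sketch: it
assumes each endpoint is printed as epoch'version and that the version
advances by one per log entry. The helper names version_of and
log_entries_between are made up for illustration and are not Ceph functions.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper (not a Ceph API): return the version part of an
// "epoch'version" string such as "5481'28667".
std::uint64_t version_of(const std::string& ev) {
  const auto pos = ev.find('\'');
  assert(pos != std::string::npos);  // input must contain the separator
  return std::stoull(ev.substr(pos + 1));
}

// Hypothetical helper: number of log entries between tail and head,
// assuming the version grows by exactly one per entry.
std::uint64_t log_entries_between(const std::string& tail,
                                  const std::string& head) {
  return version_of(head) - version_of(tail);
}
```

Under those assumptions, log_entries_between("5481'28667", "5646'34066")
gives 5399 entries, well under the 10k default max.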
If that's slow enough to trigger a suicide (which it seems to be), the
fix is simple: as Greg says, we just need to make it probe the internal
heartbeat code to indicate progress. In most contexts this is done by
passing a ThreadPool::TPHandle &handle into each method and then
calling handle.reset_tp_timeout() on each iteration. The same needs to be
done for search_for_missing...
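
Roughly, the pattern looks like the sketch below. This is not the actual
Ceph code: FakeTPHandle is a made-up stand-in for ThreadPool::TPHandle that
only mimics reset_tp_timeout(), and search_for_missing_sketch is a heavily
simplified loop over object names rather than the real PG log structures.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for ThreadPool::TPHandle; NOT the real Ceph class.
class FakeTPHandle {
public:
  using Clock = std::chrono::steady_clock;

  explicit FakeTPHandle(std::chrono::seconds grace)
    : grace_(grace), deadline_(Clock::now() + grace) {}

  // Push the deadline out again; the worker calls this to signal progress.
  void reset_tp_timeout() {
    deadline_ = Clock::now() + grace_;
    ++resets_;
  }

  // True once the worker has gone longer than the grace period without
  // calling reset_tp_timeout().
  bool timed_out() const { return Clock::now() > deadline_; }

  std::size_t resets() const { return resets_; }

private:
  std::chrono::seconds grace_;
  Clock::time_point deadline_;
  std::size_t resets_ = 0;
};

// Simplified search: walk the log entries and collect the objects that are
// not present locally, resetting the timeout on every iteration.
std::vector<std::string> search_for_missing_sketch(
    const std::vector<std::string>& log_entries,
    const std::unordered_set<std::string>& local_objects,
    FakeTPHandle& handle) {
  std::vector<std::string> missing;
  for (const auto& oid : log_entries) {
    handle.reset_tp_timeout();  // indicate progress before each lookup
    if (local_objects.count(oid) == 0)
      missing.push_back(oid);
  }
  return missing;
}
```

The point is that the timeout then bounds the time spent on a single
entry rather than on the whole list, so a long log no longer trips it
as long as each lookup stays fast.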
sage