On Tue, May 30, 2017 at 09:12:39AM -0700, Sargun Dhillon wrote:
> We've been running BtrFS for a couple months now in production on
> several clusters. We're running on Canonical's 4.8 kernel, and
> currently, in the process of moving to our own patchset atop vanilla
> 4.10+. I'm glad to say it's been a fairly good experience for us. Bar
> some performance issues, it's been largely smooth sailing.

Yay, thanks for the feedback.

> There has been one class of persistent issues that has been plaguing
> our cluster is deadlocks. We've seen a fair number of issues where
> there are some number of background threads and user threads are in
> the process of performing operations where some are waiting to start a
> transaction, and at least one background thread or user thread is in
> the process of committing a transaction. Unfortunately, these
> situations are ending in deadlocks, where no threads are making
> progress.

In such situations, save the stacks of all processes (/proc/PID/stack). I
don't want to argue about terminology here, so by a deadlock I could also
understand a system that's making progress so slowly that it's effectively
stuck. This could happen if the files are fragmented, so e.g. traversing
extents takes locks and has a lot of work to do before it unlocks. Add some
extent sharing and reference updates, and there are more points where the
threads just wait.

The stacktraces could give an idea of what kind of hang it is.
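
A minimal sketch of collecting those stacks, assuming a Linux /proc layout
and enough privileges to read /proc/PID/stack (usually root). The helper
names and the "btrfs" substring filter are only illustrative, not an
established tool:

```python
import os


def collect_stacks(proc_root="/proc"):
    """Return {pid: stack_text} for every process whose stack file is readable."""
    stacks = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "sys" or "meminfo"
        path = os.path.join(proc_root, entry, "stack")
        try:
            with open(path) as f:
                stacks[int(entry)] = f.read()
        except OSError:
            # The process exited in the meantime, or we lack permission.
            continue
    return stacks


def btrfs_stacks(stacks):
    """Keep only the stacks that mention a btrfs symbol (assumed filter)."""
    return {pid: text for pid, text in stacks.items() if "btrfs" in text}


if __name__ == "__main__":
    for pid, text in sorted(btrfs_stacks(collect_stacks()).items()):
        print(f"== PID {pid} ==")
        print(text)
```

Taking a few snapshots some seconds apart and comparing them should help
tell a thread that is truly stuck from one that is only very slow.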

> We've talked about a couple ideas internally, like adding the ability
> to timeout transactions, abort commits or start_transactions which are
> taking too long, and adding more debugging to get insights into the
> state of the filesystem. Unfortunately, since our usage and knowledge
> of BtrFS is still somewhat nascent, we're unsure of what is the right
> investment.

There's a kernel-wide hung task detection, but I think a similar
mechanism around just the transaction commits would be useful, as a
debugging option.
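
To check whether the kernel-wide detector is active on a given machine,
something like the following could be used. It is only a sketch and assumes
the kernel was built with hung task detection, so that the sysctl file
/proc/sys/kernel/hung_task_timeout_secs exists; the function name is made up
for illustration:

```python
def hung_task_timeout(path="/proc/sys/kernel/hung_task_timeout_secs"):
    """Return the timeout in seconds (0 means checking is disabled),
    or None if the sysctl file is missing or unreadable."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    timeout = hung_task_timeout()
    if timeout is None:
        print("hung task detection not available")
    elif timeout == 0:
        print("hung task detection disabled")
    else:
        print(f"tasks blocked for more than {timeout}s will be reported")
```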

There are a number of ways a transaction can be blocked though, so
we'd need to choose the starting point. Extent-related locks, waiting
for writes, other locks, the internal transactional logic (and possibly
more).

> I'm curious, are other people seeing deadlocks crop up in production
> often? How are you going about debugging them, and are there any good
> pieces of advice on avoiding these for production workloads?

I have seen hangs with kernel 4.9 a while back triggered by a
long-running iozone stress test, but 4.8 was not affected, and 4.10+
worked fine again. I don't know if/which btrfs patches the 'canonical
4.8' kernel has, so this might not be related.

As for deadlocks (double taken lock, lock inversion), I haven't seen
them for a long time. The testing kernels run with lockdep, so we should
be able to see them early. You could try turning lockdep on if the
performance penalty is still acceptable for you. But there are still
cases that lockdep does not cover IIRC, due to the higher-level
semantics of the various btrfs trees and locking of extent buffers.