On 04.11.18 19:31, Duncan wrote:

Sebastian Ochmann posted on Sun, 04 Nov 2018 14:15:55 +0100 as
excerpted:

Hello,

I have a btrfs filesystem on a single encrypted (LUKS) 10 TB drive
which stopped working correctly.

Kernel 4.18.16 (Arch Linux)

I see upgrading to 4.19 seems to have solved your problem, but this is
more about something I saw in the trace that has me wondering...

[  368.267315]  touch_atime+0xc0/0xe0

Do you have any atime-related mount options set?

That's an interesting point. On some machines I have explicitly set "noatime", but on that particular system I did not, so it was using the default "relatime" option. Since I'm not using mutt or anything else (that I'm aware of) that relies on atimes, I will set noatime there as well.

FWIW, noatime is strongly recommended on btrfs.

Now I'm not a dev, just a btrfs user and list regular, and I don't know
if that function is called and just does nothing when noatime is set,
so you may well already have it set and this is "much ado about
nothing", but the chance that it's relevant, if not for you, perhaps
for others that may read it, begs for this post...

The problem with atime, access time, is that it turns most otherwise
read-only operations into read-and-write operations in order to
update the access time.  And on copy-on-write (COW) based filesystems
such as btrfs, that can be a big problem, because updating that tiny
bit of metadata will trigger a rewrite of the entire metadata block
containing it, which will trigger an update of the metadata for /that/
block in the parent metadata tier... all the way up the metadata tree,
ultimately to its root, the filesystem root and the superblocks, at the
next commit (normally every 30 seconds or less).
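(You can watch this happen on a filesystem mounted with atime updates
enabled; the filename here is just a placeholder:

    $ stat -c 'atime: %x' somefile
    $ cat somefile > /dev/null
    $ stat -c 'atime: %x' somefile

If the second stat shows a newer atime, that plain read just caused a
metadata write.  With relatime it may not change, if the atime was
already newer than the mtime and less than a day old; with noatime it
never does.)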

Not only is that a bunch of otherwise unnecessary work for a bit of
metadata barely anything actually uses, but forcing most read
operations to read-write obviously compounds the risk for all of those
would-be read-only operations when a filesystem already has problems.

Additionally, if your use-case includes regular snapshotting with
atime on, on mostly-read workloads with few writes (other than atime
updates), it may well be that most of the changes captured in a
snapshot are atime updates, making recurring snapshots far larger
than they'd otherwise be.
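(One rough way to gauge how much of a snapshot-to-snapshot delta is
this kind of metadata-only churn, assuming read-only snapshots at the
placeholder paths below, is to measure an incremental send stream with
file data excluded:

    $ btrfs send --no-data -p /mnt/snaps/day1 /mnt/snaps/day2 | wc -c

A surprisingly large number on a mostly-read workload suggests the
atime updates are doing the inflating.)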

Now a few years ago the kernel did change the default to relatime,
which only updates a file's atime if it's older than the file's
mtime/ctime or more than a day old.  That helps quite a bit, and on
traditional filesystems it's arguably a reasonably sane default, but
COW makes atime tracking sufficiently more expensive that setting
noatime is still strongly recommended on btrfs, particularly if
you're doing regular snapshotting.
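(To check what a filesystem is currently mounted with, something like
the following works; the mountpoint is a placeholder:

    $ findmnt -no OPTIONS /mnt/data

or just look the filesystem up in /proc/self/mounts and see whether
noatime, relatime or strictatime appears in the option list.)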

So do consider adding noatime to your mount options if you haven't done
so already.  AFAIK, the only /semi-common/ app that actually uses
atimes these days is mutt (for new-mail detection), and then only for
mbox folders, not maildir, so you should be safe to at least test
turning it off.
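(For example, a quick live test and then a persistent fstab entry
might look like the following; the mountpoint and UUID are of course
placeholders for your own:

    # mount -o remount,noatime /mnt/data

    UUID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx  /mnt/data  btrfs  defaults,noatime  0  0

The remount takes effect immediately, so you can watch for any fallout
before making it permanent in fstab.)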

And YMMV, but if you do use mutt or something else that uses atimes,
I'd go so far as to recommend finding an alternative, replacing either
btrfs (because as I said, relatime is arguably enough on a traditional
non-COW filesystem) or whatever it is that uses atimes, your call,
because IMO it really is that big a deal.

Meanwhile, particularly after seeing that in the trace, if the 4.19
update hadn't already fixed it, I'd have suggested trying a read-only
mount, both as a test and, assuming it worked, as a way to at least
access the data without the lockup.  That result would also have
pointed at the write triggered by the atime update, rather than the
actual read, as the culprit.

It would be nice to have a 1:1 image of the filesystem (or rather the raw block device) for more testing, but unfortunately I don't have another 10 TB drive lying around. :) I didn't really expect the 4.19 upgrade to (apparently) fix the problem right away, so I also couldn't test the mentioned patch, but yeah... If it happens again (which I hope it doesn't), I'll try your suggestion.
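(If I ever do get hold of a big enough spare drive, I'd probably make that 1:1 copy with something like ddrescue, since it keeps a map file and can be interrupted and resumed, which matters at 10 TB; the device and target paths below are placeholders:

    # ddrescue /dev/sdX /mnt/spare/disk.img /mnt/spare/disk.map)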

Actually, a read-only mount test is always a good troubleshooting step
when the trouble is a filesystem that either won't mount normally, or
will, but then locks up when you try to access something.  It's far
less risky than a normal writable mount, and at minimum it provides you
the additional test data of whether it worked or not, plus if it does,
a chance to access the data and make sure your backups are current,
before actually trying to do any repairs.
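(A minimal sketch of such a read-only test, with the device-mapper
name and mountpoint as placeholders:

    # mount -o ro /dev/mapper/bigdisk /mnt/recovery

and, on reasonably current kernels, ro,nologreplay additionally skips
log-tree replay so nothing at all gets written to the device, while
ro,usebackuproot tries older tree roots if the current ones look
damaged:

    # mount -o ro,nologreplay /dev/mapper/bigdisk /mnt/recovery)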
