On 2016-09-19 00:08, Zygo Blaxell wrote:
I wouldn't, but I would still expect to have some tool to deal with
things like orphaned inodes, dentries which are missing inodes, and
other similar cases that don't make the filesystem unusable, but can't
easily be fixed in a sane manner on a live filesystem. The ZFS approach
is valid, but it can't deal with things like orphaned inodes where
there's no reference in the directories any more.
On Thu, Sep 15, 2016 at 01:02:43PM -0600, Chris Murphy wrote:
Right, well I'm vaguely curious why ZFS, as different as it is,
basically takes the position that if the hardware went so batshit that
they can't unwind it on a normal mount, then an fsck probably can't
help either... they still don't have an fsck and don't appear to want one.
ZFS has no automated fsck, but it does have a kind of interactive
debugger that can be used to manually fix things.
ZFS seems to be a lot more robust when it comes to handling bad metadata
(contrast with btrfs-style BUG_ON panics).
When you delete a directory entry that has a missing inode on ZFS,
the dirent simply goes away. The ZFS administrator documentation gives
examples of exactly this as the response when ZFS metadata gets corrupted.
When you delete a file with a missing inode on btrfs, something
(VFS?) wants to check the inode to see if it has attributes that might
affect unlink (e.g. the immutable bit), gets an error reading the
inode, and bombs out of the unlink() before unlink() can get rid of the
dead dirent. So if you get a dirent with no inode on btrfs on a large
filesystem (too large for btrfs check to handle), you're basically stuck
with it forever. You can't even rename it. Hopefully it doesn't happen
in a top-level directory.
ZFS is also infamous for saying "sucks to be you, I'm outta here" when
things go wrong. People do want ZFS fsck and defrag, but nobody seems
to be bothered much about making those things happen.
At the end of the day I'm not sure fsck really matters. If the filesystem
is getting corrupted enough that both copies of metadata are broken,
there's something fundamentally wrong with that setup (hardware bugs,
software bugs, bad RAM, etc) and it's just going to keep slowly eating
more data until the underlying problem is fixed, and there's no guarantee
that a repair is going to restore data correctly. If we exclude broken
hardware, the only thing btrfs check is going to repair is btrfs kernel
bugs...and in that case, why would we expect btrfs check to have fewer
bugs than the filesystem itself?
For a small array, this may be the case. Once you start looking into
double digit TB scale arrays though, restoring backups becomes a very
expensive operation. If you had a multi-PB array with a single dentry
which had no inode, would you rather be spending multiple days restoring
files and possibly losing recent changes, or spend a few hours to check
the filesystem and fix it with minimal data loss?
I'm not sure the brfsck is really all that helpful to users as much
as it is for developers to better learn about the failure vectors of
the file system.
ReiserFS had no working fsck for all of the 8 years I used it (and still
didn't last year when I tried to use it on an old disk). "Not working"
here means "much less data is readable from the filesystem after running
fsck than before." It's not that much of an inconvenience if you have
working backups.