On Thu, Sep 15, 2016 at 01:02:43PM -0600, Chris Murphy wrote:
> Right, well I'm vaguely curious why ZFS, as different as it is,
> basically take the position that if the hardware went so batshit that
> they can't unwind it on a normal mount, then an fsck probably can't
> help either... they still don't have an fsck and don't appear to want
> one.

ZFS has no automated fsck, but it does have a kind of interactive
debugger that can be used to manually fix things.

ZFS seems to be a lot more robust when it comes to handling bad metadata
(contrast with btrfs-style BUG_ON panics).

When you delete a directory entry that has a missing inode on ZFS,
the dirent goes away.  In the ZFS administrator documentation they give
examples of this as a response in cases where ZFS metadata gets corrupted.

When you delete a file with a missing inode on btrfs, something
(VFS?) wants to check the inode to see if it has attributes that might
affect unlink (e.g. the immutable bit), gets an error reading the
inode, and bombs out of the unlink() before unlink() can get rid of the
dead dirent.  So if you get a dirent with no inode on btrfs on a large
filesystem (too large for btrfs check to handle), you're basically stuck
with it forever.  You can't even rename it.  Hopefully it doesn't happen
in a top-level directory.

ZFS is also infamous for saying "sucks to be you, I'm outta here" when
things go wrong.  People do want ZFS fsck and defrag, but nobody seems
to be bothered much about making those things happen.

At the end of the day I'm not sure fsck really matters.  If the filesystem
is getting corrupted enough that both copies of metadata are broken,
there's something fundamentally wrong with that setup (hardware bugs,
software bugs, bad RAM, etc) and it's just going to keep slowly eating
more data until the underlying problem is fixed, and there's no guarantee
that a repair is going to restore data correctly.  If we exclude broken
hardware, the only thing btrfs check is going to repair is btrfs kernel
bugs...and in that case, why would we expect btrfs check to have fewer
bugs than the filesystem itself?

> I'm not sure if the brfsck is really all that helpful to user as much
> as it is for developers to better learn about the failure vectors of
> the file system.

ReiserFS had no working fsck for all of the 8 years I used it (and still
didn't last year when I tried to use it on an old disk).  "Not working"
here means "much less data is readable from the filesystem after running
fsck than before."  It's not that much of an inconvenience if you have
backups.

Attachment: signature.asc
Description: Digital signature

Reply via email to