On Thu, Sep 15, 2016 at 01:02:43PM -0600, Chris Murphy wrote:
> Right, well I'm vaguely curious why ZFS, as different as it is,
> basically take the position that if the hardware went so batshit that
> they can't unwind it on a normal mount, then an fsck probably can't
> help either... they still don't have an fsck and don't appear to want
> one.

ZFS has no automated fsck, but it does have a kind of interactive debugger that can be used to manually fix things.

ZFS seems to be a lot more robust when it comes to handling bad metadata (contrast with btrfs-style BUG_ON panics). When you delete a directory entry that has a missing inode on ZFS, the dirent goes away. In the ZFS administrator documentation they give examples of this as a response in cases where ZFS metadata gets corrupted.

When you delete a file with a missing inode on btrfs, something (VFS?) wants to check the inode to see if it has attributes that might affect unlink (e.g. the immutable bit), gets an error reading the inode, and bombs out of the unlink() before unlink() can get rid of the dead dirent. So if you get a dirent with no inode on btrfs on a large filesystem (too large for btrfs check to handle), you're basically stuck with it forever. You can't even rename it. Hopefully it doesn't happen in a top-level directory.

ZFS is also infamous for saying "sucks to be you, I'm outta here" when things go wrong. People do want ZFS fsck and defrag, but nobody seems to be bothered much about making those things happen.

At the end of the day I'm not sure fsck really matters. If the filesystem is getting corrupted enough that both copies of metadata are broken, there's something fundamentally wrong with that setup (hardware bugs, software bugs, bad RAM, etc) and it's just going to keep slowly eating more data until the underlying problem is fixed, and there's no guarantee that a repair is going to restore data correctly. If we exclude broken hardware, the only thing btrfs check is going to repair is btrfs kernel bugs...and in that case, why would we expect btrfs check to have fewer bugs than the filesystem itself?

> I'm not sure if the brfsck is really all that helpful to user as much
> as it is for developers to better learn about the failure vectors of
> the file system.

ReiserFS had no working fsck for all of the 8 years I used it (and still didn't last year when I tried to use it on an old disk). "Not working" here means "much less data is readable from the filesystem after running fsck than before." It's not that much of an inconvenience if you have backups.