On Sun, Sep 06, 2020 at 06:08:51PM -0400, Mason Loring Bliss wrote:

> On Sat, Sep 05, 2020 at 01:41:46PM -0400, Hendrik Boom wrote:
>
> > Nowadays the hardware replaces individual bad blocks without bothering
> > the file system.
>
> Where it can, yeah. That said, I've seen some of the corruption that we're
> supposed to never see - the bitflips in files that people use to
> demonstrate self-healing filesystems - prior to my becoming a ZFS zealot.
>
> The awfully nice thing about ZFS is that if you have a mirror or better,
> each drive stores both data and a checksum of that data, so you have an
> awfully good chance of finding one bit of recorded data that matches one
> checksum, and if you have that, ZFS can rewrite all the bad data. Even with
> a single disk, you can specify multiple copies to achieve the same thing,
> although catastrophic failure of the disk is always a possibility, making a
> proper mirror *and* back-ups preferable.
>
> As a random note, the upstream ZFS custom package instructions work out of
> the box on Devuan, and they still ship sysvinit files when built that way.
>
> https://openzfs.github.io/openzfs-docs/Developer%20Resources/Custom%20Packages.html
>
> At some point there will be other filesystems that do the same. BtrFS isn't
> far behind, and hopefully some increased attention will get it the rest of
> the way. Red Hat's Stratis and DragonflyBSD's Hammer2 will both have
> self-healing working before long, using different approaches.
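The self-healing idea Mason describes can be sketched in a few lines of Python: each side of a mirror stores the data plus a checksum of that data, and on read, any copy whose checksum verifies is trusted and used to rewrite the bad copies. This is a toy illustration of the principle only, not actual ZFS code; the `Mirror` class and its methods are made up for the example.

```python
# Toy sketch (NOT ZFS code): each mirror side stores (data, checksum).
# A read returns any copy that passes its checksum and rewrites the rest.
import hashlib

def checksum(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

class Mirror:
    def __init__(self, sides: int = 2):
        # each side maps block number -> (data, checksum)
        self.sides = [dict() for _ in range(sides)]

    def write(self, block: int, data: bytes) -> None:
        for side in self.sides:
            side[block] = (data, checksum(data))

    def read(self, block: int) -> bytes:
        # find any copy whose stored checksum matches its data
        for side in self.sides:
            data, stored = side[block]
            if checksum(data) == stored:
                # self-heal: rewrite every side from the good copy
                self.write(block, data)
                return data
        raise IOError("no copy of block %d passes its checksum" % block)

m = Mirror()
m.write(0, b"important data")
# simulate a silent bit flip on side 0 (stored checksum no longer matches)
data, stored = m.sides[0][0]
m.sides[0][0] = (b"imp0rtant data", stored)
assert m.read(0) == b"important data"         # good copy found on side 1
assert m.sides[0][0][0] == b"important data"  # corrupted side was healed
```

The same logic covers the single-disk case with multiple copies: more than one (data, checksum) pair on the one device gives the same repair opportunity, just without protection against whole-disk failure.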
There are also the md RAID mechanisms, which do the duplication but not the
checksumming. So if the hardware can't read a block, there's another copy on
the mirror. But the lack of checksumming means md has to rely on the hardware
for error detection. It's an alternative if for some reason btrfs and zfs
aren't suitable. (In my case, they weren't available in the dark ages when I
first created my file systems.)

And I believe btrfs and zfs rely on the hardware writing blocks correctly.
I've heard of hard disks that fail to do that properly. Those drives treat
data as having been permanently recorded once it is in the cache, instead of
waiting until the data are actually on the disk surface itself. This causes
trouble on unexpected power-down. I don't know if any such hard drives are
still manufactured. I hope not.

And are those file systems good enough for media whose blocks degrade
slightly each time they are written? The journals get a lot of write
activity.

The copy-on-write mechanisms are the reason those file systems have their
legendary stability. But they also make them vulnerable to errors in RAM.
When data are read, partly modified, and then written back (via the
journal), they sojourn in RAM, where there is the potential for corruption.

-- 
hendrik

> 
> -- 
> Mason Loring Bliss (( If I have not seen as far as others, it is because
> [email protected] )) giants were standing on my shoulders. - Hal Abelson

_______________________________________________
Dng mailing list
[email protected]
https://mailinglists.dyne.org/cgi-bin/mailman/listinfo/dng
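The md limitation mentioned above can be made concrete with a small sketch: a plain mirror with no per-block checksums can notice during a scrub that the two sides disagree, but it has no way to decide which side holds the good data, so it must defer to the drive's own error reporting. Again a toy illustration, not md's actual implementation; the class and method names are invented for the example.

```python
# Toy sketch (NOT md code): a two-way mirror with no checksums.
# A scrub can detect a mismatch between sides, but cannot say
# which side is correct.

class PlainMirror:
    def __init__(self):
        self.sides = [dict(), dict()]

    def write(self, block: int, data: bytes) -> None:
        for side in self.sides:
            side[block] = data

    def read(self, block: int) -> bytes:
        # with no checksum, just trust whichever side answers first
        return self.sides[0][block]

    def scrub(self, block: int) -> bool:
        # a check pass can notice that the sides disagree...
        a, b = self.sides[0][block], self.sides[1][block]
        # ...but there is no independent record of which copy is right
        return a == b

m = PlainMirror()
m.write(0, b"data")
m.sides[0][0] = b"dat4"          # silent corruption on side 0
assert m.scrub(0) is False       # mismatch is detectable
assert m.read(0) == b"dat4"      # yet the bad copy may still be served
```

This is exactly the gap the checksumming filesystems close: with a checksum stored alongside each copy, the mismatch is not just detectable but resolvable.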
