On 7/7/14, 2:41 PM, Neal H. Walfield wrote:
> Hi,
>
> It is possible for a bit-flip to change data that is to be written to
> disk after its hash has been computed, but before it has been sent to
> disk. This is primarily a concern for systems without ECC RAM.
>
> It is possible to correct (some of) these errors by including some
> forward error correction bits in the hash (or, perhaps next to it, but
> we should include some FEC bits for the hash itself since it too could
> be corrupted). It wouldn't have to be more than a few bits, since we
> expect at most a bit flip or two for any given block of data.
>
> Would there be interest in such an extension?
>
> It is conceivable that a lot of redundancy could be useful: such a
> scheme could correct bad blocks on disk. This would primarily be
> useful on systems with just a single drive (e.g., a laptop) or when
> resilvering a mirror vdev and the remaining disk has a bad block (this,
> unfortunately, has happened to me).
>
> Thoughts?
>
> (If this is the wrong place for such questions, please tell me where to
> post instead.)
>
> Thanks!
>
> :) Neal

While it'd be relatively straightforward to modify ZFS to support a full
FEC in its checksum field, I don't see much reason for it, primarily
because single-bit or few-bit errors on disk are exceedingly rare (in
fact, neither I nor anybody I've talked to has seen any). Corruption
from storage media seems to come in two flavors:

1) Read errors when the drive's ECC detected the error but couldn't
   correct it. This is equivalent to corruption in all the bits of the
   block, which nothing besides a fully redundant copy can fix (i.e.
   copies=2 or a mirror).

2) Massive corruption when the bit errors exceed the drive's ECC's
   _detection_ threshold (Hamming distance greater than 2x the error
   correction threshold) and so most likely would overwhelm ours as
   well (256 bits of ECC for 1 megabit, or about 0.02% of the data,
   isn't exactly much; your typical FEC transmission scheme usually
   reserves around 10-20% of raw channel bandwidth for FEC), or
   misdirected reads/writes, which with overwhelming probability also
   result in a block that's massively different.

So while I can see merit in the idea, I don't see much practical need
for it. But perhaps I'm not seeing something, so please do correct me if
I'm wrong.

Cheers,
--
Saso
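
For reference, the overhead figures above can be checked with a few lines
of Python. This is only a back-of-the-envelope sketch: it assumes the
"1 megabit" block is a 128 KiB block and that the full 256-bit checksum
field is used for redundancy, and the helper names are made up for
illustration. It also prints the textbook number of parity bits a
single-error-correcting Hamming code would need for that block size,
which says nothing about how a real ZFS change would be laid out.

```python
# Assumptions (not from ZFS source): 128 KiB block, 256-bit checksum field.
BLOCK_BITS = 128 * 1024 * 8      # 1,048,576 data bits ("1 megabit")
CHECKSUM_BITS = 256


def overhead_percent(redundancy_bits, data_bits):
    """Redundancy expressed as a percentage of the data size."""
    return 100.0 * redundancy_bits / data_bits


def hamming_parity_bits(data_bits):
    """Smallest r with 2**r >= data_bits + r + 1 (single-error correction)."""
    r = 0
    while 2 ** r < data_bits + r + 1:
        r += 1
    return r


if __name__ == "__main__":
    print("checksum field overhead: %.4f%%"
          % overhead_percent(CHECKSUM_BITS, BLOCK_BITS))
    print("parity bits to correct one flipped bit: %d"
          % hamming_parity_bits(BLOCK_BITS))
```

The first number comes out to roughly 0.02%, far below the 10-20% of a
typical transmission scheme, while the second shows that correcting a
single flipped bit per block would fit in the field with room to spare.
Anything beyond a handful of flipped bits is a different story.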
