On Thu, 17 Feb 2005 21:43:08 CST, David Masover said:

> This way is easier, though. But I was thinking about accessing the
> file. I don't know of any hashes that can be easily updated from part
> of the file, unless you're hashing only pieces of the file in the first
> place, but it'd be nice to not bother hashing at all until the hash is
> needed, especially if we are hashing the whole file.
There are plenty of checksum and CRC functions that are quite easily set up for an incremental update (see RFCs 1141 and 1624 on how to do it for the 16-bit ones' complement checksum used for Internet IP packets; strictly speaking that one is a checksum rather than a CRC). You'd of course not want to use that 16-bit checksum, but the same basic principle applies to CRC functions. The problem is that most CRC functions aren't very good at detecting multi-bit errors, and when you're talking about hundreds of gigabytes of disk on a modern RAID, the CRC functions are hardly bulletproof.

On the flip side, hash functions like MD5 or the SHA family are fairly bulletproof, but are essentially impossible to develop an incremental update for (if there existed a fast incremental update for the hash function, that would imply a very low preimage resistance, rendering it useless as a cryptographic hash).

Also, there's another issue - unlike standard ECC codes that can actually *fix* the problem (for at least a small number of bit errors), it's unclear what you should do if you find a mismatch between the hash of a block and the block contents, as you don't know whether it's the actual data or the hash that's corrupted....
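
To make the incremental-update point above concrete, here is a rough sketch of the RFC 1624 approach (its equation 3, HC' = ~(~HC + ~m + m')) for the 16-bit ones' complement checksum. The function names and the sample header words are just illustrative, not from any particular implementation, and this is the simple IP checksum, not something you'd want protecting a filesystem.

```python
def ones_complement_sum(words):
    """Add 16-bit words with end-around carry (ones' complement addition)."""
    total = 0
    for w in words:
        total += w
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return total


def internet_checksum(words):
    """Full recompute: complement of the ones' complement sum of all words."""
    return ~ones_complement_sum(words) & 0xFFFF


def incremental_update(old_checksum, old_word, new_word):
    """RFC 1624 eqn. 3: HC' = ~(~HC + ~m + m'), touching only the changed word."""
    total = ones_complement_sum(
        [~old_checksum & 0xFFFF, ~old_word & 0xFFFF, new_word]
    )
    return ~total & 0xFFFF


# Illustrative data: nine 16-bit words standing in for a header
# (checksum field left out of the sum).
words = [0x4500, 0x0073, 0x0000, 0x4000, 0x4011,
         0xC0A8, 0x0001, 0xC0A8, 0x00C7]
old_checksum = internet_checksum(words)

# Change one word (e.g. a TTL decrement) and update without rescanning.
new_words = list(words)
new_words[4] = 0x3F11
updated = incremental_update(old_checksum, words[4], new_words[4])
recomputed = internet_checksum(new_words)
```

For this example `updated` and `recomputed` come out the same, and the update cost doesn't depend on how many words there are. That only works because the checksum is a plain sum; nothing comparable is available for MD5 or SHA.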
