On Fri, 18 Feb 2005 08:36:51 EST, Gregory Maxwell said:

> Tree hashes.
> Divide the file into blocks of N bytes. Compute size/N hashes.
> Group hashes into pairs. Compute N/2 N' hashes, this is fast because
> hashes are small. Group N' hashes into pairs, compute N'/2 N'' hashes,
> etc. Reduce to a single hash.
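(For reference, a minimal Python sketch of the scheme Gregory describes:
hash each N-byte block, then repeatedly pair adjacent digests and hash
each pair until a single root digest remains. SHA-1 and the 4 KB block
size are illustrative choices of mine, not anything specified in the
thread.)

    import hashlib

    BLOCK_SIZE = 4096  # "N" in Gregory's description; illustrative value

    def tree_hash(path, block_size=BLOCK_SIZE):
        # Leaf level: one digest per N-byte block of the file.
        level = []
        with open(path, "rb") as f:
            while True:
                block = f.read(block_size)
                if not block:
                    break
                level.append(hashlib.sha1(block).digest())
        if not level:
            level = [hashlib.sha1(b"").digest()]
        # Interior levels: pair up digests and hash each concatenated
        # pair; an odd digest out is carried up unchanged. This part
        # is cheap, as Gregory notes, because digests are small.
        while len(level) > 1:
            nxt = [hashlib.sha1(level[i] + level[i + 1]).digest()
                   for i in range(0, len(level) - 1, 2)]
            if len(level) % 2:
                nxt.append(level[-1])
            level = nxt
        return level[0]

Note that only the leaf level ever touches the disk, which is exactly
why the I/O cost dominates: verifying the root means re-reading the
whole file.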
You get massively I/O bound real fast this way. You may want to
re-evaluate whether this *really* buys you anything, especially if you
don't have some sort of guarantee that you know what's actually
b0rked...

> In my initial suggestion I offered that hashes could be verified by a
> userspace daemon, or by fsck (since it's an expensive operation)...
> Such policy could be controlled in the daemon.
> In most cases I'd like it to make the file inaccessible until I go and
> fix it by hand.

You're still missing the point that in general, you don't have a way
to tell whether the block the file lived in went bad, or the block the
hash lived in went bad. Sure, if the file *happens* to be ASCII text,
you can use Wetware 1.5 to scan the file and tell which one went bad.
However, you'll need Wetware 2.0 to do the same for your multi-gigabyte
Oracle database... :)

(And yes, I *have* seen cases where Tripwire went completely and
totally bananas and claimed zillions of files were corrupted, when the
*real* problem was that the Tripwire database itself had gotten stomped
on - so it's *not* a purely theoretical issue....)
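To make that symmetry concrete, here's a small Python sketch (mine, not
anything from the thread): flipping a bit in the data block and flipping
a bit in the stored hash produce the identical observable failure, so
the checker alone can't point the finger at either side.

    import hashlib
    import os

    def verify(block, stored_hash):
        # A mismatch only says the (block, stored_hash) pair is
        # inconsistent; it doesn't say which half went bad.
        return hashlib.sha1(block).digest() == stored_hash

    block = os.urandom(4096)
    good_hash = hashlib.sha1(block).digest()

    bad_block = bytes([block[0] ^ 1]) + block[1:]         # data rot
    bad_hash = bytes([good_hash[0] ^ 1]) + good_hash[1:]  # hash rot

    # Both failures look exactly the same to the checker.
    assert verify(bad_block, good_hash) is False
    assert verify(block, bad_hash) is False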
