[redirected to [EMAIL PROTECTED]]

>  The difference between this and venti (aside from the factor of 2^60
> or whatever it was) is that network/memory/disk errors are either

brushing off the factor of 2^60 is like brushing off the difference
in weight between a bunch of bananas and the moon.

> transient or managable.
>  Silent network error? Going to be difficult to notice, but once you
> do a retransmit will fix it (or if things are really bad, a
> replacement network card).
>  RAM Problems? If transient, it is fixed next reboot, otherwise
> replace the module.
>  Silent disk corruption? Rewrite the data or replace the disk.
>  Venti hash collision? Um... well, it doesn't matter how many times we
> try to rewrite the block in question, it is always going to collide

what's the difference?  you are assuming that you can recreate the destroyed
data.  you're also assuming that a corrupt hash was stored with the corrupt
data.  if a proper hash is stored with corrupt data, you will never be able to
store that same block correctly without venti surgery.
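a minimal sketch of that last case, in python.  ToyStore is hypothetical and
is not venti's code; it only assumes that a write whose score is already
indexed is treated as a duplicate and dropped.

```python
import hashlib


class ToyStore:
    """hypothetical content-addressed store; an illustration, not venti."""

    def __init__(self):
        self.blocks = {}  # score (sha-1 digest) -> stored bytes

    def write(self, data: bytes) -> bytes:
        score = hashlib.sha1(data).digest()
        # assumption: a score that is already present is taken to hold
        # this data, so the new copy is never written
        if score not in self.blocks:
            self.blocks[score] = data
        return score

    def read(self, score: bytes) -> bytes:
        data = self.blocks[score]
        # verify on read: the stored bytes must hash back to the score
        if hashlib.sha1(data).digest() != score:
            raise IOError("block does not match its score")
        return data


store = ToyStore()
good = b"hello, venti"
score = store.write(good)

# simulate silent corruption: the data changes, the stored score is still right
store.blocks[score] = b"hellp, venti"

# rewriting the good block is a no-op because the score is already indexed
store.write(good)

try:
    store.read(score)
    rewrite_fixed_it = True
except IOError:
    rewrite_fixed_it = False
```

under these assumptions the rewrite changes nothing: the read still fails,
and only removing the bad entry by hand would let the good block back in.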

> Replacing venti seems less than satisfactory - what else provides the
> same functionality? Our best option is to replace the hash and hope we
> don't get a different collision. But, this leaves us with a whole
> bunch of data addressed by the old hashing scheme which we presumably
> have to write new code to convert[1]. New code means new bugs, and I'd
> be lying if I claimed the prospect of writing such a utility to run on
> several years of a venti archive didn't scare me.

this is a common fallacy.  being scared of "what if" doesn't change
probabilities.  oddly, people are not wired to be afraid of the things
they should.  how many people do you know who get white-knuckled at the
thought of getting into a car?  you're >1000x more likely to die in a car
than in an airplane.  in fact you're 10^22 times more likely to die in a
car crash than to have a venti collision.  (at least in the us.)
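for a feel of the scale, here is a rough birthday-bound calculation in
python.  it assumes sha-1 output (160 bits) behaves like a uniform random
value, and the block count is an assumption picked for illustration.  it is
not meant to reproduce the 10^22 figure above.

```python
from fractions import Fraction

HASH_BITS = 160  # sha-1 output size in bits


def collision_bound(n_blocks: int, bits: int = HASH_BITS) -> Fraction:
    """union (birthday) upper bound on any two of n_blocks sharing a hash:
    p <= n(n-1)/2 / 2^bits, assuming uniformly distributed hashes."""
    return Fraction(n_blocks * (n_blocks - 1), 2 * 2**bits)


# assumed archive size: 2^40 blocks, i.e. 8 PiB at 8 KiB per block
n = 2**40
p = collision_bound(n)
print(f"upper bound on a collision among 2^40 blocks: {float(p):.2e}")
```

with these numbers the bound comes out just under 2^-81, around 4e-25.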

by the way, i'm not sure what you mean by "replace venti".  there isn't
anything that does content-addressed storage for plan 9.  but if you
mean, is there anything that does the same job as venti+fossil, there is.
i've been using ken's file server with aoe storage.  the data is protected
by raid on the storage appliances.  this allows us to get very good fs
throughput although our working set is generally >4GB.

>  Well, I came up with one perhaps more interesting question while
> thinking about what happens with different block sizes (in particular
> blocks of one byte and blocks of the same size as the hash)... As I
> understand it, venti uses the hash of the data to determine where on
> disk to store the block. So, what happens when the hash resolves to an
> address which is off the end of the disk?

not really.  a direct addressing scheme would require 1.46e48 bytes of
storage (2^160 slots, even at one byte per block).  venti uses a hash of
the data like a normal disk would use an lba.  this hash is called a
fingerprint.  the fingerprints are indexed.  the index provides a mapping
between fingerprint and arena:offset.
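a toy sketch of the idea in python.  ToyIndex, its fields and the arena size
are all hypothetical; venti's real on-disk index and arena formats are
different.  the point is only that the fingerprint is a lookup key, so it
can never point off the end of the disk.

```python
import hashlib

# direct addressing would need one slot per possible sha-1 value
DIRECT_SLOTS = 2**160
print(f"{DIRECT_SLOTS:.3e}")  # 1.462e+48


class ToyIndex:
    """hypothetical fingerprint -> (arena, offset) index; not venti's format."""

    def __init__(self, arena_size: int):
        self.arena_size = arena_size
        self.arenas = [bytearray()]  # append-only storage areas
        self.index = {}              # fingerprint -> (arena, offset, length)

    def write(self, data: bytes) -> bytes:
        fp = hashlib.sha1(data).digest()
        if fp in self.index:
            return fp  # already stored; nothing to append
        # start a new arena when the current one cannot hold the block
        if len(self.arenas[-1]) + len(data) > self.arena_size:
            self.arenas.append(bytearray())
        arena = len(self.arenas) - 1
        offset = len(self.arenas[arena])
        self.arenas[arena] += data
        self.index[fp] = (arena, offset, len(data))
        return fp

    def read(self, fp: bytes) -> bytes:
        arena, offset, length = self.index[fp]
        return bytes(self.arenas[arena][offset:offset + length])


idx = ToyIndex(arena_size=16)
a = idx.write(b"0123456789")
b = idx.write(b"abcdefghij")  # does not fit in arena 0, lands in arena 1
```

blocks land wherever there is free space, in arrival order, and the index
remembers where each fingerprint ended up.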

- erik

