On Fri, Mar 7, 2008 at 12:09 AM, Charles Forsyth <[EMAIL PROTECTED]> wrote:
> > But for HA applications, we still need some additional redundancy
>  > or at least some error diagnostics at application level. Well,
>  > we'll most likely need this anyways, eg. to detect human fault
>  > or code bugs.
>
>  i hadn't realised the code i'd quoted only dealt with blocks in memory
>  (i didn't look hard enough once i'd found it), but russ then pointed out
>  that another option will do something like the check i'd intended.
>
>  given that, you have at least a check and a diagnostic that the
>  unlikely event occurred.  it isn't the case i'd worry about first.
>  after all, the applications
>  pull the stuff into memory across interfaces that might have at most a parity
>  check, after transmission using protocols that use a fairly simple 16-bit
>  check sum, a compromise between speed of calculation and effectiveness.
>  one might sometimes add an end-to-end check, or digesting ... perhaps
>  using SHA1!

 The difference between this and venti (aside from the factor of 2^60
or whatever it was) is that network/memory/disk errors are either
transient or manageable.
 Silent network error? Going to be difficult to notice, but once you
do, a retransmit will fix it (or if things are really bad, a
replacement network card).
 RAM Problems? If transient, it is fixed next reboot, otherwise
replace the module.
 Silent disk corruption? Rewrite the data or replace the disk.
 Venti hash collision? Um... well, it doesn't matter how many times we
try to rewrite the block in question, it is always going to collide.
Replacing venti seems less than satisfactory - what else provides the
same functionality? Our best option is to replace the hash and hope we
don't get a different collision. But, this leaves us with a whole
bunch of data addressed by the old hashing scheme which we presumably
have to write new code to convert[1]. New code means new bugs, and I'd
be lying if I claimed the prospect of writing such a utility to run on
several years of a venti archive didn't scare me.

[1] Unless you could do this with vac and co... my venti-fu is weak.
I'm setting my file server up soon, I promise!
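
 To make the "rewriting never helps" point concrete, here is a minimal
sketch of a content-addressed store. It is a toy, not venti: the class
and function names are made up for illustration, and the score is
truncated to one byte of SHA-1 purely so that a collision is easy to
find. The only assumption borrowed from the discussion above is that a
write is checked against whatever is already stored under that score.

```python
import hashlib


class CollisionError(Exception):
    """Two different blocks produced the same score."""


class ToyStore:
    # Hypothetical toy store; score_bytes=20 would be full SHA-1.
    def __init__(self, score_bytes=20):
        self.score_bytes = score_bytes
        self.blocks = {}

    def score(self, data):
        # The address of a block is a pure function of its contents.
        return hashlib.sha1(data).digest()[: self.score_bytes]

    def put(self, data):
        s = self.score(data)
        old = self.blocks.get(s)
        if old is not None and old != data:
            # Same score, different contents: refuse rather than
            # silently hand back the wrong block later.
            raise CollisionError(s.hex())
        self.blocks[s] = data
        return s

    def get(self, s):
        return self.blocks[s]


def find_collision(store):
    # Brute force: only practical because the demo score is truncated.
    seen = {}
    i = 0
    while True:
        data = str(i).encode()
        s = store.score(data)
        if s in seen:
            return seen[s], data
        seen[s] = data
        i += 1


if __name__ == "__main__":
    store = ToyStore(score_bytes=1)  # 8-bit score, demo only
    a, b = find_collision(store)
    store.put(a)
    for attempt in range(3):
        try:
            store.put(b)
        except CollisionError as e:
            print("attempt", attempt, "collided at score", e)
```

 Every retry fails the same way, because nothing about the second
block's score changes between attempts. That is the difference from a
flaky disk or network card.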

 But if I normalise my worries based on the likelihood of the problem
occurring, then the real thing leaving a bad taste in my mouth is that
eventually something happens to force maintenance:
1) you get a hash collision
2) something displaces venti
3) venti changes
 OTOH, eventually you're going to run out of disk space, so venti is
unlikely to be the weak link here either.

 Well, I came up with one perhaps more interesting question while
thinking about what happens with different block sizes (in particular
blocks of one byte and blocks of the same size as the hash)... As I
understand it, venti uses the hash of the data to determine where on
disk to store the block. So, what happens when the hash resolves to an
address which is off the end of the disk?
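
 One way a store could sidestep that is to never treat the hash as a
raw disk address: reduce it to a bucket number in a fixed-size index,
and have the index entry record where the block was actually appended.
A minimal sketch under that assumption follows. The names, the bucket
count and the in-memory "log" are all hypothetical, and I am not
claiming this is venti's real on-disk layout.

```python
import hashlib

NBUCKETS = 1024  # hypothetical index size, chosen for the demo


def bucket_for(score, nbuckets=NBUCKETS):
    # Reduce the score modulo the bucket count, so the result is
    # always a valid bucket number however large the hash value is.
    return int.from_bytes(score[:8], "big") % nbuckets


class ToyLogStore:
    # Hypothetical store: an append-only data log plus a bucketed index.
    def __init__(self, nbuckets=NBUCKETS):
        self.nbuckets = nbuckets
        self.index = [dict() for _ in range(nbuckets)]
        self.log = bytearray()

    def put(self, data):
        score = hashlib.sha1(data).digest()
        bucket = self.index[bucket_for(score, self.nbuckets)]
        if score not in bucket:
            # Record (offset, length) of the block within the log.
            bucket[score] = (len(self.log), len(data))
            self.log += data
        return score

    def get(self, score):
        off, n = self.index[bucket_for(score, self.nbuckets)][score]
        return bytes(self.log[off : off + n])


if __name__ == "__main__":
    store = ToyLogStore()
    one_byte = b"x"
    hash_sized = bytes(range(20))  # same length as a SHA-1 digest
    for block in (one_byte, hash_sized):
        s = store.put(block)
        print(len(block), "byte block in bucket", bucket_for(s))
```

 In this arrangement the log only grows by the size of the data
written, so block size and hash size never interact with where things
land on disk.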

-sqweek
