Krishna Yenduri wrote: > On 10/15/09 11:17 AM, Lori Alt wrote: >> I presented the question :"Are SHA256 questions good enough to >> establish block equality?" to Jeff Bonwick. His answer: >> >>> Yes. Collision probability is 10-77, i.e. 77 nines. Nothing else >>> in a computer is even close to that reliable. > > Note that the probability of a collision also depends on the number of > blocks > in the stream. For example, one would need to do 2^128 SHA256 digests to > get a probability of a collision > 0.5. > > There is a nice table at > http://en.wikipedia.org/wiki/Birthday_paradox#Probability_table > that gives the upper bound on the number of blocks to achieve > a given probability. > > I would agree that this is a reliable way to establish block equality > given the number of blocks needed for even a probability of 10^-18. >
Perhaps it's worth pointing out that both statements above are correct, but they are answers to different questions. 10^-77 is the probability of a hash collision for a particular pair of blocks. For ZFS, we care if there is a collision between *any* pair of unequal blocks. That probability depends on the number of blocks, as Krishna points out. Finally, both of these calculations rely upon the implicit assumption that the 2^256 possible hash values are uniformly distributed; that assumption is widely accepted to be at least approximately true, but I'm not aware of a mathematical proof. In any case, I think it's safe to conclude that SHA-256 is more than adequate for filesystem block equality comparisons. Scott -- Scott Rotondo Principal Engineer, Solaris Security Technologies President, Trusted Computing Group Phone/FAX: +1 408 850 3655 (Internal x68278)