Re: Cryptographic hash uniquness (was [Haskell-cafe] Simple network client)
This is a very late response ... but I did some calculations as part of some work I did a while ago: http://www.ietf.org/rfc/rfc2938.txt (See appendix A The birthday paradox) #g -- Peter Verswyvelen wrote: winds up having a write cache, which is mutable in practice. The interesting thing is that the block's location is the cryptographic hash of its contents, which leads to all sorts of neat properties (as well as requiring immutability). That's interesting. When I developed a version control system for a customer, I also used a cryptographic hash as the database key of file+content in question, but I was afraid I might have clashes (two files with different content generating the same hash)... My intuition told me that the odds of two cryptographic hashes (on meaningful content) colliding was much less than the earth being destroyed by an asteroid... But this is just intuition... What does computer science tell us about this? Thank you, Peter -- Graham Klyne Contact info: http://www.ninebynine.org/#Contact ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: Cryptographic hash uniquness (was [Haskell-cafe] Simple network client)
Graham Klyne wrote: This is a very late response ... but I did some calculations as part of some work I did a while ago: http://www.ietf.org/rfc/rfc2938.txt (See appendix A The birthday paradox) #g A memorable summary of the birthday paradox being : There is a 50% chance of a collision when number of digits in your population size reaches half the number of digits in the largest possible population. e.g. With a 128 bit hash you get a 50% chance of a collision when you've hashed 2^64 files etc. Richard. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: Cryptographic hash uniquness (was [Haskell-cafe] Simple network client)
Bulat Ziganshin wrote: Hello Peter, Thursday, January 31, 2008, 8:01:36 PM, you wrote: files with different content generating the same hash)... My intuition told me that the odds of two cryptographic hashes (on meaningful content) colliding was much less than the earth being destroyed by an asteroid... But this is just intuition... What does computer science tell us about this? you may be interested to know that widely used rsync algorithms relies on 128-bit hashes and its author speculated about its reliability: http://samba.org/~tridge/phd_thesis.pdf Interesting paper. Thank you. I had a quick read of the bit relating to hashes and my understanding is this. He uses a weak, quick and simple hash to deal with 99.99% (I invented that percentage) of cases - if the hash is different we know the files are definitely different - if the hash collides he then does a strong, slow, secure hash and relies on this as the answer. But he's relying on the strong hash rather than doing a byte by byte comparison because there is a major cost (a network transmission of the file) to doing the proper byte by byte comparison. If you had both files accessible at a low cost it might be better to byte by byte compare them when you get a collision rather than use the strong hash. The right approach may be to assume that collisions will occur and cater for this properly in the program somehow. A good tip for testing this sort of thing is to have the length of the hash (maximum size of the array or whatever you want to test) as a parameter that you can turn down to a very low value to induce collisions (overflows etc) to see whether the program still works. And then turn it back up for live use. A cryptographic hash appears as a completely random function of the input so the likelihood of a collision is simply 2^(bits in hash). Richard. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Cryptographic hash uniquness (was [Haskell-cafe] Simple network client)
winds up having a write cache, which is mutable in practice. The interesting thing is that the block's location is the cryptographic hash of its contents, which leads to all sorts of neat properties (as well as requiring immutability). That's interesting. When I developed a version control system for a customer, I also used a cryptographic hash as the database key of file+content in question, but I was afraid I might have clashes (two files with different content generating the same hash)... My intuition told me that the odds of two cryptographic hashes (on meaningful content) colliding was much less than the earth being destroyed by an asteroid... But this is just intuition... What does computer science tell us about this? Thank you, Peter ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: Cryptographic hash uniquness (was [Haskell-cafe] Simple network client)
Hello Peter, Thursday, January 31, 2008, 8:01:36 PM, you wrote: files with different content generating the same hash)... My intuition told me that the odds of two cryptographic hashes (on meaningful content) colliding was much less than the earth being destroyed by an asteroid... But this is just intuition... What does computer science tell us about this? you may be interested to know that widely used rsync algorithms relies on 128-bit hashes and its author speculated about its reliability: http://samba.org/~tridge/phd_thesis.pdf -- Best regards, Bulatmailto:[EMAIL PROTECTED] ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: Cryptographic hash uniquness (was [Haskell-cafe] Simple network client)
Depending on which hash you use you can get upwards of document hask keys won't collide before the heat-death of the universe. There is of course a lot more to it than that. Google around about hashing, cryptography, and cryptographic hash functions. There are many good websites that will go into the lot more to it without having to have a degree in mathematics to understand them. -ljr Peter Verswyvelen wrote: winds up having a write cache, which is mutable in practice. The interesting thing is that the block's location is the cryptographic hash of its contents, which leads to all sorts of neat properties (as well as requiring immutability). That's interesting. When I developed a version control system for a customer, I also used a cryptographic hash as the database key of file+content in question, but I was afraid I might have clashes (two files with different content generating the same hash)... My intuition told me that the odds of two cryptographic hashes (on meaningful content) colliding was much less than the earth being destroyed by an asteroid... But this is just intuition... What does computer science tell us about this? Thank you, Peter ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe -- Lanny Ripple [EMAIL PROTECTED] ScmDB / Cisco Systems, Inc. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: Cryptographic hash uniquness (was [Haskell-cafe] Simple network client)
On 1/31/08, Peter Verswyvelen [EMAIL PROTECTED] wrote: winds up having a write cache, which is mutable in practice. The interesting thing is that the block's location is the cryptographic hash of its contents, which leads to all sorts of neat properties (as well as requiring immutability). That's interesting. When I developed a version control system for a customer, I also used a cryptographic hash as the database key of file+content in question, but I was afraid I might have clashes (two files with different content generating the same hash)... My intuition told me that the odds of two cryptographic hashes (on meaningful content) colliding was much less than the earth being destroyed by an asteroid... But this is just intuition... What does computer science tell us about this? Tongue firmly in cheek... Well, if you would turn to software engineering you would probably find that never ever, happens sooner than you think ;-) /M ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe