Re: Cryptographic hash uniquness (was [Haskell-cafe] Simple network client)

2008-05-11 Thread Graham Klyne
This is a very late response ... but I did some calculations as part of some 
work I did a while ago:

  http://www.ietf.org/rfc/rfc2938.txt
(See appendix A The birthday paradox)

#g
--

Peter Verswyvelen wrote:

winds up having a write cache, which is mutable in practice.  The
interesting thing is that the block's location is the cryptographic
hash of its contents, which leads to all sorts of neat properties (as
well as requiring immutability).


That's interesting.  When I developed a version control system for a customer, 
I also used a cryptographic hash as the database key of file+content in 
question, but I was afraid I might have clashes (two files with different 
content generating the same hash)... My intuition told me that the odds of two 
cryptographic hashes (on meaningful content) colliding was much less than the 
earth being destroyed by an asteroid... But this is just intuition... What does 
computer science tell us about this?

Thank you,
Peter


--
Graham Klyne
Contact info: http://www.ninebynine.org/#Contact

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: Cryptographic hash uniquness (was [Haskell-cafe] Simple network client)

2008-05-11 Thread Richard Kelsall

Graham Klyne wrote:
This is a very late response ... but I did some calculations as part of 
some work I did a while ago:

  http://www.ietf.org/rfc/rfc2938.txt
(See appendix A The birthday paradox)

#g


A memorable summary of the birthday paradox being :

There is a 50% chance of a collision when number of digits in your
population size reaches half the number of digits in the largest
possible population.

e.g. With a 128 bit hash you get a 50% chance of a collision when
you've hashed 2^64 files etc.


Richard.

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: Cryptographic hash uniquness (was [Haskell-cafe] Simple network client)

2008-02-01 Thread Richard Kelsall

Bulat Ziganshin wrote:

Hello Peter,

Thursday, January 31, 2008, 8:01:36 PM, you wrote:


files with different content generating the same hash)... My
intuition told me that the odds of two cryptographic hashes (on
meaningful content) colliding was much less than the earth being
destroyed by an asteroid... But this is just intuition... What does
computer science tell us about this?


you may be interested to know that widely used rsync algorithms relies
on 128-bit hashes and its author speculated about its reliability:

http://samba.org/~tridge/phd_thesis.pdf



Interesting paper. Thank you. I had a quick read of the bit relating to
hashes and my understanding is this. He uses a weak, quick and simple
hash to deal with 99.99% (I invented that percentage) of cases - if the
hash is different we know the files are definitely different - if the
hash collides he then does a strong, slow, secure hash and relies on
this as the answer. But he's relying on the strong hash rather than
doing a byte by byte comparison because there is a major cost (a network
transmission of the file) to doing the proper byte by byte comparison.
If you had both files accessible at a low cost it might be better to
byte by byte compare them when you get a collision rather than use
the strong hash.

The right approach may be to assume that collisions will occur and
cater for this properly in the program somehow. A good tip for testing
this sort of thing is to have the length of the hash (maximum size of
the array or whatever you want to test) as a parameter that you can
turn down to a very low value to induce collisions (overflows etc) to
see whether the program still works. And then turn it back up for live
use.

A cryptographic hash appears as a completely random function of the
input so the likelihood of a collision is simply 2^(bits in hash).


Richard.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Cryptographic hash uniquness (was [Haskell-cafe] Simple network client)

2008-01-31 Thread Peter Verswyvelen
winds up having a write cache, which is mutable in practice.  The
interesting thing is that the block's location is the cryptographic
hash of its contents, which leads to all sorts of neat properties (as
well as requiring immutability).

That's interesting.  When I developed a version control system for a customer, 
I also used a cryptographic hash as the database key of file+content in 
question, but I was afraid I might have clashes (two files with different 
content generating the same hash)... My intuition told me that the odds of two 
cryptographic hashes (on meaningful content) colliding was much less than the 
earth being destroyed by an asteroid... But this is just intuition... What does 
computer science tell us about this?

Thank you,
Peter







___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: Cryptographic hash uniquness (was [Haskell-cafe] Simple network client)

2008-01-31 Thread Bulat Ziganshin
Hello Peter,

Thursday, January 31, 2008, 8:01:36 PM, you wrote:

 files with different content generating the same hash)... My
 intuition told me that the odds of two cryptographic hashes (on
 meaningful content) colliding was much less than the earth being
 destroyed by an asteroid... But this is just intuition... What does
 computer science tell us about this?

you may be interested to know that widely used rsync algorithms relies
on 128-bit hashes and its author speculated about its reliability:

http://samba.org/~tridge/phd_thesis.pdf


-- 
Best regards,
 Bulatmailto:[EMAIL PROTECTED]

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: Cryptographic hash uniquness (was [Haskell-cafe] Simple network client)

2008-01-31 Thread Lanny Ripple
Depending on which hash you use you can get upwards of document 
hask keys won't collide before the heat-death of the universe.


There is of course a lot more to it than that.  Google around 
about hashing, cryptography, and cryptographic hash functions. 
There are many good websites that will go into the lot more to 
it without having to have a degree in mathematics to understand 
them.


  -ljr

Peter Verswyvelen wrote:

winds up having a write cache, which is mutable in practice.  The
interesting thing is that the block's location is the cryptographic
hash of its contents, which leads to all sorts of neat properties (as
well as requiring immutability).


That's interesting.  When I developed a version control system for a customer, 
I also used a cryptographic hash as the database key of file+content in 
question, but I was afraid I might have clashes (two files with different 
content generating the same hash)... My intuition told me that the odds of two 
cryptographic hashes (on meaningful content) colliding was much less than the 
earth being destroyed by an asteroid... But this is just intuition... What does 
computer science tell us about this?

Thank you,
Peter







___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


--
Lanny Ripple [EMAIL PROTECTED]
ScmDB / Cisco Systems, Inc.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: Cryptographic hash uniquness (was [Haskell-cafe] Simple network client)

2008-01-31 Thread Magnus Therning
On 1/31/08, Peter Verswyvelen [EMAIL PROTECTED] wrote:

 winds up having a write cache, which is mutable in practice.  The
 interesting thing is that the block's location is the cryptographic
 hash of its contents, which leads to all sorts of neat properties (as
 well as requiring immutability).

 That's interesting.  When I developed a version control system for a
 customer, I also used a cryptographic hash as the database key of
 file+content in question, but I was afraid I might have clashes (two files
 with different content generating the same hash)... My intuition told me
 that the odds of two cryptographic hashes (on meaningful content) colliding
 was much less than the earth being destroyed by an asteroid... But this is
 just intuition... What does computer science tell us about this?


Tongue firmly in cheek...
Well, if you would turn to software engineering you would probably find that
never ever, happens sooner than you think ;-)

/M
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe