Hi!

25.03.2014 12:51, Rob Vesse wrote:

Regardless of the hash function used there is always a collision
probability.  SDB uses MD5 which has a probability of approximately
2^20.96 according to
http://en.wikipedia.org/wiki/Comparison_of_cryptographic_hash_functions#Cryptanalysis
so approximately 1 in 2 million

I think this number applies only to deliberate attempts to generate a collision (the table heading on Wikipedia says "Best known attacks"). The probability of a coincidental collision, which I hope is the more relevant case here, should be much lower: for any given pair of values, probably closer to 1 in 2^128, since the MD5 digest is 128 bits long. Otherwise you would likely get a hash collision in an SDB store holding only a few million triples.
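
To put rough numbers on this, here is a back-of-the-envelope sketch using the standard birthday approximation p ≈ 1 - exp(-n(n-1) / 2^(b+1)) for n hashed values and a b-bit digest. It assumes MD5 output behaves like a uniformly random 128-bit value for non-adversarial input, which is my assumption and not something SDB guarantees. The function name and the example sizes are made up for illustration.

```python
import math


def collision_probability(n_items: int, hash_bits: int = 128) -> float:
    """Approximate chance of at least one accidental collision.

    Birthday approximation, assuming hash values are uniformly
    distributed over 2**hash_bits outputs (an idealisation of MD5).
    """
    # Number of pairs divided by the number of possible digests.
    exponent = -n_items * (n_items - 1) / 2 ** (hash_bits + 1)
    # -expm1(x) == 1 - exp(x), but stays accurate when x is tiny.
    return -math.expm1(exponent)


if __name__ == "__main__":
    # Hypothetical store sizes, from a few million up to 2**64 values.
    for n in (5_000_000, 10**9, 10**12, 2**64):
        print(f"{n:>22} values, 128-bit digest: p = {collision_probability(n):.3e}")

    # If the effective space really were only about 2**21 values,
    # a few thousand values would already be likely to collide.
    print(f"2000 values, 21-bit space: p = {collision_probability(2000, 21):.3f}")
```

Under these assumptions the chance of an accidental collision among a few million values is vanishingly small, and it only becomes substantial when the number of values approaches 2^64.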

-Osma

--
Osma Suominen
D.Sc. (Tech), Information Systems Specialist
National Library of Finland
P.O. Box 26 (Teollisuuskatu 23)
00014 HELSINGIN YLIOPISTO
Tel. +358 50 3199529
[email protected]
http://www.nationallibrary.fi
