Matija Grabnar writes:
> That is going to lead to trouble. Some years ago I had occasion to
> calculate checksums of a very large number of files (looking to
> remove duplicates). I discovered, to my dismay, that
> a) I was getting collisions (same checksum) on files which were
> obviously different (because they were different sizes).
With checksums, collisions are to be expected. The purpose of a
checksum is to ensure that a file hasn't been damaged after being
transmitted through a network or copied onto a medium, for example.
It's not meant to identify duplicates among a large number of files.
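To illustrate the point, here is a minimal sketch of what a checksum is actually for, using Python's zlib.crc32 (the payloads are invented for the example; any checksum algorithm would serve the same role):

```python
import zlib

def crc32_of(data: bytes) -> int:
    # CRC32 detects accidental corruption in transit or on disk.
    # Its 32-bit output space is far too small for deduplicating
    # many files: collisions between unrelated files are expected.
    return zlib.crc32(data) & 0xFFFFFFFF

original = b"some payload transmitted over the network"
received = b"some payload transmitted over the network"
corrupted = b"some payload transmitted over the netwprk"

assert crc32_of(received) == crc32_of(original)   # transfer intact
assert crc32_of(corrupted) != crc32_of(original)  # damage detected
```

A matching checksum tells you the copy probably wasn't damaged; it was never designed to prove two arbitrary files are identical.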
But SHA1 is not a checksum, it's a cryptographic hash with a 160-bit
output. I'm not an expert in the field, but from reading what is
available on the web, I gather that the probability of two specific
files "accidentally" sharing the same SHA1 hash is about 1/2^160, and
that you would need on the order of 2^80 files before an accidental
collision among them becomes likely (the birthday bound).
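A sketch of duplicate detection along these lines, using hashlib.sha1: group files by size first (files of different sizes can never be duplicates, which sidesteps the collision problem Matija hit), then hash only the candidates that share a size. The file names and contents here are illustrative, not from the original post:

```python
import hashlib
from collections import defaultdict

def sha1_of(data: bytes) -> str:
    return hashlib.sha1(data).hexdigest()

def find_duplicates(files: dict[str, bytes]) -> list[list[str]]:
    # First pass: bucket by size, a cheap and exact discriminator.
    by_size = defaultdict(list)
    for name, data in files.items():
        by_size[len(data)].append(name)
    # Second pass: SHA1 only the files that share a size.
    dups = []
    for names in by_size.values():
        if len(names) < 2:
            continue
        by_hash = defaultdict(list)
        for name in names:
            by_hash[sha1_of(files[name])].append(name)
        dups.extend(group for group in by_hash.values() if len(group) > 1)
    return dups

files = {
    "a.txt": b"hello world",
    "b.txt": b"hello world",   # duplicate of a.txt
    "c.txt": b"hello there",   # same size, different content
    "d.txt": b"short",
}
assert find_duplicates(files) == [["a.txt", "b.txt"]]
```

The size check is free and exact, so SHA1 only has to distinguish among same-size candidates, where an accidental collision is astronomically unlikely.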
--
Daniel
PostgreSQL-powered mail user agent and storage:
http://www.manitou-mail.org
_______________________________________________
DBmail mailing list
[email protected]
https://mailman.fastxs.nl/mailman/listinfo/dbmail