HvR,

> > run md5sum on the mail message body and store the resulting string in
> > a file then compare each message against this list in the file, if the
> > md5sums of the message body are the same then the message is
> > guaranteed to be the same.
> 
> Nope.

Calming down and reading RFC 1321...


> If the md5sum hashes are different, the messages are guaranteed to be
> different. If the hashes are the same, there is always a slight
> probability that the messages are *NOT* the same.
> 
> With a hash value of limited length, you cannot guarantee that longer
> data chunks stay distinct.

The MD5 algorithm is indeed designed for the purpose mentioned: to
"reliably" identify a mail by a short checksum. And it is very widely
used for this purpose.

So you are very right.
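
To make the approach concrete, here is a minimal sketch of such a
duplicate check in Python. It is not anything Evolution does; the
function name find_duplicates and the sample messages are made up for
illustration. It uses the MD5 digest only as a fast first filter and
compares the bodies byte for byte whenever two digests match, so an
(extremely unlikely) collision cannot cause a false "duplicate".

```python
import hashlib


def find_duplicates(bodies):
    """Return (earlier_index, later_index) pairs of byte-identical bodies."""
    seen = {}        # MD5 hex digest -> indices of distinct bodies with that digest
    duplicates = []
    for index, body in enumerate(bodies):
        digest = hashlib.md5(body).hexdigest()
        candidates = seen.setdefault(digest, [])
        for earlier in candidates:
            # Equal digests only suggest equality; compare bytes to be sure.
            if bodies[earlier] == body:
                duplicates.append((earlier, index))
                break
        else:
            # No stored body matched, so this one is new for this digest.
            candidates.append(index)
    return duplicates


# Hypothetical message bodies, for illustration only.
messages = [b"Hello HvR\n", b"Another mail\n", b"Hello HvR\n"]
print(find_duplicates(messages))  # prints [(0, 2)]
```

Different digests settle the question immediately; only equal digests
need the extra comparison.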


The only thing that triggered me was the word "guaranteed":

As md5sum output is limited to 128 bits, there are only 2^128 different
fingerprints. By the pigeonhole principle, feeding it 2^128 + 1
different messages must therefore produce at least one fingerprint that
is shared by two different mails.

...guenther


-- 
char *t="[EMAIL PROTECTED]";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}

_______________________________________________
evolution maillist  -  [EMAIL PROTECTED]
http://lists.ximian.com/mailman/listinfo/evolution
