http://bugzilla.spamassassin.org/show_bug.cgi?id=3055





------- Additional Comments From [EMAIL PROTECTED]  2004-02-18 13:17 -------
Subject: Re:  Bayes: use hash instead of Message-Id?

On Tue, Feb 17, 2004 at 08:08:57PM -0800, [EMAIL PROTECTED] wrote:
> 1. overhead of computing the hash (not a big deal, I think)

I'm not worried about it.

> 2. stability of the hash to minor changes (like whitespace in headers,
>    whitespace at end of body, header sorting, Received headers, etc.)
>    that could cause a mismatch in generated ID from one hashing to the
>    next.

Well, the current hash we use is semi-resistant to changes:

    # Use sha1(Date:, last received: and top N bytes of body)
    # where N is MIN(1024 bytes, 1/2 of body length)

The Date: header shouldn't change between systems, the last received
header (the first one added to the message) shouldn't change, and the
top N bytes of the pristine body, theoretically, shouldn't change.
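
For illustration, the scheme in that comment could be sketched roughly as below. This is Python rather than the actual SpamAssassin Perl, and the function name, the way the three parts are joined, and the hex output format are all assumptions of the sketch, not what the code really does.

```python
import hashlib
import re
from email import message_from_bytes


def fallback_id(raw: bytes) -> str:
    """Hypothetical helper: sha1(Date:, last Received:, top N bytes of body)."""
    msg = message_from_bytes(raw)

    date = str(msg.get("Date", ""))
    # Received headers are prepended in transit, so the bottom-most one
    # is the first one that was added to the message.
    received = msg.get_all("Received") or []
    last_received = str(received[-1]) if received else ""

    # The pristine body is everything after the first blank line.
    sep = re.search(rb"\r?\n\r?\n", raw)
    body = raw[sep.end():] if sep else b""

    # N = MIN(1024 bytes, 1/2 of body length)
    n = min(1024, len(body) // 2)

    h = hashlib.sha1()
    # Joining the parts with NUL bytes is an assumption of this sketch.
    h.update(date.encode("utf-8", errors="replace"))
    h.update(b"\0")
    h.update(last_received.encode("utf-8", errors="replace"))
    h.update(b"\0")
    h.update(body[:n])
    return h.hexdigest()
```

A Received header added by a later hop goes on top of the message, so it does not affect the ID; a change to the Date: header or to the first N bytes of the body does.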

This is the hash we do now if there is no Message-Id header.  Do we think
this is fine?  If so, I'll make the changes necessary to make it default.

> 3. backward compatibility with existing Bayes databases.

Doable.  We just need to make the seen checks look for either the msgid or the hash.
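
Something along these lines, as a rough Python sketch. The seen database is modelled as a plain mapping from ID to a flag, and the function name and the flag values are made up for the example.

```python
from typing import Mapping, Optional


def seen_flag(seen: Mapping[str, str],
              msgid: Optional[str],
              hash_id: str) -> Optional[str]:
    """Hypothetical check: return the stored flag for a message, or None.

    Tries the new hash-based ID first, then falls back to the Message-Id
    so that entries written by older versions are still found.
    """
    for key in (hash_id, msgid):
        if key and key in seen:
            return seen[key]
    return None
```

Existing databases keep working because the Message-Id lookup is still tried; new entries would be stored under the hash only.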





------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
