http://bugzilla.spamassassin.org/show_bug.cgi?id=2129





------- Additional Comments From [EMAIL PROTECTED]  2004-03-14 10:07 -------
No, I've been looking at performance issues in Bayes. I'm not talking about
sweeping useful tokens out of the database, although to the degree that
performance is affected by the number of unique tokens in the database, that
could be an issue. What I'm concerned about is the per-message performance if a
single message had 20,000 or more unique tokens in it. My email has averaged 262
tokens per message lately. 20,000 random four-character sequences add only about
100 Kbytes to the length of a message (counting one separator per sequence),
below the typical 256 Kbyte limit on what we will process in SpamAssassin, and
would increase the number of tokens to look up in the database by a factor of
nearly 80.
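
For anyone who wants to check the arithmetic, here is a rough
back-of-the-envelope sketch. It assumes each random sequence costs five bytes
(four characters plus one separator) and that every sequence becomes exactly
one database lookup; the constant names are made up for illustration and are
not SpamAssassin settings.

```python
# Back-of-the-envelope check of the figures in this comment.
# Assumptions (not taken from SpamAssassin itself): each random token is
# 4 characters plus 1 separator byte, and each token costs one lookup.
AVG_TOKENS_PER_MESSAGE = 262      # average reported in this comment
ATTACK_TOKENS = 20_000            # random four-character sequences
BYTES_PER_TOKEN = 4 + 1           # four characters plus a separator
SIZE_LIMIT_BYTES = 256 * 1024     # typical 256 Kbyte processing limit

# Extra message length contributed by the random sequences.
added_bytes = ATTACK_TOKENS * BYTES_PER_TOKEN

# Growth in per-message lookups relative to an average message.
lookup_factor = (AVG_TOKENS_PER_MESSAGE + ATTACK_TOKENS) / AVG_TOKENS_PER_MESSAGE

print(f"added bytes: {added_bytes}")
print(f"under size limit: {added_bytes < SIZE_LIMIT_BYTES}")
print(f"lookup factor: {lookup_factor:.1f}")
```

Under those assumptions the padded message stays well below the size limit
while the number of lookups grows by close to two orders of magnitude.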

This is not the same as the "Bayes poisoning" that we have seen so far. I don't
think we can ignore the possibility when considering whether to make use of
I*tokens.



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
