[Bug 7127] "bayes_seen" contains whole message parts

bugzilla-daemon Tue, 10 Feb 2015 14:08:04 -0800

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=7127


--- Comment #23 from RW <[email protected]> ---
(In reply to Reindl Harald from comment #19)
> for sure bayes_learn_to_journal is not the reason, it's more the key to get
> the expected final result for whatever reason 

The reason for what? I thought we were talking about database file corruption.
Did you examine the contents of the files for visible corruption?

> Learned tokens from 9802 message(s) (10057 message(s) examined)
> Learned tokens from 10057 message(s) (10057 message(s) examined)

While this might be related to the bayes_seen corruption, I think it's at least
as likely that it isn't. It's also not clear which of the two results is the
more reasonable one. Bayes considers two emails to be the same if the date
header and the top half of the body (up to 1024 bytes) hash to the same value.
I'm seeing about 0.4% of spams being skipped as duplicates, which would
correspond to about 40 out of 10,000. A difference of 255 (10057 examined, 9802
learned) seems very high, but 0 is also suspicious.
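
To make the duplicate check described above concrete, here is a minimal Python
sketch. It is not SpamAssassin's actual code (which is Perl); the function names
duplicate_key, learn_all and skip_rate are hypothetical, and the use of SHA-1,
the NUL separator and the reading of "top half of the body (up to 1024 bytes)"
as "first half of the body, capped at 1024 bytes" are all assumptions made for
illustration.

```python
import hashlib


def duplicate_key(raw: bytes) -> str:
    """Hash the Date header plus a sample of the body (illustrative only)."""
    # Split headers from body at the first blank line.
    head, _, body = raw.replace(b"\r\n", b"\n").partition(b"\n\n")

    # Find the Date header value; empty if the message has none.
    date = b""
    for line in head.split(b"\n"):
        if line.lower().startswith(b"date:"):
            date = line[len(b"date:"):].strip()
            break

    # Assumed interpretation: first half of the body, at most 1024 bytes.
    sample = body[: len(body) // 2][:1024]
    return hashlib.sha1(date + b"\0" + sample).hexdigest()


def learn_all(messages):
    """Return (learned, examined), skipping messages whose key was seen."""
    seen = set()
    learned = 0
    for raw in messages:
        key = duplicate_key(raw)
        if key in seen:
            continue  # treated as a duplicate, so not learned again
        seen.add(key)
        learned += 1
    return learned, len(messages)


def skip_rate(learned: int, examined: int) -> float:
    """Fraction of examined messages that were skipped."""
    return (examined - learned) / examined
```

With the figures quoted above, skip_rate(9802, 10057) is roughly 0.025, about
six times the 0.4% rate I mentioned, while skip_rate(10057, 10057) is 0. Under
this scheme, two messages that share a Date header and differ only in the
second half of the body would get the same key and count as duplicates.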

