https://issues.apache.org/SpamAssassin/show_bug.cgi?id=7127
--- Comment #23 from RW <[email protected]> ---
(In reply to Reindl Harald from comment #19)
> for sure bayes_learn_to_journal is not the reason, it's more the key to get
> the expected final result for whatever reason

The reason for what? I thought we were talking about database file corruption.
Did you examine the contents of the files for visible corruption?

> Learned tokens from 9802 message(s) (10057 message(s) examined)
> Learned tokens from 10057 message(s) (10057 message(s) examined)

While this might be related to the bayes_seen corruption, I think it's at least
as likely that it isn't. It's also not clear which is the more reasonable
result.

Bayes considers two emails to be the same if the date header and the top half
of the body (up to 1024 bytes) hash to the same value. I'm seeing about 0.4% of
spams being skipped as duplicates, which would correspond to 40 out of 10,000.
255 seems very high, but 0 is also suspicious.
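
To make the duplicate rule above concrete, here is a rough Python sketch of that kind of check. It is not SpamAssassin's actual code (which is Perl). The names duplicate_key and count_learned are made up for illustration, and the use of SHA-1, the NUL separator, and the reading of "top half, up to 1024 bytes" as min(len/2, 1024) are all assumptions.

```python
import hashlib
from email import message_from_string


def duplicate_key(raw_message: str, max_bytes: int = 1024) -> str:
    """Hash the Date header plus the top of the body (hypothetical helper)."""
    msg = message_from_string(raw_message)
    date = msg.get("Date", "") or ""
    payload = msg.get_payload()
    # Only handle simple single-part messages in this sketch.
    body = payload if isinstance(payload, str) else ""
    body_bytes = body.encode("utf-8", errors="replace")
    # Assumption: "top half (up to 1024 bytes)" means min(len/2, 1024) bytes.
    top = body_bytes[: min(len(body_bytes) // 2, max_bytes)]
    digest = hashlib.sha1()
    digest.update(str(date).encode("utf-8", errors="replace"))
    digest.update(b"\0")  # separator between header and body (assumption)
    digest.update(top)
    return digest.hexdigest()


def count_learned(messages):
    """Return (learned, examined), skipping messages whose key was already seen."""
    seen = set()
    learned = 0
    for raw in messages:
        key = duplicate_key(raw)
        if key in seen:
            continue  # treated as a duplicate, so not learned again
        seen.add(key)
        learned += 1
    return learned, len(messages)
```

Under these assumptions, two spams with the same Date header and the same opening text collapse to one key even if their tails differ, which is how a corpus can report fewer messages learned than examined. For scale, the 255 skipped out of 10057 in the quoted output is about 2.5%, against the roughly 0.4% I see here.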
