http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5497
------- Additional Comments From [EMAIL PROTECTED] 2007-06-07 01:47 -------
> The bayes algorithms take this into account, so it
> should be compensated for just fine

It should compensate for different absolute numbers of ham vs spam in
the collection, but it can't compensate for a collection process that
biases against some class of ham.

For example, consider that all Outlook Express mail that contains
embedded graphics in HTML as cid MIME objects triggers the
EXTRA_MPART_TYPE rule for 1.0 point. No ham that has that will be
autolearned, and all high-scoring spam that has that will. If there are
any tokens that are characteristic of that kind of mail, the effect will
be to amplify the EXTRA_MPART_TYPE FP from producing just 1 extra point
to producing 1 plus a high score from Bayes.

That's how I interpreted what is going on here. The summary describes
that kind of amplification of FPs on tokens found in MS Word-generated
HTML.

------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
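
To make the selection-bias argument concrete, here is a minimal Python sketch. It is a toy model, not SpamAssassin's actual Bayes or autolearn code: the thresholds (0.1 for ham, 12.0 for spam), the message counts, the token names (`cid_image`, `meeting`, `pills`), and the simple Graham-style per-token ratio are all assumptions chosen for illustration. Only the 1.0 point for EXTRA_MPART_TYPE comes from the comment above.

```python
from collections import Counter

# Assumed autolearn cutoffs for this toy model (not read from any real config).
HAM_THRESHOLD = 0.1
SPAM_THRESHOLD = 12.0
RULE_POINTS = 1.0  # the 1.0 point EXTRA_MPART_TYPE adds, per the comment

def autolearn_label(is_spam, base_score, has_cid):
    """Label a message the way a score-threshold autolearner would."""
    score = base_score + (RULE_POINTS if has_cid else 0.0)
    if score < HAM_THRESHOLD:
        return "ham"
    if score >= SPAM_THRESHOLD:
        return "spam"
    return None  # falls between the thresholds: never learned

def true_label(is_spam, base_score, has_cid):
    """Label every message with its real class (an unbiased trainer)."""
    return "spam" if is_spam else "ham"

def train(messages, labeler):
    """Count, per class, how many learned messages contain each token."""
    counts = {"ham": Counter(), "spam": Counter()}
    totals = {"ham": 0, "spam": 0}
    for is_spam, base_score, has_cid, tokens in messages:
        label = labeler(is_spam, base_score, has_cid)
        if label is None:
            continue
        counts[label].update(set(tokens))
        totals[label] += 1
    return counts, totals

def spam_probability(token, counts, totals):
    """Simplified per-token estimate: spam rate / (spam rate + ham rate)."""
    s = counts["spam"][token] / totals["spam"] if totals["spam"] else 0.0
    h = counts["ham"][token] / totals["ham"] if totals["ham"] else 0.0
    return 0.5 if s + h == 0 else s / (s + h)

# Hypothetical corpus: (is_spam, score from other rules, has cid image, tokens)
corpus = (
    [(False, 0.0, False, ["meeting"])] * 50                 # plain ham
    + [(False, 0.0, True, ["cid_image", "meeting"])] * 50   # ham with cid images
    + [(True, 15.0, False, ["pills"])] * 40                 # plain spam
    + [(True, 15.0, True, ["cid_image", "pills"])] * 10     # spam with cid images
)

biased = spam_probability("cid_image", *train(corpus, autolearn_label))
unbiased = spam_probability("cid_image", *train(corpus, true_label))
print(f"autolearned estimate: {biased:.2f}")
print(f"true-label estimate:  {unbiased:.2f}")
```

With these made-up counts, the ham that carries cid images scores 1.0 and is never learned, so the autolearned database sees `cid_image` only in spam and rates it 1.00, while training on the true labels gives about 0.29. Balancing the ham and spam totals does not help, because the missing ham is missing for a reason correlated with the token.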
