https://issues.apache.org/SpamAssassin/show_bug.cgi?id=4900
Adam Katz <[email protected]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |[email protected]

--- Comment #12 from Adam Katz <[email protected]> 2010-04-15 18:34:31 EDT ---
(In reply to comment #4 / Bug 6032, Ian Turner 2008-12-18)
> For example, consider a message sender [email protected], whose messages
> are flagged as spam with score 30. Assume the system is configured with a
> spam threshold of 10. Finally, assume an administrator runs spamassassin
> [email protected] and that several messages are
> then received from this source. We will see the following behaviour:
>
> Message  pre-AWL  post-AWL  count  totscore  Message accepted?
>          score    score
>                             1      -100
> 1        30       -35       2      -70       TRUE
> 2        30       -20       3      -40       TRUE
> 3        30       -5        4      -10       TRUE
> 4        30       10        5      20        FALSE
> 5        30       25        6      50        FALSE
>
> As it turns out, the --add-addr-to-whitelist command was only good for
> three messages.

I ran this test on vanilla 3.3.0 and 3.3.1 installs to verify; my numbers
differed a bit (for the worse!). I tried it at three different
auto_whitelist_factor values: the default of 0.5 (implicit and explicit, both
gave the same results), 0.75, and 1.0.

Tests were performed using a vanilla email with a custom rule assigning 30
points to that email's Message-ID string. I trained SA via
`spamassassin -W <test.eml`, then successively scanned the email with
`spamassassin -D auto-whitelist <test.eml 2>&1 |grep score:`

The only value that changes with the auto_whitelist_factor is the post-AWL
score. This is also the only difference between my results and Ian's in
comment 4, so I'm only presenting the post-AWL scores at the three factors I
tested.
My results:

         ------- AWL factor -------
Message  0.5      0.75     1.0
1        -35      -67.5    -100
2        -2.5     -18.75   -35
3        8.333    -2.5     -13.333
4        13.75    5.625    -2.5
5        17       10.5     4
6        19.167   13.75    8.333
7        20.714   16.071   11.429
8        21.875   17.812   13.75
9        22.778   19.167   15.556

At a factor of 1.0, AWL brings the score to the previous average as specified
in the documentation, which is handy for checking the math.

Like Ian's results, my test at factor=0.5 results in the sender getting
flagged as spam on the third email following a whitelist training. Even at
factor=1.0, there are only five emails in the clear.

Here's another view of the issue, fixed at AWL factor 0.5 but with varying
initial scores and learning as ham or spam:

         ---------------- initial score -----------------
Message  h...@30  h...@20  h...@10  s...@0   s...@-5  s...@-10
1        -35      -40      -45      50       47.5     45
2        -2.5     -10      -17.5    25       21.3     17.5
3        **8.3**  0        -8.3     16.7     12.5     8.3
4        13.8     **5**    -3.8     12.5     8.1      **3.8**
5        17       8        -1       10       5.5      1
6        19.2     10       0.8      8.3      **3.8**  -0.8
7        20.7     11.4     2.1      7.1      2.5      -2.1
8        21.9     12.5     3.1      6.3      1.6      -3.1
9        22.8     13.3     3.9      5.6      0.8      -3.9
10       23.5     14       4.5      5        0.3      -4.5
11       24.1     14.5     **5**    **4.5**  -0.2     -5

The turnover counts (marked with ** above) are the notable thresholds here.
A ham scoring 30 bounces back to getting marked as spam on the third message.
Ham at 20 takes just one more. Ham at 10 turns over on the 11th message. I
didn't put a 5 point ham on the chart, but it's fine for quite a while (it
hits 4.0 on the 53rd message and 4.5 on the 105th).

On the spam side, a spam that somehow gets to -10 evades detection on its
fourth message. A spam at -5 returns to the inbox on the sixth. A
zero-scoring spam is snuffed for ten iterations, returning on the 11th. Not
on the chart, a spam scoring 2 comes back on the 17th message and a spam
scoring 4.5 dips under 6 on its 32nd, under 5.5 after 48, and gets out of
jail on its 94th.
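For anyone who wants to check the tables without a SpamAssassin install, here is a rough Python sketch of the arithmetic they appear to follow. It is not SpamAssassin code. The function name `awl_scores` and its parameters are made up for illustration, and it assumes that training seeds the AWL entry with one message worth -100 (whitelist) or +100 (blacklist), that each scan adds the pre-AWL score to the running total, and that the adjustment is (mean - score) * factor. Those assumptions are inferred from the numbers in this bug, not taken from the source.

```python
def awl_scores(pre_score, seed_total, factor=0.5, messages=9):
    """Return post-AWL scores for a run of identical messages.

    pre_score  -- score of each message before AWL is applied
    seed_total -- assumed total left by training: -100 after -W,
                  +100 after --add-to-blacklist (inferred, not confirmed)
    factor     -- auto_whitelist_factor
    messages   -- number of successive scans to simulate
    """
    total = float(seed_total)
    count = 1  # the training run is assumed to count as one entry
    results = []
    for _ in range(messages):
        mean = total / count
        # AWL pulls the score toward the sender's historical mean.
        results.append(pre_score + (mean - pre_score) * factor)
        # The history is assumed to be updated with the pre-AWL score.
        total += pre_score
        count += 1
    return results


if __name__ == "__main__":
    # Print the three-factor view for a ham scoring 30 after -W.
    columns = [awl_scores(30, -100, f) for f in (0.5, 0.75, 1.0)]
    for n, row in enumerate(zip(*columns), start=1):
        print(n, *(round(value, 3) for value in row))
```

Calling `awl_scores(0, 100, 0.5, 11)` gives the zero-scoring spam column under the same assumptions.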
Method: change the value of my local rule and then run:

spamassassin --add-to-blacklist <~/Mail/test.eml >/dev/null
for a in `seq 1 105`; do
  spamassassin -D auto-whitelist <~/Mail/test.eml 2>&1 |
    sed -re '/.*post.*score: /!d' -e "s// $a\t/"
done

(or swap --add-to-blacklist with -W)
