https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6155

--- Comment #162 from Warren Togami <[email protected]> 2009-11-16 21:28:44 UTC ---
fp-fn-statistics across the entire "rescore" logs.

Set 3 Before
===========
# SUMMARY for threshold 5.0:
# Correctly non-spam: 703647  99.90%
# Correctly spam:     2559525  98.28%
# False positives:       719  0.10%
# False negatives:     44795  1.72%
# TCR(l=50): 32.253638  SpamRecall: 98.280%  SpamPrec: 99.972%

Set 3 Raw Rescoring from Comment #146
==================================
# SUMMARY for threshold 5.0:
# Correctly non-spam: 703520  99.88%
# Correctly spam:     2548134  97.84%
# False positives:       846  0.12%
# False negatives:     56186  2.16%
# TCR(l=50): 26.443555  SpamRecall: 97.843%  SpamPrec: 99.967%

Doesn't look like an improvement.

Set 3 + Rescore + Reductions
==========================
# SUMMARY for threshold 5.0:
# Correctly non-spam: 704002  99.95%
# Correctly spam:     2558896  98.26%
# False positives:       364  0.05%
# False negatives:     45424  1.74%
# TCR(l=50): 40.932981  SpamRecall: 98.256%  SpamPrec: 99.986%

Looks like a statistically insignificant improvement over the old scores.  I
only hope our corpora were sufficiently varied.
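
For anyone wanting to re-check the summary lines above, here is a minimal
sketch of how the figures can be derived from the four raw counts.  It assumes
TCR(l=50) is computed as nspam / (lambda * FP + FN), which is consistent with
the numbers in these logs; the function name and the dictionary of runs are
illustrative only and are not part of the fp-fn-statistics script.

```python
def summarize(ham_ok, spam_ok, fp, fn, lam=50):
    """Derive summary metrics from raw counts (hypothetical helper).

    Assumes TCR = n_spam / (lam * fp + fn), which matches the logged values.
    """
    n_spam = spam_ok + fn                 # all spam messages in the corpus
    n_ham = ham_ok + fp                   # all non-spam messages in the corpus
    tcr = n_spam / (lam * fp + fn)        # total cost ratio, FPs weighted by lam
    recall = spam_ok / n_spam             # SpamRecall
    precision = spam_ok / (spam_ok + fp)  # SpamPrec
    fp_rate = fp / n_ham                  # false-positive rate over ham
    return tcr, recall, precision, fp_rate


# Counts copied from the three summaries in this comment:
# (correctly non-spam, correctly spam, false positives, false negatives)
runs = {
    "Set 3 Before": (703647, 2559525, 719, 44795),
    "Set 3 Raw Rescoring": (703520, 2548134, 846, 56186),
    "Set 3 + Rescore + Reductions": (704002, 2558896, 364, 45424),
}

for name, counts in runs.items():
    tcr, recall, precision, fp_rate = summarize(*counts)
    print(f"{name}: TCR(l=50)={tcr:.6f}  SpamRecall={recall:.3%}  "
          f"SpamPrec={precision:.3%}  FP rate={fp_rate:.2%}")
```

With lambda at 50, each false positive costs as much as fifty false negatives,
so the drop in false positives in the last run is what moves TCR the most.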

Rules Made Informational
======================
TVD_RCVD_SPACE_BRACKET
MISSING_MIME_HB_SEP
FUZZY_CPILL
X_IP (Bug #5920 appears not to be fixed as claimed)
FRT_SOMA2
CTYPE_001C_B
MIME_BASE64_BLANKS
WEIRD_QUOTING
SPF_HELO_FAIL
HTML_IMAGE_RATIO_06
HTML_IMAGE_RATIO_08

Other Changes
============
* EXTRA_MPART_TYPE was left at 1.0 because, while it does relatively poorly in
the weekly masscheck, it did far better in the rescore masscheck.
* I am increasing the scores of PSBL *after* the above fp-fn-statistics run
because the old logs do not reflect its current safety level.

I am committing these changes now.  I suspect the key to these reductions is
getting rid of the rules that would not have passed our ruleqa auto-promotion
criteria.  There might be additional tweaks to make.  Please comment here.
