https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6155

--- Comment #90 from Mark Martinec <[email protected]> 2009-10-09 06:49:27 PDT ---
To assess the quality and repeatability of the results, here are the summaries
for all four score sets. Each pair consists of a normal (scores) run on 90% of
the log entries and a test run on the remaining 10%.

The most interesting figures are the FP and FN percentages, e.g. 0.028% and
0.961% in this excerpt (taken from the score set 3 scores run):
  # False positives:     65  0.011%  (0.028% of nonspam,  10580 weighted)
  # False negatives:   3411  0.578%  (0.961% of spam,  12054 weighted)
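
The percentages in parentheses are relative to the per-class corpus (nonspam
or spam), not to the total number of messages. A minimal Python sketch of that
arithmetic, using the score set 3 counts reported in this comment; the function
name is illustrative and not part of the SpamAssassin tooling:

```python
def rate_percent(errors, correct):
    """Errors as a percentage of one class (errors + correctly classified)."""
    return 100.0 * errors / (errors + correct)

# Counts from the score set 3 "scores" run in this comment.
fp_of_nonspam = rate_percent(65, 234630)   # false positives vs. all nonspam
fn_of_spam = rate_percent(3411, 351590)    # false negatives vs. all spam

print(f"{fp_of_nonspam:.3f}% of nonspam")
print(f"{fn_of_spam:.3f}% of spam")
```

Rounded to three decimals, these reproduce the 0.028% and 0.961% figures quoted
above.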


==========================================
gen-set0-5-5.0-25000-ga
SCORESET 0: (no net, no bayes)

test (10%):
# SUMMARY for threshold 5.0:
# Correctly non-spam:  45335  98.03%
# Correctly spam:      39320  81.61%
# False positives:       913  1.97%
# False negatives:      8860  18.39%
# TCR(l=50): 0.883875  SpamRecall: 81.611%  SpamPrec: 97.731%

scores (90%):
# SUMMARY for threshold 5.0:
# Correctly non-spam: 365397  48.193%  (98.401% of non-spam corpus)
# Correctly spam:     314466  41.476%  (81.286% of spam corpus)
# False positives:      5936  0.783%  (1.599% of nonspam, 173347 weighted)
# False negatives:     72396  9.548%  (18.714% of spam, 226867 weighted)
# Average score for spam:  10.0    nonspam: 1.4
# Average for false-pos:   5.6  false-neg: 3.1
# TOTAL:              758195  100.00%
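
The TCR(l=50), SpamRecall and SpamPrec figures on the test run appear
consistent with the usual definitions: total cost ratio = nspam / (l * FP + FN),
recall = TP / (TP + FN), precision = TP / (TP + FP). That is inferred from the
numbers in this comment, not taken from the tool's source, so treat the sketch
below as an assumption. Function names are hypothetical.

```python
def tcr(correct_spam, false_pos, false_neg, lam=50):
    """Total cost ratio, counting each false positive as `lam` false negatives."""
    n_spam = correct_spam + false_neg
    return n_spam / (lam * false_pos + false_neg)

def spam_recall(correct_spam, false_neg):
    """Percentage of all spam that was caught."""
    return 100.0 * correct_spam / (correct_spam + false_neg)

def spam_precision(correct_spam, false_pos):
    """Percentage of messages flagged as spam that really were spam."""
    return 100.0 * correct_spam / (correct_spam + false_pos)

# Score set 0, test run (10%): counts copied from the summary above.
tp, fp, fn = 39320, 913, 8860

set0_tcr = tcr(tp, fp, fn)                 # log reports 0.883875
set0_recall = spam_recall(tp, fn)          # log reports 81.611%
set0_precision = spam_precision(tp, fp)    # log reports 97.731%

print(f"TCR(l=50): {set0_tcr:.6f}  "
      f"SpamRecall: {set0_recall:.3f}%  SpamPrec: {set0_precision:.3f}%")
```

A TCR below 1 means that, at this weighting of false positives, the filter
costs more than running with no filter at all, which is what score set 0 shows
here.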

==========================================
gen-set1-10-5.0-30000-ga
SCORESET 1: (net, no bayes)

test:
# SUMMARY for threshold 5.0:
# Correctly non-spam:  46183  99.86%
# Correctly spam:      46648  96.82%
# False positives:        65  0.14%
# False negatives:      1532  3.18%
# TCR(l=50): 10.075282  SpamRecall: 96.820%  SpamPrec: 99.861%

scores:
# SUMMARY for threshold 5.0:
# Correctly non-spam: 370804  48.906%  (99.858% of non-spam corpus)
# Correctly spam:     374579  49.404%  (96.825% of spam corpus)
# False positives:       529  0.070%  (0.142% of nonspam,  31804 weighted)
# False negatives:     12283  1.620%  (3.175% of spam,  39385 weighted)
# Average score for spam:  17.4    nonspam: 0.4
# Average for false-pos:   5.8  false-neg: 3.2
# TOTAL:              758195  100.00%


==========================================
gen-set2-10-5.0-30000-ga
SCORESET 2: (no net, bayes)

test:
# SUMMARY for threshold 5.0:
# Correctly non-spam:  29308  99.78%
# Correctly spam:      42344  95.69%
# False positives:        64  0.22%
# False negatives:      1907  4.31%
# TCR(l=50): 8.664774  SpamRecall: 95.690%  SpamPrec: 99.849%

scores:
# SUMMARY for threshold 5.0:
# Correctly non-spam: 234375  39.745%  (99.864% of non-spam corpus)
# Correctly spam:     339736  57.612%  (95.700% of spam corpus)
# False positives:       320  0.054%  (0.136% of nonspam,  26164 weighted)
# False negatives:     15265  2.589%  (4.300% of spam,  58794 weighted)
# Average score for spam:  10.4    nonspam: 0.6
# Average for false-pos:   5.4  false-neg: 3.9
# TOTAL:              589696  100.00%


==========================================
gen-set3-20-5.0-20000-ga
SCORESET 3: (net, bayes)

test:
# SUMMARY for threshold 5.0:
# Correctly non-spam:  29342  99.90%
# Correctly spam:      43843  99.08%
# False positives:        30  0.10%
# False negatives:       408  0.92%
# TCR(l=50): 23.192348  SpamRecall: 99.078%  SpamPrec: 99.932%

scores:
# SUMMARY for threshold 5.0:
# Correctly non-spam: 234630  39.788%  (99.972% of non-spam corpus)
# Correctly spam:     351590  59.622%  (99.039% of spam corpus)
# False positives:        65  0.011%  (0.028% of nonspam,  10580 weighted)
# False negatives:      3411  0.578%  (0.961% of spam,  12054 weighted)
# Average score for spam:  18.5    nonspam: -0.1
# Average for false-pos:   5.4  false-neg: 3.5
# TOTAL:              589696  100.00%
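
To compare the 10% test runs against the 90% scores runs side by side, the
per-class FP and FN rates can be recomputed from the raw counts. This is only a
sketch: the counts are copied from the summaries above, and the dictionary
layout and function name are made up for illustration.

```python
# (correct_nonspam, correct_spam, false_pos, false_neg), copied from this comment.
RUNS = {
    "set0": {"test": (45335, 39320, 913, 8860),
             "scores": (365397, 314466, 5936, 72396)},
    "set1": {"test": (46183, 46648, 65, 1532),
             "scores": (370804, 374579, 529, 12283)},
    "set2": {"test": (29308, 42344, 64, 1907),
             "scores": (234375, 339736, 320, 15265)},
    "set3": {"test": (29342, 43843, 30, 408),
             "scores": (234630, 351590, 65, 3411)},
}

def rates(correct_nonspam, correct_spam, false_pos, false_neg):
    """Return (FP % of nonspam, FN % of spam) for one run."""
    fp_rate = 100.0 * false_pos / (correct_nonspam + false_pos)
    fn_rate = 100.0 * false_neg / (correct_spam + false_neg)
    return fp_rate, fn_rate

for name, pair in RUNS.items():
    fp_test, fn_test = rates(*pair["test"])
    fp_scores, fn_scores = rates(*pair["scores"])
    print(f"{name}: FP test {fp_test:.3f}% vs scores {fp_scores:.3f}%, "
          f"FN test {fn_test:.3f}% vs scores {fn_scores:.3f}%")
```

The closer the test and scores rates are for a given score set, the more
repeatable the generated scores are on unseen log entries.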
