https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6155
--- Comment #7 from Justin Mason <[email protected]> 2009-08-17 15:21:17 PST ---

ok, I think I've ironed out a couple of issues. Let's see what people think of these sample scores:

http://taint.org/x/2009/gen-set0-2.0-5.0-500-ga_scores
http://taint.org/x/2009/gen-set1-5.0-5.0-500-ga_scores
http://taint.org/x/2009/gen-set2-2.0-5.0-500-ga_scores
http://taint.org/x/2009/gen-set3-5.0-5.0-500-ga_scores

here are the test results against the "test" fold for each scoreset:

gen-set0-2.0-5.0-500-ga/test
Reading scores from "tmprules"...
Reading per-message hit stat logs and scores...
# SUMMARY for threshold 5.0:
# Correctly non-spam:  26453  99.07%
# Correctly spam:      83369  81.53%
# False positives:       249   0.93%
# False negatives:     18882  18.47%
# TCR(l=50): 3.263469  SpamRecall: 81.534%  SpamPrec: 99.702%

gen-set1-5.0-5.0-500-ga/test
Reading scores from "tmprules"...
Reading per-message hit stat logs and scores...
# SUMMARY for threshold 5.0:
# Correctly non-spam:  26646  99.79%
# Correctly spam:     100943  98.72%
# False positives:        56   0.21%
# False negatives:      1308   1.28%
# TCR(l=50): 24.890701  SpamRecall: 98.721%  SpamPrec: 99.945%

gen-set2-2.0-5.0-500-ga/test
Reading scores from "tmprules"...
Reading per-message hit stat logs and scores...
# SUMMARY for threshold 5.0:
# Correctly non-spam:  26485  99.19%
# Correctly spam:      84218  82.36%
# False positives:       217   0.81%
# False negatives:     18033  17.64%
# TCR(l=50): 3.540179  SpamRecall: 82.364%  SpamPrec: 99.743%

gen-set3-5.0-5.0-500-ga/test
Reading scores from "tmprules"...
Reading per-message hit stat logs and scores...
# SUMMARY for threshold 5.0:
# Correctly non-spam:  26662  99.85%
# Correctly spam:     100964  98.74%
# False positives:        40   0.15%
# False negatives:      1287   1.26%
# TCR(l=50): 31.107697  SpamRecall: 98.741%  SpamPrec: 99.960%

Yes, set0 and set2 are terrible. This is pretty much what happened last time, too; our ruleset is pretty crappy nowadays without network rules active. But the net rule results are very good!
However I think I need to look into the local rule GA runs if possible.

Bug 5270 is the 3.2.0 rescoring run, for reference.

Spamhaus will be happy to see a much improved score for RCVD_IN_PBL ;)

gen-set1-5.0-5.0-500-ga_scores:score RCVD_IN_PBL 2.596
gen-set3-5.0-5.0-500-ga_scores:score RCVD_IN_PBL 2.411
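
For anyone who wants to sanity-check the summary lines above, here is a minimal sketch that recomputes TCR, SpamRecall and SpamPrec from the four raw counts. It assumes TCR(l=50) is the total cost ratio n_spam / (l * FP + FN); that formula is inferred from the fact that it reproduces the quoted figures, not taken from the SpamAssassin source. The `FoldResult` class and `RESULTS` table are hypothetical names made up for this illustration.

```python
from dataclasses import dataclass


@dataclass
class FoldResult:
    """Raw counts from one '# SUMMARY for threshold 5.0' block (hypothetical helper)."""
    ham_ok: int      # "Correctly non-spam"
    spam_ok: int     # "Correctly spam"
    false_pos: int   # ham wrongly flagged as spam
    false_neg: int   # spam that slipped through

    def tcr(self, lam: float = 50.0) -> float:
        # Assumed definition: each false positive costs `lam` times a false negative.
        n_spam = self.spam_ok + self.false_neg
        return n_spam / (lam * self.false_pos + self.false_neg)

    def spam_recall(self) -> float:
        # Fraction of all spam that was caught.
        return self.spam_ok / (self.spam_ok + self.false_neg)

    def spam_precision(self) -> float:
        # Fraction of messages flagged as spam that really were spam.
        return self.spam_ok / (self.spam_ok + self.false_pos)


# Counts copied from the test-fold results quoted in this comment.
RESULTS = {
    "set0": FoldResult(26453, 83369, 249, 18882),
    "set1": FoldResult(26646, 100943, 56, 1308),
    "set2": FoldResult(26485, 84218, 217, 18033),
    "set3": FoldResult(26662, 100964, 40, 1287),
}

if __name__ == "__main__":
    for name, r in RESULTS.items():
        print(
            f"{name}: TCR(l=50)={r.tcr():.6f} "
            f"SpamRecall={r.spam_recall():.3%} "
            f"SpamPrec={r.spam_precision():.3%}"
        )
```

With l=50 the false-positive term dominates the denominator: in set0 the 249 false positives contribute 12450 cost units against 18882 from false negatives, whereas in set3 the 40 false positives contribute 2000 against 1287. That is why the net-enabled sets score so much higher on TCR than their recall gain alone would suggest.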
