https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6247
--- Comment #54 from Justin Mason <[email protected]> 2009-12-17 09:38:24 UTC ---
Thanks for measuring this, Warren.

(In reply to comment #51)
> rescore masscheck logs with mcsnapshot + today's trunk Scores
> =======================================================
> # SUMMARY for threshold 5.0:
> # Correctly non-spam: 703979  99.95%
> # Correctly spam:     2562432  98.39%
> # False positives:    387  0.05%
> # False negatives:    41888  1.61%
> # TCR(l=50): 42.527842  SpamRecall: 98.392%  SpamPrec: 99.985%

vs.

> rescore masscheck logs with mcsnapshot + today's trunk Scores
> HABEAS, BSP and SSC and DNSWL Disabled
> ==========================
> # SUMMARY for threshold 5.0:
> # Correctly non-spam: 703899  99.93%
> # Correctly spam:     2565063  98.49%
> # False positives:    467  0.07%
> # False negatives:    39257  1.51%
> # TCR(l=50): 41.597904  SpamRecall: 98.493%  SpamPrec: 99.982%

So that means that

  467-387 --> 80
  80/(703899+467) --> 0.000113577316338381

=> 0.01136% of hams were rescued from being FPs by the DNSWL rules.

  41888-39257 --> 2631
  2631/(2565063+39257) --> 0.00101024451680285

=> 0.101% of spams were, conversely, allowed through by them.

Good to know, and good to get an idea of the problem.

fwiw, I think it's better to use the rescore data for this measurement --
more contributors, more varied logs, and (hopefully) the data will have
received more hand-checking before submission.

By the way, I don't know if it's safe to say whether or not this is
"statistically significant". We don't know what the null hypothesis is in
this case, so we can't really use that terminology.

--j.
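
For anyone who wants to re-run the arithmetic above, here is a minimal Python sketch. The counts are copied from the two quoted summaries. The `Summary` class and the variable names are made up for illustration. The `tcr` method uses the usual total-cost-ratio definition, nspam / (lambda * FP + FN); that this is what the masscheck tooling computes is an assumption, although it does reproduce both quoted TCR values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Summary:
    """Counts from one '# SUMMARY for threshold 5.0' block (illustrative type)."""
    ham_ok: int    # Correctly non-spam
    spam_ok: int   # Correctly spam
    fp: int        # False positives (ham flagged as spam)
    fn: int        # False negatives (spam let through)

    @property
    def total_ham(self) -> int:
        return self.ham_ok + self.fp

    @property
    def total_spam(self) -> int:
        return self.spam_ok + self.fn

    def tcr(self, lam: int = 50) -> float:
        # Assumed formula: total cost ratio with FP weight lam.
        return self.total_spam / (lam * self.fp + self.fn)


# Run 1: trunk scores with the whitelist rules enabled.
enabled = Summary(ham_ok=703979, spam_ok=2562432, fp=387, fn=41888)
# Run 2: HABEAS, BSP, SSC and DNSWL disabled.
disabled = Summary(ham_ok=703899, spam_ok=2565063, fp=467, fn=39257)

# Hams that stop being FPs when the rules are enabled.
hams_rescued = disabled.fp - enabled.fp
ham_rate = hams_rescued / enabled.total_ham

# Spams that become FNs when the rules are enabled.
spams_let_through = enabled.fn - disabled.fn
spam_rate = spams_let_through / enabled.total_spam

print(f"hams rescued:      {hams_rescued} ({ham_rate:.5%} of all ham)")
print(f"spams let through: {spams_let_through} ({spam_rate:.3%} of all spam)")
print(f"TCR enabled/disabled: {enabled.tcr():.6f} / {disabled.tcr():.6f}")
```

Both runs cover the same corpus (same ham and spam totals), so it does not matter which run's totals are used as the denominator.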
