Adam Katz wrote:
> Bowie Bailey wrote:
>
>> Since the sought rules have been updating for a while now, I took a
>> look at my stats to see how they were doing. They used to be one
>> of my most useful rules, but recently, they don't seem to be doing
>> so good.
>>
>> Here are the stats for the last month:
>>
>
> That looks like the sare stats script (modified to show all rules as
> evidenced by rank 261). It doesn't account for FPs or FNs. I
> reformatted your output so it wraps well for email.
>

Exactly. I told it to show me the top 400 so I could see the stats for
all the sought rules. Thanks for the reformat. Call me lazy... :)

>> TOP SPAM RULES FIRED
>> ------------------------------------------------------------------
>> RANK  RULE NAME          COUNT  %OFRULES  %OFMAIL  %OFSPAM  %OFHAM
>> ------------------------------------------------------------------
>>  111  JM_SOUGHT_FRAUD_3    112      0.06     0.36     0.97    0.01
>>  154  JM_SOUGHT_2           53      0.03     0.17     0.46    0.16
>>  214  JM_SOUGHT_3           31      0.02     0.10     0.27    0.51
>>  253  JM_SOUGHT_1           21      0.01     0.07     0.18    0.01
>>  261  JM_SOUGHT_FRAUD_2     19      0.01     0.06     0.17    0.01
>> ------------------------------------------------------------------
>>
>> TOP HAM RULES FIRED
>> ------------------------------------------------------------------
>> RANK  RULE NAME          COUNT  %OFRULES  %OFMAIL  %OFSPAM  %OFHAM
>> ------------------------------------------------------------------
>>   85  JM_SOUGHT_3           99      0.08     0.32     0.27    0.51
>>  161  JM_SOUGHT_2           30      0.03     0.10     0.46    0.16
>>  351  JM_SOUGHT_FRAUD_3      2      0.00     0.01     0.97    0.01
>>  365  JM_SOUGHT_FRAUD_2      2      0.00     0.01     0.17    0.01
>>  378  JM_SOUGHT_1            1      0.00     0.00     0.18    0.01
>> ------------------------------------------------------------------
>>
>
> That is quite different from our masscheck stats. Today's results at
> http://ruleqa.spamassassin.org/20100201/%2FJM_SOUGHT look like this:
>
>   SPAM%    HAM%    S/O  RANK  SCORE  NAME
>  9.8564  0.0042  1.000  0.94   0.01  T_JM_SOUGHT_3
>  8.1587  0.0068  0.999  0.93   0.01  T_JM_SOUGHT_2
> 11.6464  0.0289  0.998  0.89   0.01  T_JM_SOUGHT_1
>       0       0  0.500  0.48   0.00  JM_SOUGHT_FRAUD_1
>       0       0  0.500  0.48   0.00  JM_SOUGHT_FRAUD_2
>       0       0  0.500  0.48   0.00  JM_SOUGHT_FRAUD_3
>
>
> Here are my own numbers, as observed by a custom script which
> recalculates results based on re-scoring specific rules. "Rejected"
> requires a score of 8.0 and "flagged" requires 5.0. (It only examines
> three rules at a time, and we got 33 messages between my runs.)
>
> JM_SOUGHT_1 ( 0.3% of 34831 total) with score-bump of -4:
>     124 rejected
>       1 flagged, with 0 (0%) that would have been rejected
>       1 passed, with -1 (-0.0%) that would have been flagged
> JM_SOUGHT_2 ( 0.2% of 34831 total) with score-bump of -4:
>      47 rejected
>       8 flagged, with -2 (-0.1%) that would have been rejected
>      24 passed, with -8 (-0.0%) that would have been flagged
> JM_SOUGHT_3 ( 0.5% of 34831 total) with score-bump of -4:
>     121 rejected
>      10 flagged, with -3 (-0.1%) that would have been rejected
>      60 passed, with -10 (-0.0%) that would have been flagged
> JM_SOUGHT_FRAUD_1 ( 0.0% of 34864 total) with score-bump of -3:
>      34 rejected
>       0 flagged, with 0 (0%) that would have been rejected
>       0 passed, with 0 (0%) that would have been flagged
> JM_SOUGHT_FRAUD_2 ( 0.5% of 34864 total) with score-bump of -3:
>     203 rejected
>       0 flagged, with 0 (0%) that would have been rejected
>       0 passed, with 0 (0%) that would have been flagged
> JM_SOUGHT_FRAUD_3 ( 1.3% of 34864 total) with score-bump of -3:
>     486 rejected
>       0 flagged, with -4 (-0.2%) that would have been rejected
>       1 passed, with 0 (0%) that would have been flagged
>
> My script was mostly written for adding points rather than subtracting
> them, so the notation is a little less intuitive. For example, rule 2
> moved two mails from flag to reject and caused eight mails to get flagged.
>
> Recall that unlike the masscheck (which is hand-verified), log parsers
> like the sare script and my own script have no knowledge of FPs or
> FNs. I bet most if not all of the 86 messages that the SOUGHT rules
> noticed but didn't push up to the 5.0 mark were probably FNs.
>
> Of course, the reason I have a flag threshold and a reject threshold
> is so that I can still deliver low-scoring FPs. My users get them
> flagged as spam, with SA's spaminess score in the subject.
> That means instead of risking a loss of 86 messages, I only risked
> losing 9, and thanks to the smtp-time reject nature of my
> implementation, the senders got notices of the delivery failure. I
> have not yet had a complaint of these rejections based on SOUGHT
> rules. (The complaints are rare enough and usually related to massive
> misconfigurations on the sending relay.)
>

I understand the problem with the stats program and FP/FN, but the last
time I looked at the stats for sought (which was admittedly quite a
while ago), a couple of the rules were showing in my top 20 spam rules.
Now I have to go all the way down to 111 to find the first one.

--
Bowie
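
For anyone who wants to check the arithmetic behind the "86" and "9" in Adam's numbers, here is a minimal Python sketch. It is not Adam's script (that was not posted); the names `RuleResult`, `classify` and `tally` are made up for illustration, and treating the 5.0 and 8.0 thresholds as inclusive is an assumption. The counts are copied from the quoted output, with the negative "would have been" figures entered as absolute values.

```python
from typing import NamedTuple

# Thresholds quoted in the thread; treating them as inclusive is an assumption.
FLAG_THRESHOLD = 5.0
REJECT_THRESHOLD = 8.0


class RuleResult(NamedTuple):
    """Hypothetical record for one rule's block in the quoted output."""
    rule: str
    rejected: int
    flagged: int
    passed: int
    to_reject: int  # |"would have been rejected"| figure on the flagged line
    to_flag: int    # |"would have been flagged"| figure on the passed line


def classify(score: float) -> str:
    """Map a message score to the three outcomes described in the thread."""
    if score >= REJECT_THRESHOLD:
        return "rejected"
    if score >= FLAG_THRESHOLD:
        return "flagged"
    return "passed"


# Counts copied from Adam's quoted re-scoring output.
RESULTS = [
    RuleResult("JM_SOUGHT_1", 124, 1, 1, 0, 1),
    RuleResult("JM_SOUGHT_2", 47, 8, 24, 2, 8),
    RuleResult("JM_SOUGHT_3", 121, 10, 60, 3, 10),
    RuleResult("JM_SOUGHT_FRAUD_1", 34, 0, 0, 0, 0),
    RuleResult("JM_SOUGHT_FRAUD_2", 203, 0, 0, 0, 0),
    RuleResult("JM_SOUGHT_FRAUD_3", 486, 0, 1, 4, 0),
]


def tally(results):
    """Sum the per-rule counts across all SOUGHT rules."""
    return {
        "passed": sum(r.passed for r in results),
        "to_reject": sum(r.to_reject for r in results),
    }


totals = tally(RESULTS)
print("hit a SOUGHT rule but stayed under 5.0:", totals["passed"])
print("crossed the reject threshold because of a SOUGHT rule:", totals["to_reject"])
```

The two totals come out to 86 and 9, matching the figures in Adam's message. Like the log parsers discussed above, this only counts outcomes and says nothing about which of those messages were actually FPs or FNs.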