Henrik Krohns wrote:
> I only have to look at my mail logs from today, and I see dozens of legitimate
> RDNS_NONE hits originating from real people. I'm happy to greylist it at
> MTA, but not block directly.
>
> As said, it's a site policy. Some people use high FP BLs also happily. Many
> people might not report FPs for one reason or another, but it doesn't mean
> they don't exist. I like to be on the safe side.

The question is what defines "safe", and why is the score pinned to 0.1?
Isn't the whole point of the genetic algorithm to determine what "safe"
value to assign it? Who's to say that 0.2 isn't safe? (I suppose there's
no way to *cap* a GA score rather than just pin it?)

SA is a system of probabilities. We don't define ham as having 0 or fewer
points.

Again, I cite the masscheck results. Is 1.7% of the ham corpus bad? What
about MIME_HTML_ONLY's 3.7% ham, or RCVD_IN_SPAMCOP_BL's 1.3% ham, or
SUBJ_ALL_CAPS's 1.8%, ...? All of those have GA-generated scores over 0.1.

What about the fact that RDNS_NONE only overlaps 0.8528% of the corpus ham
scoring 4+? (Like RDNS_NONE, MIME_HTML_ONLY's 3.7% ham overlap is mostly
low-scoring ham, with only 1.5625% matching corpus ham at 4+.)

Even the latest scoring proposal here has this line:

    score HTML_MESSAGE 2.199 0.838 1.473 0.511

despite HTML_MESSAGE hitting 40.9% of the ham corpus.

Here are some rules that hit a larger portion of the ham corpus than of the
spam corpus despite having positive scores in bugzilla attachment 4553 (the
latest scoring proposal) at
https://issues.apache.org/SpamAssassin/attachment.cgi?id=4553

    MIME_QP_LONG_LINE
    FREEMAIL_FROM
    TVD_SPACE_RATIO
    EXTRA_MPART_TYPE
    (among others)

These were found by applying this search to the front page at
http://ruleqa.spamassassin.org (using a Firefox regexp search add-on):

    /(\s+[\d.]+){2}\s+[1-9][\d.]+(\s+[\d.]+){3}\s+(?!T_)\w/

In shell (guess whose Bourne scripting is better than his Perl?):

    elinks -dump http://ruleqa.spamassassin.org/ |
      perl -ne 'print if /(\s+[\d.]+){2}\s+[1-9][\d.]+(\s+[\d.]+){3}\s+(?!T_)\w|\sMSECS/' |
      tee rules.txt
    for rule in `perl -ne 'if (/.*\s([A-Z]+\w*_\w*)/) { s//$1/; print; }' <rules.txt`; do
      grep "^[^#]* $rule " /tmp/50_scores_newest.cf
    done
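
For anyone who would rather compare the two percentages directly than pattern-match on digits, here is a rough sketch in awk. It assumes the ruleqa listing is whitespace-separated with columns MSECS, SPAM%, HAM%, S/O, RANK, SCORE, NAME in that order; that layout is my assumption, as are the function name `ham_heavy_rules` and the sample rows, which are made up and are not masscheck figures.

```shell
#!/bin/sh
# Hypothetical helper: print rule names whose HAM% exceeds SPAM%.
# Assumed column order: MSECS SPAM% HAM% S/O RANK SCORE NAME
ham_heavy_rules() {
    # Skip header and other non-numeric rows, skip T_ test rules,
    # then compare HAM% (field 3) against SPAM% (field 2) numerically.
    awk 'NF >= 7 && $2 ~ /^[0-9.]+$/ && $3 ~ /^[0-9.]+$/ &&
         $7 !~ /^T_/ && ($3 + 0) > ($2 + 0) { print $7 }'
}

# Made-up rows for illustration only.
sample='  MSECS   SPAM%    HAM%    S/O   RANK  SCORE  NAME
      0  10.0000  40.0000  0.200  0.30   0.50  EXAMPLE_HAM_HEAVY
      0  50.0000   1.5000  0.971  0.80   1.20  EXAMPLE_SPAM_HEAVY
      0   2.0000   9.0000  0.182  0.25   0.01  T_EXAMPLE_TEST_RULE'

printf '%s\n' "$sample" | ham_heavy_rules
```

Against the live page this would presumably be used as `elinks -dump http://ruleqa.spamassassin.org/ | ham_heavy_rules`, with the field numbers adjusted if the dump lays out its columns differently.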
