On 04/11, Warren Togami Jr. wrote:
>>>> Before that rescoring, we may want to have a serious
>>>> discussion about reducing score pile-up in the case where
>>>> multiple production DNSBL's all hit at the same time. Adam
>>>> Katz' approach is one possibility, albeit confusing to users
>>>> because users see subtractions in the score reports. There may
>>>> be other better approaches to this.

On 04/12/2011 12:59 PM, [email protected] wrote:
>>> What was Adam Katz's approach? Not using black or white lists
>>> just because they overlap is unfortunate. So is the reduction of
>>> generated scores that overlap probably causes.

I have two proposals, both of which have been mentioned here in the
past. Warren was referring to the first:

1. One meta to rule them all

This is very simple. All it should require is removing the
'nopublish' flag from PUBLISHED_DNSBLS (and probably renaming it to
something like "RCVD_IN_DNSBL" or "DNSBL" to avoid confusion). This
should result in a large score for the meta and therefore reduced
scores for the individual rules. However, as the GA isn't always
rational and could miss the overlap and create dangerous scores, we
might have to manually score the meta (and/or the lookups).

This was mentioned (not for the first time) at
http://old.nabble.com/I-want-MORE-SPAM---MORE-SPAM-tt23599323.html#a23602101

It should be noted that KHOP_DNSBL_ADJ and KHOP_DNSBL_BUMP (from my
khop-bl sa-update channel) implement this as a third-party hack. The
former detects when the combined score has been pushed too high and
reduces it, while the latter handles the case where the score isn't
high enough. Such a hack is very messy and, happily, completely
unnecessary in upstream given a rule like PUBLISHED_DNSBLS.

Also note that this process is replicated in KHOP_URIBL_ADJ and there
is a similar trick for whitelists in KHOP_RCVD_TRUST.

Since I've kept these rules out of subversion, you'll have to view the
channel itself. I have a copy of the relevant rule file at:
http://khopis.com/sa/khop-bl/khop-bl.cf

On 04/14/2011 06:26 AM, Greg Troxel wrote:
>> I suggest adding a metarule to combine two blacklists or two
>> whitelists, and see what the existing score-generation procedure
>> gives it. If my idea is confused, then most such metarules might
>> have near-zero scores. If one ends up with A=2 B=4 and A_and_B
>> getting -1, that validates the concept.
>>
>> This is sort of like KHOP_DNSBL_BUMP, but letting the GA set the
>> value.

Yes, exactly my intent. I couldn't do that on the channel without
re-scoring upstream rules, which I really didn't want to do.

On 04/14/2011 07:58 AM, John Hardin wrote:
> I'd first verify the assumption that the score generator will
> generate negative scores. I don't know that it does not, but there
> are only 56 rules with negative scores and almost all look manually
> assigned. I suspect that automatic generation of negative scores is
> intentionally suppressed to avoid inadvertently opening up "magical
> bypass" rules for spammers.

We shouldn't need negative scores. With the adjuster in the picture,
it should get the big score and the RCVD_IN_* dependencies will have
reduced scores.

...

BIG POTENTIAL HURDLE: users who have tweaked the existing rules will
have a very high FP risk. The best solution is therefore to rename
everything (yuck!).

Regarding desirable negative rules ... tflags nice is a really bad
idea since this isn't a nice rule. KHOP_DNSBL_ADJ is (probably) a
unique type of case in which a spam rule needs a negative score.

>> Perhaps Adam can explain where those scores come from - I certainly
>> think they are a good manual guess, but it would be interesting if
>> it's more than that.

The multipliers in KHOP_DNSBL_ADJ are generated from the scores of the
rules they modify so as to approximate the total score coming from the
rules in question. I don't keep them in perfect sync (it doesn't
matter too much unless the scores change dramatically). As for the
score of KHOP_DNSBL_ADJ itself, it came from the calculated average of
the messages it was hitting (some math is present in the comments),
with the aspiration of reducing the total DNSBL score below five.

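For anyone who wants the adjuster arithmetic spelled out, here is a
minimal Python sketch of the idea. It is not the contents of
khop-bl.cf: the rule names, scores, multipliers, threshold, and the
adjuster score below are all made up for illustration. It only shows
the shape of the trick, where integer multipliers stand in for the
rule scores and a fixed negative score fires once the estimate gets
too high.

```python
# Hypothetical scores for three DNSBL rules (NOT real SpamAssassin values).
SCORES = {
    "RCVD_IN_EXAMPLE_A": 2.0,
    "RCVD_IN_EXAMPLE_B": 3.0,
    "RCVD_IN_EXAMPLE_C": 1.0,
}

# Integer multipliers chosen to approximate those scores, the way a
# meta rule would weight each lookup. Also hypothetical.
MULTIPLIERS = {
    "RCVD_IN_EXAMPLE_A": 2,
    "RCVD_IN_EXAMPLE_B": 3,
    "RCVD_IN_EXAMPLE_C": 1,
}

ADJ_THRESHOLD = 5  # assumed: estimate at which the adjuster fires
ADJ_SCORE = -1.5   # assumed: fixed negative score of the adjuster


def estimated_total(hits):
    """Approximate the combined DNSBL score using the multipliers."""
    return sum(MULTIPLIERS[name] for name in hits)


def adjuster_fires(hits):
    """True when the estimated DNSBL total has piled up too high."""
    return estimated_total(hits) >= ADJ_THRESHOLD


def final_dnsbl_score(hits):
    """Sum the real scores, then apply the adjuster if it fires."""
    total = sum(SCORES[name] for name in hits)
    if adjuster_fires(hits):
        total += ADJ_SCORE
    return total


all_three = ["RCVD_IN_EXAMPLE_A", "RCVD_IN_EXAMPLE_B", "RCVD_IN_EXAMPLE_C"]
print(final_dnsbl_score(all_three))              # pile-up, adjuster applied
print(final_dnsbl_score(["RCVD_IN_EXAMPLE_A"]))  # single hit, no adjustment
```

With these invented numbers, three simultaneous hits add up to 6.0
before the adjuster and land under five after it, while a single hit
is left alone.
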
KHOP_DNSBL_BUMP follows a similar philosophy: if a highly trustworthy
DNSBL is hit AND the combined DNSBL score isn't already too high, it's
safe to add a few points. Its two-point score comes from my own
judgment.

(That was long enough for one email. My second proposal, regarding a
new breed of short-circuiting that would prevent frivolous rule checks
including DNSBLs, will be sent in its own email.)
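
As a footnote on the KHOP_DNSBL_BUMP condition described above, a
rough Python sketch follows. Only the two-point bump comes from this
email. The rule names, their scores, which list counts as highly
trustworthy, and the "too high" limit are assumptions for the sake of
the example, not what the channel actually ships.

```python
# Hypothetical rule scores (NOT real SpamAssassin values).
SCORES = {
    "RCVD_IN_EXAMPLE_TRUSTED": 1.5,
    "RCVD_IN_EXAMPLE_OTHER": 2.5,
}

# Assumed set of lists considered trustworthy enough to justify a bump.
TRUSTED = {"RCVD_IN_EXAMPLE_TRUSTED"}

BUMP_SCORE = 2.0  # the two-point score mentioned in the email
TOO_HIGH = 3.0    # assumed limit on the combined DNSBL score


def bump(hits):
    """Return the extra points added when the bump condition holds."""
    total = sum(SCORES[name] for name in hits)
    trusted_hit = bool(TRUSTED & set(hits))
    # Bump only if a trusted list hit AND the total is not already high.
    if trusted_hit and total < TOO_HIGH:
        return BUMP_SCORE
    return 0.0


print(bump(["RCVD_IN_EXAMPLE_TRUSTED"]))
print(bump(["RCVD_IN_EXAMPLE_TRUSTED", "RCVD_IN_EXAMPLE_OTHER"]))
```

In this sketch a lone hit on the trusted list earns the bump, but the
same hit alongside another list does not, because the combined score
is already past the assumed limit.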
