[email protected] writes:

>> If we force "not listed in any" to zero, sort of like rules not hitting
>> is zero score, then for 2 BLs we have 3 rules: A, B and A+B. If A gets
>> 2 points and B 1 and they largely overlap, then it seems very likely
>> that A+B deserves 2.2ish rather than 3. If one accepts the "score the
>
> How about giving A+B 2, the greater of the values for A and B?

The problem is that this is an artificial choice that constrains the
score for "A+B" and "A" to be the same. If it turns out that A and B
are mostly independent, it might be that score(A+B) should be closer to
score(A) + score(B).

>> I suggest adding infrastructure to declare a set of k scoring rules as
>> non-independent, which has the effect of adding 2^k-k-1 joint-situation
>> rules that can then be assigned scores different from the sum of the
>> individual scores. For k=3, one would need 7 rules total, and thus 4
>> more (AB, AC, BC, ABC).
>
> If we had sufficient mass-check participants, I agree that would probably
> be optimal. But it looks like we're dealing with k=15, so you're talking
> about 32,752 more rules for 15 blacklists. And about as many more for
> whitelists. Exponents can be a bitch.

Agreed.

> So what do you think about adding the grouped-rule declaration, as you
> suggested, but instead of creating many more rules, when scores are being
> tallied for an email, only use the largest score hit out of any rule group?

I would suggest using the full method, but at first grouping only the
whitelists/blacklists that we think are having problems due to
overlap. One could do score-generation runs with various pairs in
groups and look at the answers. I don't know what the results are going
to be, but I suspect that seeing the results of a half-dozen groupings
would be very illuminating.

> Let those float in rescoring, the same way they're tallied, and the
> blacklist (and whitelist) tests should end up with larger scores, since
> they aren't forced to be lowered by overlap. I bet a couple of them would
> float over 5.

I suspect they wouldn't, since any amount of FP in the strongest rule
will pull the score down. But really I don't know.

I suggest adding a metarule to combine two blacklists or two whitelists,
and seeing what the existing score-generation procedure gives it. If my
idea is confused, then most such metarules might have near-zero scores.
If one ends up with A=2, B=4 and A_and_B getting -1, that validates the
concept. This is sort of like KHOP_DNSBL_BUMP, but letting the GA set
the value. Perhaps Adam can explain where those scores come from; I
certainly think they are a good manual guess, but it would be
interesting if it's more than that.
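
To make the arithmetic in this thread concrete, here is a rough Python
sketch of the three tallying schemes being discussed: plain sum,
largest-in-group, and sum plus joint-rule corrections. All function
names are made up for illustration and nothing here is SpamAssassin
code. It also assumes that a joint rule behaves like a metarule, i.e.
it fires whenever all of its members hit, which is only one possible
reading of "joint-situation rule".

```python
from itertools import combinations


def joint_rule_count(k):
    """Extra joint rules needed for k grouped rules: 2^k - k - 1."""
    return 2 ** k - k - 1


def joint_rules(names):
    """Enumerate every subset of size >= 2 (AB, AC, BC, ABC for k=3)."""
    return [frozenset(c)
            for r in range(2, len(names) + 1)
            for c in combinations(names, r)]


def tally_sum(hits, scores):
    """Current behaviour as described in the thread: add every hit."""
    return sum(scores[h] for h in hits)


def tally_group_max(hits, scores):
    """Alternative from the thread: keep only the largest hit in a group."""
    return max((scores[h] for h in hits), default=0.0)


def tally_with_joint(hits, scores, joint_scores):
    """Sum individual scores, then add each joint rule whose members all hit.

    Assumption: joint rules act like metarules (A_and_B fires when both
    A and B fire), so their scores are corrections to the plain sum.
    """
    hit_set = frozenset(hits)
    total = sum(scores[h] for h in hit_set)
    total += sum(s for members, s in joint_scores.items()
                 if members <= hit_set)
    return total


# Hypothetical scores taken from the example in the text above.
scores = {"A": 2.0, "B": 4.0}
joint = {frozenset({"A", "B"}): -1.0}

print(joint_rule_count(3), joint_rule_count(15))
print(tally_sum({"A", "B"}, scores),
      tally_group_max({"A", "B"}, scores),
      tally_with_joint({"A", "B"}, scores, joint))
```

With A=2, B=4 and A_and_B=-1, a message on both lists gets 6 under the
plain sum, 4 under largest-in-group, and 5 with the joint correction.
The count function gives 4 extra rules for k=3 and 32,752 for k=15,
matching the figures quoted above.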
