On Mon, Dec 12, 2011 at 11:26 AM, Warren Togami Jr. <[email protected]> wrote:
> * Was setting these scores manually your response to my concerns about
> "reuse" and the difficulties we will face in GA rescoring? I might
> even agree with this solution, although I believe it can be refined
> with further discussion.

Such high static scores don't leave much room for the GA rescoring to balance the other DNSBL scores. We would have better statistics-driven results if we let Mailspike's rules float, do the GA rescoring, manually adjust the results, then compare the before and after FP/FN ratios to be sure they are sane.

1) First we need to consider the "reuse" issue I mentioned earlier - decide whether we will reuse or not, depending on the status of the participating corpora.

2) Fix the _BL composite score to 0.01 during the GA balancing, allowing the _L's and _ZBI to balance along with the other rules. The results will NOT be linear, so the manual adjustment afterward is to make them linear. _BL remains as an informational rule in the final scoreset that helps us make an apples-to-apples comparison with other DNSBLs.

The whitelist situation is more complicated than I have time to fully write up now. I think we need to reconsider the entire whitelist situation and set a consistent policy across all whitelists. I suspect you misunderstood my stance on whitelists ... I am actually FOR whitelists, we just need to be careful about how they are scored. The current DNSWL controversy regarding how they punish misuse is a separate issue from how whitelists are selected and scored.
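
To make the "compare the before and after FP/FN ratios" step concrete, here is a rough Python sketch. It is only an illustration under assumptions, not how the rescoring tools actually compute this. The rule names (MSPIKE_BL, MSPIKE_L5, MSPIKE_ZBI, OTHER_DNSBL), the scores, and the five-message corpus are all made up, and the 5.0 threshold is simply SpamAssassin's default required score.

```python
from typing import Dict, List, Tuple

# Assumption: classify at SpamAssassin's default required score of 5.0.
THRESHOLD = 5.0

# One corpus entry: (is_spam, names of the rules that hit on the message).
Message = Tuple[bool, List[str]]


def fp_fn_rates(corpus: List[Message],
                scores: Dict[str, float],
                threshold: float = THRESHOLD) -> Tuple[float, float]:
    """Return (false-positive rate on ham, false-negative rate on spam)."""
    def total(hits: List[str]) -> float:
        # Rules missing from the scoreset contribute nothing.
        return sum(scores.get(rule, 0.0) for rule in hits)

    ham = [hits for is_spam, hits in corpus if not is_spam]
    spam = [hits for is_spam, hits in corpus if is_spam]
    fp = sum(1 for hits in ham if total(hits) >= threshold)
    fn = sum(1 for hits in spam if total(hits) < threshold)
    return (fp / len(ham) if ham else 0.0,
            fn / len(spam) if spam else 0.0)


# Hypothetical toy corpus; real numbers would come from mass-check logs.
corpus: List[Message] = [
    (True, ["MSPIKE_BL", "MSPIKE_L5", "OTHER_DNSBL"]),
    (True, ["MSPIKE_BL", "MSPIKE_ZBI"]),
    (True, ["OTHER_DNSBL"]),
    (False, ["MSPIKE_BL", "MSPIKE_L5"]),
    (False, []),
]

# Hypothetical scoresets: "before" stands for the raw GA output, "after" for
# the manually adjusted one. The informational _BL rule stays at 0.01 in both.
before = {"MSPIKE_BL": 0.01, "MSPIKE_L5": 2.5,
          "MSPIKE_ZBI": 2.0, "OTHER_DNSBL": 3.0}
after = {"MSPIKE_BL": 0.01, "MSPIKE_L5": 3.0,
         "MSPIKE_ZBI": 5.0, "OTHER_DNSBL": 3.0}

for label, scoreset in (("before", before), ("after", after)):
    fp_rate, fn_rate = fp_fn_rates(corpus, scoreset)
    print(f"{label}: FP rate {fp_rate:.2%}, FN rate {fn_rate:.2%}")
```

If the manual adjustment pushes the FP rate up noticeably compared with the raw GA output, that would be the signal that the adjusted scores are not sane.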
