Re: Over-scoring of SURBL lists...

Bill Landry Thu, 16 Feb 2006 09:05:23 -0800

----- Original Message -----
From: "Matt Kettler" <[EMAIL PROTECTED]>

> Take for example this ONE uri that was posted to the list:
> checpri *MUNGED*.com
>
> This is currently listed in SC, JP, and AB on SURBL.
> score URIBL_AB_SURBL 0 3.306 0 3.812
> score URIBL_JP_SURBL 0 3.360 0 4.087
> score URIBL_SC_SURBL 0 3.600 0 4.498

First, I'll point out your statement is confusing. I understand it, but
many won't because SA does currently use multi.surbl.org. The only other
actual query it uses right now is SBL. If I understand you correctly
your argument is you should have one rule, for all of SURBL, with no
differentiation between JP, WS, SC, AB, OB and PH. However, it comes
across as "you shouldn't make individual queries to all these different
lists", which is something SA doesn't do.

As you say, SA does not make individual queries to all of the different
SURBL lists. However, remember that SURBL is bitmask based: a single
query to multi comes back with a single bitmasked response that can mean
one hit or many, and SA scores based on the bitmask response that comes
back.
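
For anyone unfamiliar with how one bitmasked answer turns into several
rule hits, here is a rough Python sketch. The bit values are my
assumption of the multi.surbl.org layout at the time of writing (check
the SURBL documentation before relying on them), and decode_surbl is a
made-up helper name, not part of SA.

```python
# Assumed bit-to-list mapping for multi.surbl.org (verify against SURBL docs).
SURBL_BITS = {2: "SC", 4: "WS", 8: "PH", 16: "OB", 32: "AB", 64: "JP"}


def decode_surbl(answer):
    """Return the list names flagged in a 127.0.0.x answer from multi."""
    # The bitmask is carried in the last octet of the returned A record.
    last_octet = int(answer.rsplit(".", 1)[1])
    # A list is "hit" when its bit is set in the mask.
    return [name for bit, name in sorted(SURBL_BITS.items()) if last_octet & bit]


if __name__ == "__main__":
    # 2 (SC) + 32 (AB) + 64 (JP) = 98, i.e. one query, three lists hit.
    print(decode_surbl("127.0.0.98"))
```

Under that assumed mapping, a domain listed in SC, AB and JP would come
back as a single 127.0.0.98 answer, and each set bit would fire its own
rule and score.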

Personally I'd like to see a rule for SURBL as a whole, followed by
add-on rules for individual lists. This would allow most of the score to
be common to all the lists, and then some small adjustments for
individual lists that perform better.

I could see something like +2.5 for the base rule of hitting any SURBL,
and +0.5 to +1.0 for the individual lists.

You can already adjust the individual test scores (see the snippet from
50_scores.cf you showed above).

This would still rack up a lot of points for multiple hits, but not
nearly as many as right now. Hitting the "strongest 3" would only rack
up 5.5, something that BAYES_00 could cover in case of a FP, instead of
12.397, which pretty much nothing can compensate for.
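
To make the arithmetic in the quoted proposal concrete, here is a small
Python sketch. It assumes the fourth column of the score lines quoted
above (network tests plus Bayes) for the current totals, and takes the
upper end of the suggested +0.5 to +1.0 add-on range. The names
CURRENT, BASE and ADDON are illustrative only.

```python
# Fourth-column (net + Bayes) scores from the quoted 50_scores.cf lines.
CURRENT = {"AB": 3.812, "JP": 4.087, "SC": 4.498}

# Proposed scheme: one base score for any SURBL hit, plus a small
# per-list add-on (assumed here to be the upper bound of +0.5..+1.0).
BASE = 2.5
ADDON = {"AB": 1.0, "JP": 1.0, "SC": 1.0}


def current_total(hits):
    """Sum the independent per-list scores, as scoring works today."""
    return round(sum(CURRENT[h] for h in hits), 3)


def proposed_total(hits):
    """Base score once, plus the add-on for each list that hit."""
    if not hits:
        return 0.0
    return round(BASE + sum(ADDON[h] for h in hits), 3)


if __name__ == "__main__":
    hits = ["AB", "JP", "SC"]
    print(current_total(hits), proposed_total(hits))
```

With all three lists hitting, this gives 12.397 under the current
scores and 5.5 under the proposed scheme, matching the figures in the
paragraph above.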


Adjust your individual SURBL test scores to meet your needs.

As for multi.uribl.com.. quite frankly the only list there worth scoring
at any reasonable level is black. I definitely wouldn't want to average
out the hit rates of black and grey together into a single rule. It
would score abysmally low due to grey's occasional (and intentional)
FP's. If you go with a single rule for uribl.com, it should only check
black.

URIBL is bitmasked as well. Adjust your individual test scores to meet
your needs, or set them to zero if you don't want to use, for example,
URIBL_GREY.

Even running those two lists, the scores should be set so that even if
both rules hit, the score still isn't over your threshold. Just under it.


This is good advice and is how we have adjusted our own rules and scores.

Bill
