On Sun, Aug 29, 2004 at 08:47:59AM -0700, Bill Landry wrote:
> Is this correct, that custom rule scores and URIDNSBL scores are ignored by
> bayes auto-learning?  If so, what's the rationale behind this?  Is this true
> for both SA 2.6x and SA 3.0?

I'm not sure where he came up with that idea actually.  The "which rules
to skip" decision is very straightforward:

- skip rules with tflags set as "noautolearn", "userconf", or "learn"
- "skip" rules with a 0 score in set 0/1

There's nothing internally that differentiates between "custom" (locally
configured) and "standard" (comes with SA) rules/scores/etc.
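
To make the two criteria concrete, here is a rough Python sketch of the
skip decision.  It is not SpamAssassin's actual code (which is Perl): the
function name and the way tflags and the score are passed in are made up
for illustration, and it reads "set 0/1" as the rule's score in whichever
of score sets 0 or 1 is in use.

```python
# Hypothetical helper for illustration only, not SpamAssassin's API.
SKIP_TFLAGS = {"noautolearn", "userconf", "learn"}


def skipped_for_autolearn(tflags, score):
    """Return True if a rule is left out of the auto-learn point totals.

    tflags: the rule's tflags as a space-separated string (may be empty).
    score:  the rule's score in score set 0 or 1, whichever applies.
    """
    if SKIP_TFLAGS & set(tflags.split()):
        return True
    return score == 0


# Nothing here asks whether the rule is "custom" or "standard".
print(skipped_for_autolearn("", 2.5))          # False
print(skipped_for_autolearn("net", 1.0))       # False
print(skipped_for_autolearn("userconf", 1.0))  # True
print(skipped_for_autolearn("", 0))            # True
```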


As for head vs body ...

The details of head/body can get into a long discussion (see below)
since it's slightly complex.  In short, URIBL rules are considered
header tests and are added appropriately.  (I'm not 100% sure why they're
written as "header:eval" instead of "body:eval", but that's a different
discussion... actually, I've opened ticket 3734 about that.)


The less short version:

"head" rules are usually "header" or "header ... eval" rules, plus "meta"
rules that don't have the "net" tflag.  Things like
"header ... eval:check_rbl" are considered "RBL" rules, not "head" rules,
though.

"body" rules are "body", "body ... eval", or "uri" rules, plus "meta"
rules that don't have the "net" tflag.
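
The classification just described can be sketched in Python along these
lines.  This is only an illustration of the description, not the real
implementation: the "kind" strings and the function name are labels
invented for the sketch.

```python
# Hypothetical rule "kinds"; these strings are made up for this sketch.
HEAD_KINDS = {"header", "header_eval"}
BODY_KINDS = {"body", "body_eval", "uri"}


def rule_classes(kind, tflags=""):
    """Return the set of classes ("head", "body", "rbl") a rule falls into."""
    if kind == "rbl_eval":              # e.g. header ... eval:check_rbl
        return {"rbl"}
    classes = set()
    non_net_meta = kind == "meta" and "net" not in tflags.split()
    if kind in HEAD_KINDS or non_net_meta:
        classes.add("head")
    if kind in BODY_KINDS or non_net_meta:
        classes.add("body")
    return classes


print(rule_classes("header_eval"))      # {'head'}
print(rule_classes("rbl_eval"))         # {'rbl'}
print(rule_classes("uri"))              # {'body'}
print(sorted(rule_classes("meta")))     # ['body', 'head']
print(rule_classes("meta", "net"))      # set()
```

As described, a "meta" rule without the "net" tflag matches both lists, so
the sketch reports both classes for it.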

The URIBL_* rules are "header" rules, and they're not considered
"RBL", so they should be considered definite "head" rules.

Running with -D shows the point totals used for the auto-learn
calculation.  A random sample:

[...]
debug: auto-learn: currently using scoreset 1.
debug: auto-learn: message score: 17.258, computed score for autolearn: 17.259
debug: auto-learn? ham=0.1, spam=12, body-points=6.307, head-points=17.259, learned-points=0
debug: auto-learn? yes, spam (17.259 > 12)
[...]
debug: tests=DNS_FROM_RFC_ABUSE,DNS_FROM_RFC_POST,MIME_BOUND_DD_DIGITS,RCVD_IN_DSBL,RCVD_IN_NJABL_DUL,RCVD_IN_SORBS_DUL,SPF_HELO_PASS,URIBL_OB_SURBL,URIBL_WS_SURBL,X_MESSAGE_INFO
[...]

Scores are:

DNS_FROM_RFC_ABUSE   0.374
DNS_FROM_RFC_POST    1.376
MIME_BOUND_DD_DIGITS 4.230
RCVD_IN_DSBL         2.765
RCVD_IN_NJABL_DUL    1.655
RCVD_IN_SORBS_DUL    0.137
SPF_HELO_PASS        -0.001
URIBL_OB_SURBL       1.996
URIBL_WS_SURBL       0.539
X_MESSAGE_INFO       4.187

A quick debug addition shows where points get added per rule:

>> DNS_FROM_RFC_ABUSE = body
>> DNS_FROM_RFC_ABUSE = head
>> DNS_FROM_RFC_POST = body
>> DNS_FROM_RFC_POST = head
>> MIME_BOUND_DD_DIGITS = head
>> RCVD_IN_DSBL = body
>> RCVD_IN_DSBL = head
>> RCVD_IN_NJABL_DUL = body
>> RCVD_IN_NJABL_DUL = head
>> RCVD_IN_SORBS_DUL = body
>> RCVD_IN_SORBS_DUL = head
>> URIBL_OB_SURBL = head
>> URIBL_WS_SURBL = head
>> X_MESSAGE_INFO = head

It's a little confusing, but a rule's points are added to "body" unless
the rule is considered "head" by the above rules, and added to "head"
unless it is considered "body", so some rules get added to both.  Since
most of the ones above are "RBL" rules, they're considered neither "head"
nor "body", and are therefore added to both.  In this case, it just so
happens that everything adds to "head" and only the "RBL" ones get added
to "body".
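
As a sanity check, the totals from the debug output can be reproduced with
a few lines of Python.  This is a sketch under assumptions, not the real
code: the scores are the ones listed above, the "rbl"/"head" labels are
read off the per-rule debug output, and SPF_HELO_PASS is left out on the
assumption that it was skipped, since it doesn't appear in that output.
The function name is made up.

```python
from decimal import Decimal

# (score, class) per rule.  "rbl" rules were added to both totals in the
# debug output, "head" rules to head only.  SPF_HELO_PASS is omitted on
# the assumption that it was skipped for auto-learning.
RULES = {
    "DNS_FROM_RFC_ABUSE":   ("0.374", "rbl"),
    "DNS_FROM_RFC_POST":    ("1.376", "rbl"),
    "MIME_BOUND_DD_DIGITS": ("4.230", "head"),
    "RCVD_IN_DSBL":         ("2.765", "rbl"),
    "RCVD_IN_NJABL_DUL":    ("1.655", "rbl"),
    "RCVD_IN_SORBS_DUL":    ("0.137", "rbl"),
    "URIBL_OB_SURBL":       ("1.996", "head"),
    "URIBL_WS_SURBL":       ("0.539", "head"),
    "X_MESSAGE_INFO":       ("4.187", "head"),
}


def autolearn_points(rules):
    """Return (body_points, head_points) for the given rules."""
    body = head = Decimal(0)
    for score, cls in rules.values():
        score = Decimal(score)
        if cls != "head":   # not a head rule: counts toward body
            body += score
        if cls != "body":   # not a body rule: counts toward head
            head += score
    return body, head


body, head = autolearn_points(RULES)
print(f"body-points={body}, head-points={head}")
# body-points=6.307, head-points=17.259
```

Those match the body-points and head-points values in the debug line, and
the URIBL_* scores are clearly included in the head total.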

-- 
Randomly Generated Tagline:
"To have a right to do a thing is not at all the same as to be right in
 doing it."                      - G.K. Chesterton
