http://bugzilla.spamassassin.org/show_bug.cgi?id=3228

           Summary: RFE: Performance improving with two score thresholds
           Product: Spamassassin
           Version: unspecified
          Platform: PC
        OS/Version: Linux
            Status: NEW
          Severity: enhancement
          Priority: P5
         Component: spamassassin
        AssignedTo: [EMAIL PROTECTED]
        ReportedBy: [EMAIL PROTECTED]


Currently SpamAssassin performs all checks (headers, body, RBL, etc.) on every
message. Only after all the checks have run does it compare the score with
"required_hits" and mark the message as spam if the score is greater.
When a message has a high score, sa does not need to perform all the
checks: it can stop and mark the message as spam as soon as the
score becomes greater than "required_hits".
I think it would be interesting to use two thresholds, such as "required_hits_low"
and "required_hits_high".
Using these two values, sa can classify messages into three buckets:
1) Non spam
2) Probable spam
3) Certain spam

The behaviour of sa should be something like the following,
assuming that the checks are sorted by speed (fastest first):

foreach (check in all_checks) {
    do(check);                        # run the check; it updates score
    if (score > required_hits_high)
        break;                        # certain spam: skip the remaining checks
}

if (score < required_hits_low)
    message_rated_as_non_spam;
elsif (score > required_hits_high)
    message_rated_as_certain_spam;
else
    message_rated_as_probable_spam;
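
The pseudocode above can also be written in runnable form. This is only a
minimal Python sketch of the idea, not the actual patch (which is for the Perl
code): the toy checks, their scores, and the function name classify() are made
up for illustration. Only "required_hits_low" and "required_hits_high" come
from the proposal.

```python
from typing import Callable, Iterable, Tuple

# Hypothetical check type: takes the raw message, returns the score it adds.
Check = Callable[[str], float]


def classify(message: str,
             checks: Iterable[Check],
             required_hits_low: float,
             required_hits_high: float) -> Tuple[str, float, int]:
    """Run checks in order (assumed sorted fastest first) and stop early.

    Returns (bucket, score, number_of_checks_run).
    """
    score = 0.0
    checks_run = 0
    for check in checks:
        score += check(message)
        checks_run += 1
        if score > required_hits_high:
            break  # already certain spam: skip the remaining checks

    if score < required_hits_low:
        bucket = "non_spam"
    elif score > required_hits_high:
        bucket = "certain_spam"
    else:
        bucket = "probable_spam"
    return bucket, score, checks_run


# Toy checks with made-up scores, fastest first.
def header_check(msg: str) -> float:
    return 6.0 if "VIAGRA" in msg else 0.0


def body_check(msg: str) -> float:
    return 6.0 if "click here" in msg else 0.0


def slow_rbl_check(msg: str) -> float:
    return 3.0  # stands in for an expensive network lookup


if __name__ == "__main__":
    checks = [header_check, body_check, slow_rbl_check]
    # The slow check is skipped for the first message.
    print(classify("VIAGRA click here", checks, 5.0, 10.0))
    print(classify("hello", checks, 5.0, 10.0))
```

In this sketch the early exit is only safe if no later check can lower the
score. Since sa rules can carry negative scores, a real implementation would
have to take that into account when ordering the checks.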

So "required_hits_low" specifies the score at which a message is identified as
spam, while "required_hits_high" puts a limit on CPU usage
for messages with a high score.
Additionally, each of the two spam buckets can have its own subject_tag.
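
As an illustration of the two subject tags, here is a small Python sketch. The
tag strings, the bucket names, and the function tag_subject() are hypothetical
and are not taken from the patch or from the existing configuration options.

```python
from typing import Dict

# Hypothetical tag strings, one per spam bucket; non-spam gets no tag.
SUBJECT_TAGS: Dict[str, str] = {
    "probable_spam": "[SPAM?]",
    "certain_spam": "[SPAM]",
}


def tag_subject(subject: str, bucket: str) -> str:
    """Prefix the subject with the tag for its bucket, if there is one."""
    tag = SUBJECT_TAGS.get(bucket)
    if tag is None or subject.startswith(tag):
        return subject  # non-spam, or already tagged
    return f"{tag} {subject}"


if __name__ == "__main__":
    print(tag_subject("Cheap pills", "certain_spam"))
    print(tag_subject("Meeting notes", "non_spam"))
```

With separate tags, a mail client can, for example, file "certain" spam
automatically and leave "probable" spam for manual review.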

I wrote a patch for the Debian/unstable sa package (2.63-1) and I'm currently
testing it.
The patch modifies the check() method in PerMsgStatus.pm and a few lines in
SpamAssassin.pm and (obviously) in the configuration file.

I can send the patch if you like the idea.



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
