Re: Idea for better scoring

Daryl C. W. O'Shea 16 Nov 2004 09:22:36 -0000

Tom McClure wrote:

Without ditching the current scoring altogether in favor of a multiplicative model (a la bayes), what if there were a post-analysis scoring step that just took into account the total number of positive rules (or rule families, if there is such a division)? Instead of looking at each test as though it occurred in isolation, this can put all the tests into sharper context without throwing away a lot of scoring code.

The attached plugin does this. If 4 or more positively scored tests are hit, the message score is 'corrected' according to the following simple formula:

Final Score = Original Score + ((# of positive tests hit) ** 1.2 / 4)
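
To give a feel for the size of the correction, here is a minimal standalone sketch of the formula above. The helper name score_correction is hypothetical and is not part of the attached plugin; it only mirrors the arithmetic, including the cutoff below 4 positive tests.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the plugin): returns the amount added to the
# original score for a given number of positively scored tests hit.
sub score_correction {
  my ($pos_tests) = @_;
  return 0 if $pos_tests < 4;      # no correction below 4 positive tests
  return $pos_tests ** 1.2 / 4;    # same expression as the formula above
}

# Print the correction for a few sample hit counts.
foreach my $n (3, 4, 5, 10) {
  printf "%2d positive tests -> +%.2f\n", $n, score_correction($n);
}
```

With these inputs the correction is 0 for 3 tests, roughly 1.32 for 4, 1.72 for 5, and 3.96 for 10, so it grows slightly faster than linearly with the number of positive tests.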

It seems to work well but isn't based on much more than a whim and a little observation. I get very few ham hits on _my_ mail with it, but I mainly get pretty clean looking ham.

Daryl

loadplugin      ScoreCorrection scorecorrection.pm


header          SCORE_CORRECTION eval:scorecorrection()
describe        SCORE_CORRECTION Correction for multiple positive test scores
score           SCORE_CORRECTION 0.001  # just here to enable the plugin

package ScoreCorrection;
use strict;
use Mail::SpamAssassin;
use Mail::SpamAssassin::Plugin;
our @ISA = qw(Mail::SpamAssassin::Plugin);

sub new {
  my ($class, $mailsa) = @_;
  $class = ref($class) || $class;
  my $self = $class->SUPER::new($mailsa);
  bless ($self, $class);
  $self->register_eval_rule("scorecorrection");
  return $self;
}

sub scorecorrection {
  # Stub so we can use "describe SCORE_CORRECTION" in a .cf file by declaring
  # an eval rule.
  return 0;
}

sub check_post_learn {
  # $opts is the plugin options hash; the per-message status object is
  # stored under its {permsgstatus} key.
  my ($self, $opts) = @_;
  my $pms = $opts->{permsgstatus};

  # Count the tests that hit and carry a positive score.
  my $pos_tests = 0;
  foreach my $test (@{$pms->{test_names_hit}}) {
    $pos_tests++ if (($pms->{conf}->{scores}->{$test} || 0) > 0);
  }

  # Only apply a correction when 4 or more positive tests hit.
  my $correction = 0;
  if ($pos_tests > 3) {
    $correction = $pos_tests ** 1.2 / 4;
  }

  if ($correction) {
    # Set the rule's score to the computed correction, then record the hit.
    $pms->{conf}->{scores}->{SCORE_CORRECTION} = $correction;
    $pms->handle_hit('SCORE_CORRECTION');
  }

  return 0;
}

1;
