Without ditching the current scoring altogether in favor of a multiplicative model (à la Bayes), what if there were a post-analysis scoring step that took into account the total number of positive rules hit (or rule families, if there is such a division)? Instead of treating each test as though it occurred in isolation, this would put all the tests into sharper context without throwing away a lot of scoring code.
The attached plugin does this. If 4 or more positively scored tests are hit, the message score is 'corrected' according to the following simple formula:
Final Score = Original Score + ((# of positive tests hit) ** 1.2 / 4)
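To get a feel for how fast the correction grows, here is a minimal standalone sketch of the same arithmetic. The helper name score_correction is only for illustration and is not part of the plugin; it assumes the threshold of 4 positive tests described above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the plugin): returns the amount added to
# the message score for a given count of positively scored tests.
sub score_correction {
    my ($pos_tests) = @_;
    return 0 if $pos_tests < 4;      # fewer than 4 positive hits: no correction
    return $pos_tests ** 1.2 / 4;    # same expression as the formula above
}

# Print the correction for a few hit counts.
for my $n (3, 4, 6, 10, 20) {
    printf "%2d positive tests -> +%.2f\n", $n, score_correction($n);
}
```

By this arithmetic, 4 positive tests add roughly 1.32 points and 10 add roughly 3.96, so the bump grows slightly faster than linearly with the number of hits.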
It seems to work well but isn't based on much more than a whim and a little observation. I get very few ham hits on _my_ mail with it, but I mainly get pretty clean looking ham.
Daryl
loadplugin ScoreCorrection scorecorrection.pm
header SCORE_CORRECTION eval:scorecorrection()
describe SCORE_CORRECTION Correction for multiple positive test scores
score SCORE_CORRECTION 0.001 # just here to enable the plugin

package ScoreCorrection;

use strict;
use Mail::SpamAssassin;
use Mail::SpamAssassin::Plugin;
our @ISA = qw(Mail::SpamAssassin::Plugin);

sub new {
    my ($class, $mailsa) = @_;
    $class = ref($class) || $class;
    my $self = $class->SUPER::new($mailsa);
    bless ($self, $class);
    $self->register_eval_rule("scorecorrection");
    return $self;
}

sub scorecorrection {
    # Stub so we can use "describe SCORE_CORRECTION" in a .cf file by declaring an eval rule.
    return 0;
}

sub check_post_learn {
    # The hook receives an options hash; the per-message status object is under {permsgstatus}.
    my ($self, $opts) = @_;
    my $pms = $opts->{permsgstatus};
    my $scores = $pms->{conf}->{scores};

    # Count the hit tests that carry a positive score.
    my $pos_tests = 0;
    foreach my $test (@{$pms->{test_names_hit}}) {
        $pos_tests++ if (($scores->{$test} || 0) > 0);
    }

    # With 4 or more positive tests, add (count ** 1.2 / 4) via the SCORE_CORRECTION rule.
    my $correction = 0;
    if ($pos_tests > 3) {
        $correction = $pos_tests ** 1.2 / 4;
    }
    if ($correction) {
        $scores->{SCORE_CORRECTION} = $correction;
        $pms->handle_hit('SCORE_CORRECTION');
    }
    return 0;
}

1;