Isn't this reinventing some of what's already in "masses"? Also, are you using the perceptron? Don't ;) I've found the GA produces better results with current spam and rules. That would explain the poor results on set0, I'd guess.
--j.

[EMAIL PROTECTED] writes:
> Author: dos
> Date: Wed Apr 18 10:35:53 2007
> New Revision: 530098
>
> URL: http://svn.apache.org/viewvc?view=rev&rev=530098
> Log:
> ignoring tiny scored rules without regard for rules that may depend on them = not good! don't do it
>
> Modified:
>     spamassassin/rules/trunk/sandbox/dos/new-rule-score-gen/merge-scoresets
>     spamassassin/rules/trunk/sandbox/dos/new-rule-score-gen/scores
>
> Modified: spamassassin/rules/trunk/sandbox/dos/new-rule-score-gen/merge-scoresets
> URL: http://svn.apache.org/viewvc/spamassassin/rules/trunk/sandbox/dos/new-rule-score-gen/merge-scoresets?view=diff&rev=530098&r1=530097&r2=530098
> ==============================================================================
> --- spamassassin/rules/trunk/sandbox/dos/new-rule-score-gen/merge-scoresets (original)
> +++ spamassassin/rules/trunk/sandbox/dos/new-rule-score-gen/merge-scoresets Wed Apr 18 10:35:53 2007
> @@ -3,8 +3,6 @@
>  use strict;
>  use warnings;
>
> -my $min_score = 0.050;
> -
>  my %rules;
>
>  for (my $i = 0; $i < 4; $i++) {
> @@ -12,7 +10,7 @@
>    while(<SCORES>) {
>      next unless /^score (\S+)\s+(-?[\d.]+)$/;
>      @{$rules{$1}} = ('0.000', '0.000' ,'0.000', '0.000') unless exists $rules{$1};
> -    $rules{$1}[$i] = ($2 >= $min_score ? $2 : '0.000');
> +    $rules{$1}[$i] = $2;
>    }
>    close SCORES;
>  }
>
> Modified: spamassassin/rules/trunk/sandbox/dos/new-rule-score-gen/scores
> URL: http://svn.apache.org/viewvc/spamassassin/rules/trunk/sandbox/dos/new-rule-score-gen/scores?view=diff&rev=530098&r1=530097&r2=530098
> ==============================================================================
> --- spamassassin/rules/trunk/sandbox/dos/new-rule-score-gen/scores (original)
> +++ spamassassin/rules/trunk/sandbox/dos/new-rule-score-gen/scores Wed Apr 18 10:35:53 2007
> @@ -10,18 +10,18 @@
>  score FB_MED1CAT 1.000 1.070 0.000 0.000
>  score FB_MEDS_PERCENT 0.976 2.293 0.000 0.000
>  score FB_PIPENEWSLETTER 0.000 0.000 0.000 0.000
> -score FB_REFI 0.000 0.000 0.000 0.000
> +score FB_REFI 0.043 0.000 0.000 0.000
>  score FB_SMALL_PEN 0.833 0.000 0.000 0.000
>  score FB_WHILECONNECTED 0.000 0.000 0.000 0.000
> -score FB_WORD1_END_DOLLAR 0.000 2.688 0.000 0.000
> +score FB_WORD1_END_DOLLAR 0.001 2.688 0.000 0.000
>  score FH_XMAIL_RND_833 0.000 1.835 0.000 0.000
> -score FM_LOTTO_MONEY 0.000 0.000 0.000 0.000
> +score FM_LOTTO_MONEY 0.001 0.000 0.000 0.000
>  score FM_LOTTO_YOU_WON 0.107 0.000 0.000 0.000
> -score FM_MORTGAGE3PLUS 0.000 0.000 0.000 0.000
> +score FM_MORTGAGE3PLUS 0.001 0.000 0.000 0.000
>  score FM_MORTGAGE4PLUS 0.000 1.000 0.000 0.000
>  score FRT_ADULT2 0.311 0.000 0.000 0.000
>  score FRT_APPROV 1.000 0.000 0.000 0.000
> -score FRT_COCK 0.000 0.000 0.000 0.000
> +score FRT_COCK 0.001 0.000 0.000 0.000
>  score FRT_ERECTION 0.188 0.000 0.000 0.000
>  score FRT_FREE 0.159 0.000 0.000 0.000
>  score FRT_OPPORTUN1 1.218 1.000 0.000 0.000
