Miroslav Vancl writes:
>
> Now I found I'm not first with this idea. I had found the same idea from Tom
> Allison - 2 Jun 30, 2007 even with similar subject "A different approach to
> scoring spamassassin hits".
>
> I'm sorry

Don't be sorry! It's well worth experimenting with. The best way to do this is to come up with a prototype that demonstrates good results on "real-world" mail corpora.

I've tried it in the past, but it (surprisingly) didn't produce great results. My theory: we have very few "hammy" rules. This is because it's trivial for spammers to impersonate them, so it's safer to start from an assumption of hamminess and allow attributes of the mail to push it in one direction only -- towards spamminess. That works well for us, but not so well for the bayes-like approach.

One possible fix for this would be to treat the fact that a rule did NOT hit as a token, as well as the fact that it did.

--j.
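
To make the "did NOT hit" idea concrete, here is a rough sketch of what I mean. It is not SpamAssassin code: the rule names, the tiny corpus, and the add-one smoothing are all made up for illustration, and class priors are ignored. Each rule contributes exactly one token per message, either (rule, True) or (rule, False), so a missing hit can pull the score towards ham.

```python
import math

def train(corpus, rules):
    """corpus: list of (set of rule names that hit, is_spam).
    Returns a log-likelihood ratio for every (rule, hit?) token."""
    counts = {True: {}, False: {}}
    totals = {True: 0, False: 0}
    for hits, is_spam in corpus:
        totals[is_spam] += 1
        for rule in rules:
            # Emit a token whether or not the rule hit.
            token = (rule, rule in hits)
            counts[is_spam][token] = counts[is_spam].get(token, 0) + 1
    llr = {}
    for rule in rules:
        for present in (True, False):
            token = (rule, present)
            # Add-one smoothing over the two outcomes (hit / no hit).
            p_spam = (counts[True].get(token, 0) + 1) / (totals[True] + 2)
            p_ham = (counts[False].get(token, 0) + 1) / (totals[False] + 2)
            llr[token] = math.log(p_spam / p_ham)
    return llr

def score(hits, rules, llr):
    """Sum the ratios; above zero leans spam, below zero leans ham."""
    return sum(llr[(rule, rule in hits)] for rule in rules)

# Hypothetical rule names and a toy hand-labelled corpus.
RULES = ["RULE_A", "RULE_B"]
CORPUS = [
    ({"RULE_A"}, True),
    ({"RULE_A", "RULE_B"}, True),
    ({"RULE_A"}, True),
    (set(), False),
    (set(), False),
    ({"RULE_B"}, False),
]
LLR = train(CORPUS, RULES)
```

With this toy data, a message that hits nothing gets a negative score purely from the absence of RULE_A, which is the "hammy" evidence the current one-directional rules don't give us. Whether that holds up on a real corpus is exactly what a prototype would need to show.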
