"Dan Mosedale" <[EMAIL PROTECTED]> wrote in
news:aftqu6$[EMAIL PROTECTED]...

> I've spent some time looking over the various spam filters that are
> available on the net lately, and I've heard a bunch of good things about
> SpamAssassin.  I suspect we'd do well to model our filtering after
> theirs; they have useful ways of classifying things such as whether
> matches happen in headers, body, both, etc.

Have you considered a spam filter based on machine learning? There doesn't
seem to be one on the market, but ResearchIndex [1] lists several, for example
[2] to [4]. Most of these are based on statistics and are quite accurate.
For a comparison between rule-based and statistical filters, see "An
Experimental Comparison of Naive Bayesian and Keyword-Based Anti-Spam
Filtering with Personal E-mail Messages" by Ion Androutsopoulos et al. [5].

I'm working on a prototype using these algorithms right now, but it's in
Java, so I don't know whether it would be of any help for Mozilla.
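
To give a rough idea of what such a statistical filter does, here is a
minimal naive Bayes sketch in Python. It is not the Java prototype mentioned
above and not the exact method of any of the cited papers. The names
(NaiveBayesFilter, tokenize, train, is_spam), the tokenizer, and the add-one
smoothing are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens (assumed tokenizer)."""
    return re.findall(r"[a-z0-9$!']+", text.lower())


class NaiveBayesFilter:
    """Toy multinomial naive Bayes classifier with add-one smoothing (hypothetical)."""

    LABELS = ("spam", "ham")

    def __init__(self):
        self.word_counts = {label: Counter() for label in self.LABELS}
        self.message_counts = {label: 0 for label in self.LABELS}

    def train(self, text, label):
        """Record the words of one message that is known to be spam or ham."""
        self.word_counts[label].update(tokenize(text))
        self.message_counts[label] += 1

    def _log_score(self, tokens, label):
        """Return log P(label) + sum of log P(token | label), both smoothed."""
        vocabulary = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_words = sum(self.word_counts[label].values())
        total_messages = sum(self.message_counts.values())
        # Smoothed class prior, so an untrained class never yields log(0).
        score = math.log((self.message_counts[label] + 1) / (total_messages + 2))
        # The extra +1 in the denominator reserves one slot for unseen words.
        denominator = total_words + len(vocabulary) + 1
        for token in tokens:
            score += math.log((self.word_counts[label][token] + 1) / denominator)
        return score

    def is_spam(self, text):
        """Classify a message by comparing the two class scores."""
        tokens = tokenize(text)
        return self._log_score(tokens, "spam") > self._log_score(tokens, "ham")
```

You would call train() on messages the user has already sorted into spam and
legitimate mail, then call is_spam() on new messages. Unlike a rule set, the
word statistics adapt to each user's own mail.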


Regards,


Kai

[1] http://citeseer.nj.nec.com/cs
[2] http://citeseer.nj.nec.com/sahami98bayesian.html
[3] http://citeseer.nj.nec.com/androutsopoulos00learning.html
[4] http://arxiv.org/abs/cs.CL/0006013
[5] http://arxiv.org/abs/cs/0008019


