"Dan Mosedale" <[EMAIL PROTECTED]> wrote in news:aftqu6$[EMAIL PROTECTED]...
> I've spent some time looking over the various spam filters that are
> available on the net lately, and I've heard a bunch of good things about
> SpamAssassin. I suspect we'd do well to model our filtering after
> theirs; they have useful ways of classifying things such as whether
> matches happen in headers, body, both, etc.

Have you considered a spam filter based on machine learning? There doesn't seem to be one on the market, but ResearchIndex [1] knows of some, for example [2] to [4]. Most of these are based on statistics and are quite accurate. For a comparison between rule-based and statistical filters, see "An Experimental Comparison of Naive Bayesian and Keyword-Based Anti-Spam Filtering with Personal E-mail Messages" by Ion Androutsopoulos et al. [5].

I'm working on a prototype utilizing these algorithms right now, but it's in Java, so I don't know if it is of any help for Mozilla.

Regards,
Kai

[1] http://citeseer.nj.nec.com/cs
[2] http://citeseer.nj.nec.com/sahami98bayesian.html
[3] http://citeseer.nj.nec.com/androutsopoulos00learning.html
[4] http://arxiv.org/abs/cs.CL/0006013
[5] http://arxiv.org/abs/cs/0008019
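
To make "based on statistics" a bit more concrete, here is a minimal sketch of a naive Bayesian filter of the general kind described in the papers above. It is only an illustration under assumptions: it is written in Python for brevity (not the Java of the prototype), it uses add-one smoothing and a simple word tokenizer, and the class name, function names and training messages are all made up for the example rather than taken from any of the cited work.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9$']+", text.lower())


class NaiveBayesFilter:
    """Toy two-class (spam/ham) multinomial naive Bayes classifier."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Record the words of one message under the given label."""
        self.word_counts[label].update(tokenize(text))
        self.message_counts[label] += 1

    def log_score(self, text, label):
        """Return log P(label) plus the sum of log P(word | label)."""
        vocabulary = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_words = sum(self.word_counts[label].values())
        total_messages = sum(self.message_counts.values())
        score = math.log(self.message_counts[label] / total_messages)
        for word in tokenize(text):
            count = self.word_counts[label][word]
            # Add-one smoothing keeps unseen words from zeroing the score.
            score += math.log((count + 1) / (total_words + len(vocabulary)))
        return score

    def is_spam(self, text):
        """Classify as spam if the spam score beats the ham score."""
        return self.log_score(text, "spam") > self.log_score(text, "ham")


# Made-up training data, purely for illustration.
spam_filter = NaiveBayesFilter()
spam_filter.train("Win money now, claim your free prize", "spam")
spam_filter.train("Free offer, click now to win", "spam")
spam_filter.train("Patch for the mail filter is attached for review", "ham")
spam_filter.train("Meeting notes and the review schedule", "ham")

print(spam_filter.is_spam("claim your free money"))
print(spam_filter.is_spam("please review the attached patch"))
```

With this toy data the first test message scores higher as spam and the second as ham. Unlike a hand-written rule set, the word weights come entirely from the training messages, so the filter adapts to each user's own mail.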
