On 1/27/07, Stephen Waits <[EMAIL PROTECTED]> wrote:
> Don't waste your time. You're many years behind. Check out dspam. Phenomenal statistical filter.
I respectfully disagree (about it being a waste of time; dspam is great). The currently available filters focus only on email, even though wiki spam and blog spam are big problems too. C, the implementation language for dspam and CRM114, two of the most sophisticated statistical filters, is a bad language for exploratory programming. Filter authors are not, as far as I can tell, engaging in black magic; they're applying textbook pattern recognition algorithms along with some novel, domain-specific feature selection and data preprocessing heuristics. It's straightforward, albeit involved.

Warren

_______________________________________________
Sdruby mailing list
[email protected]
http://lists.sdruby.com/mailman/listinfo/sdruby
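
To illustrate the "textbook pattern recognition" point, here is a minimal sketch of a naive Bayes text classifier with add-one smoothing. It is not dspam's or CRM114's actual algorithm; the class name `NaiveBayesFilter`, the crude tokenizer, and the training strings are all made up for the example. Nothing in it is specific to email, so the same approach could be tried on wiki or blog text.

```ruby
# Illustrative naive Bayes filter (hypothetical names; not dspam's or CRM114's method).
class NaiveBayesFilter
  LABELS = %i[spam ham].freeze

  def initialize
    # Per-label token frequencies and per-label document counts.
    @token_counts = LABELS.to_h { |label| [label, Hash.new(0)] }
    @doc_counts = Hash.new(0)
  end

  # Crude feature selection: lowercase word-like tokens only.
  def tokenize(text)
    text.downcase.scan(/[a-z0-9']+/)
  end

  def train(text, label)
    tokenize(text).each { |tok| @token_counts[label][tok] += 1 }
    @doc_counts[label] += 1
  end

  # Returns the label with the highest log posterior score.
  def classify(text)
    tokens = tokenize(text)
    LABELS.max_by { |label| log_score(tokens, label) }
  end

  private

  # log P(label) + sum of log P(token | label), with add-one smoothing.
  def log_score(tokens, label)
    counts = @token_counts[label]
    total = counts.values.sum
    vocab = @token_counts.values.flat_map(&:keys).uniq.size
    prior = (@doc_counts[label] + 1.0) / (@doc_counts.values.sum + LABELS.size)
    tokens.sum(Math.log(prior)) do |tok|
      # The extra +1 in the denominator reserves mass for unseen tokens.
      Math.log((counts[tok] + 1.0) / (total + vocab + 1))
    end
  end
end

filter = NaiveBayesFilter.new
filter.train("cheap viagra buy now", :spam)
filter.train("buy cheap pills online now", :spam)
filter.train("meeting notes for the ruby group", :ham)
filter.train("patch for the wiki parser attached", :ham)

puts filter.classify("buy cheap pills")            # prints "spam"
puts filter.classify("notes for the wiki meeting") # prints "ham"
```

The interesting work in a real filter would presumably go into `tokenize` (feature selection and preprocessing); the scoring itself is the textbook part.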
