Hi, I've finally finished my first pass at a set of spam "blocking" routines, based on Paul Graham's "A Plan for Spam", which can be found at http://www.paulgraham.com/spam.html.
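For anyone who hasn't read the essay, here is a rough sketch of the calculation it describes: a per-token spam probability derived from ham/spam counts, and a combined score over the most "interesting" tokens. This is not the code in the attached ZIP. The class and method names (GrahamSketch, tokenProbability, combine) are made up for illustration, and the guard against an empty corpus is an added assumption. The constants (doubling the ham count, the minimum of 5 occurrences, 0.4 for unknown tokens, clamping to 0.01..0.99, 15 tokens) are the ones given in the essay. It uses a lambda, so it needs a much newer JDK than 1.3.

```java
import java.util.Arrays;

// Illustrative sketch of the scoring in "A Plan for Spam"; names are hypothetical.
class GrahamSketch {
    static final double UNKNOWN = 0.4;      // probability assigned to rarely seen tokens
    static final int MAX_INTERESTING = 15;  // number of tokens used for the final score

    // good/bad: occurrences of the token in ham/spam; ngood/nbad: message counts.
    static double tokenProbability(int good, int bad, int ngood, int nbad) {
        int g = 2 * good;                   // ham counts are doubled to bias against false positives
        int b = bad;
        // The empty-corpus check is an assumption added here to avoid dividing by zero.
        if (g + b < 5 || ngood == 0 || nbad == 0) {
            return UNKNOWN;
        }
        double gf = Math.min(1.0, (double) g / ngood);
        double bf = Math.min(1.0, (double) b / nbad);
        double p = bf / (gf + bf);
        return Math.max(0.01, Math.min(0.99, p));
    }

    // Combines the token probabilities that lie furthest from the neutral 0.5.
    static double combine(double[] probs) {
        Double[] sorted = new Double[probs.length];
        for (int i = 0; i < probs.length; i++) {
            sorted[i] = probs[i];
        }
        // Most interesting first: largest distance from 0.5.
        Arrays.sort(sorted, (x, y) -> Double.compare(Math.abs(y - 0.5), Math.abs(x - 0.5)));
        double prod = 1.0;
        double inv = 1.0;
        int n = Math.min(MAX_INTERESTING, sorted.length);
        for (int i = 0; i < n; i++) {
            prod *= sorted[i];
            inv *= 1.0 - sorted[i];
        }
        return prod / (prod + inv);
    }
}
```

In the essay a message is treated as spam when the combined value is above 0.9, so a caller would do something like `boolean spam = GrahamSketch.combine(probs) > 0.9;` after looking up a probability for each token in the message.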
Included are:

- A mailet that sets a message header based on whether the message appears to be spam or not. (Just a simple Yes or No for now.)
- A mailet that feeds ham/spam to the "corpus" when a message is sent to a specific email address (one address for spam, one for ham).
- The core classes, which can be used outside of James to perform bulk updates of the corpus, or with other systems.

I've (hopefully) attached all the source necessary to try it out on your own system. (No, it doesn't have the official formatting, and I've not put in all the nice documentation, but it's a start.) This version is built with MySQL in mind as the backend; however, it's designed to easily allow for a different mechanism for persisting the statistics. Let me know if you need help implementing this.

Note that this hasn't been put under much of a load for testing (consider it alpha code), and there's probably a big bottleneck in the JDBCBayesianAnalysisFeeder mailet, as it immediately updates the SQL backend with new statistics, which may not be a good idea. See the readme.txt in the ZIP for a bit more info.

Please let me know if you try this out, and what problems (if any) you run into, especially if you find any bugs <g>, as I may have translated the Bayesian routines incorrectly.

This was designed and developed against James v2.0a3 and JDK 1.3. I'll let this sit for a while and provide updates once I get some feedback.

Thanks.
-Chris
James-BayesianAnalysis.zip
Description: Zip compressed data
