Hi,

This is my first attempt at developing a mailet, so if I've made a mistake
in how best to implement this, or if I've just done something dumb in my
code (I'm no Java guru either), please let me know.

I'm following up on a posting I saw on /. about using word occurrence
statistics to filter out SPAM from legit messages.
Here's the original article: http://www.paulgraham.com/spam.html

I see this as a two-part development.

Part 1:
  Routines for building good/bad word-token statistics.

Part 2:
  Using the statistics to route or flag new messages as SPAM or not.
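
To make the two parts concrete, here is a minimal in-memory sketch in Java.
It is not the attached JDBCTokenCounter: the class name TokenStats, its
method names, and the tokenizer are all made up for illustration, and there
is no JDBC or mailet plumbing. The per-token formula (doubled good counts,
a minimum of 5 occurrences, clamping to 0.01..0.99, 0.4 for unknown tokens)
and the combination over the 15 most interesting tokens follow my reading
of the article linked above. Counting each distinct token once per scored
message is a simplifying assumption.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper for illustration only; keeps all counts in memory.
class TokenStats {
    private final Map<String, Integer> good = new HashMap<>();
    private final Map<String, Integer> bad = new HashMap<>();
    private int goodMessages = 0;
    private int badMessages = 0;

    // Assumed tokenizer: lower-case, keep runs of letters, digits, ', $ and -.
    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9'$-]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Part 1: add every token occurrence of one message to the good or bad table.
    public void train(String text, boolean isSpam) {
        Map<String, Integer> table = isSpam ? bad : good;
        for (String token : tokenize(text)) {
            table.merge(token, 1, Integer::sum);
        }
        if (isSpam) {
            badMessages++;
        } else {
            goodMessages++;
        }
    }

    // Per-token spam probability: good counts are doubled, tokens seen fewer
    // than 5 times (after doubling) are treated as unknown and get 0.4.
    public double tokenProbability(String token) {
        int g = 2 * good.getOrDefault(token, 0);
        int b = bad.getOrDefault(token, 0);
        if (g + b < 5 || goodMessages == 0 || badMessages == 0) {
            return 0.4;
        }
        double goodRatio = Math.min(1.0, (double) g / goodMessages);
        double badRatio = Math.min(1.0, (double) b / badMessages);
        return Math.max(0.01, Math.min(0.99, badRatio / (goodRatio + badRatio)));
    }

    // Part 2: combine the 15 token probabilities furthest from 0.5.
    public double spamProbability(String text) {
        List<Double> probs = new ArrayList<>();
        for (String token : new LinkedHashSet<>(tokenize(text))) {
            probs.add(tokenProbability(token));
        }
        probs.sort(Comparator.comparingDouble((Double p) -> -Math.abs(p - 0.5)));
        double prodSpam = 1.0;
        double prodHam = 1.0;
        for (double p : probs.subList(0, Math.min(15, probs.size()))) {
            prodSpam *= p;
            prodHam *= 1.0 - p;
        }
        return prodSpam / (prodSpam + prodHam);
    }
}
```

Usage would be along the lines of calling train(body, true) for each known
SPAM message and train(body, false) for each legit one, then routing or
flagging a new message when spamProbability(body) exceeds some threshold
(the article uses 0.9).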

Attached is my first pass at the code for Part 1.

As I decided to get familiar with JDBC in James at the same time, I've
coded this to use JDBC as the repository. That may not be the best approach,
as it introduces a time lag at startup (while it loads the existing
words/occurrences) and at shutdown (while it persists the new statistics
back into the database).

Let me know what you guys think of my approach.

P.S.  Hopefully, there's something I don't understand about how to develop
under James.  I'm using JBuilder 4 to compile my code, then I have to
update the James.bar (which JBuilder doesn't recognize as a jar archive)
with the new class file, then restart James.  I realize there's probably no
easy way around restarting James, but it would be nice to skip updating the
.bar every time. Is there a way to do this?

Thanks.

-Chris

Attachment: JDBCTokenCounter.java
Description: Java source

Attachment: database.sql
Description: Binary data
