Hello Al,

AJ> Could you share your spam filters with the group? It seems I am
AJ> constantly adding words and I know there must be another, simpler way.

I'm doing this a bit differently. Instead of trying to invent my own
spam filters (reinventing the wheel, again), I'm using some free tools
that are available. My mail is scanned for "spamability" on my Linux
box using the free SpamAssassin (www.spamassassin.org) and MIMEDefang
(www.roaringpenguin.com) tools. By the time TB downloads my messages
they've already been assessed for SPAM probability.

If you don't have an email server you can add antispam software to,
you're still in luck. There's also a free Win32 version called SAProxy
(http://saproxy.bloomba.com/) that you can very easily use with ANY
email client. It scans messages and flags them as SPAM (based on your
configuration) as they are downloaded from your server to your client.
I use this to filter my work email, since my employer doesn't have any
spam filtering system in place.

To use it, simply change two things in your email client configuration
and voila, all of your incoming mail is scanned for spaminess and
marked up accordingly (rewriting the subject, adding a hidden header
line, or whatever you configure it to do). Then you can add a filter to
TheBat to act on this flag (if header X-SPAM: Yes exists, then
automatically delete the message, for example). Works very, very well.

SpamAssassin does more than look for keywords like "sex" and "viagra".
It also looks for things like SHOUTING TOO MUCH, "as seen on TV",
subjects with numbers at the end, too many recipients, unlisted
recipients, "click here to remove", and so on. Heck, this message might
get flagged just because it contains too many of those phrases. It also
looks for things like single-image HTML emails (the latest trick by
spammers to get around keyword filters), HTML emails with too much red
text, etc. Someone else is already working to keep ahead of the
spammers, and I don't have the time to keep my own filters updated.

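To make the rule-scoring idea a bit more concrete, here is a minimal
Python sketch of several weighted tests adding up to a spam score. The
rule functions, weights, and threshold are all made up for
illustration; they are not SpamAssassin's actual rules or scores, and
the sketch only handles simple non-multipart messages.

```python
import re
from email import message_from_string
from email.message import Message


def body_text(msg: Message) -> str:
    """Return the payload of a simple, non-multipart message as text."""
    payload = msg.get_payload()
    return payload if isinstance(payload, str) else ""


def shouting(msg: Message) -> bool:
    """True if more than half of the letters in the body are upper case."""
    letters = [c for c in body_text(msg) if c.isalpha()]
    if len(letters) < 20:  # too short to judge
        return False
    upper = sum(1 for c in letters if c.isupper())
    return upper / len(letters) > 0.5


def subject_ends_in_number(msg: Message) -> bool:
    """True if the Subject line ends with digits."""
    return re.search(r"\d+\s*$", msg.get("Subject", "")) is not None


def too_many_recipients(msg: Message) -> bool:
    """Crude check: more than 10 comma-separated entries in the To header."""
    return len(msg.get("To", "").split(",")) > 10


def remove_me_phrase(msg: Message) -> bool:
    """True if the body contains the classic opt-out phrase."""
    return "click here to remove" in body_text(msg).lower()


# Hypothetical (rule, weight) pairs; the weights are invented for this sketch.
RULES = [
    (shouting, 1.5),
    (subject_ends_in_number, 1.0),
    (too_many_recipients, 1.0),
    (remove_me_phrase, 2.0),
]

# Hypothetical cut-off; a message scoring at or above this counts as spam.
THRESHOLD = 3.0


def score(raw: str) -> float:
    """Sum the weights of every rule that matches the raw message."""
    msg = message_from_string(raw)
    return sum(weight for rule, weight in RULES if rule(msg))


def is_spam(raw: str) -> bool:
    """Apply the threshold to the total score."""
    return score(raw) >= THRESHOLD
```

The point of the design is that no single test decides the outcome: a
message only gets flagged when enough independent hints pile up, which
is why this kind of scoring is harder to dodge than a plain keyword
list.
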
Plus, SpamAssassin can optionally plug into Razor and the RBL
databases. Razor fingerprints messages and compares incoming messages
against a database of known spam messages; the RBLs let you reject
email from known spam-only hosts. This is all stuff TB filters can't
handle, can't handle very well, or would be a complete bear to set up
and maintain. So I'll let my email server or SAProxy on my desktop
decide whether it thinks a message is SPAM, and then I'll filter in TB
based on what SpamAssassin determines.

But the coolest tool on the horizon is contained in the latest version
of SpamAssassin: Bayesian filtering. This sounds like the most
promising *learning* method of distinguishing SPAM from "HAM" (good
messages). From what I've read it has a very low (1%) false-positive
rate. That's better than practically all other tests in use at the
moment.

But to make Bayesian filtering work well, you need to send it SPAM and
HAM messages so it can learn the difference. So what I'm ultimately
looking for is a way to save messages designated as SPAM in MBOX format
with their original headers. Same with my HAM messages. Then I need to
send these two mboxes to my Linux box and tell SpamAssassin to "train"
itself using the latest email messages I'm feeding it. It's this
training process I would like to automate.

Anyhow, I thought you might find all that interesting... :)

--
James

________________________________________________
Current version is 1.62 | "Using TBUDL" information:
http://www.silverstones.com/thebat/TBUDLInfo.html
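
For the training step described above, here is a rough Python sketch
of what the automation could look like once the messages are available
as raw text with headers. It assumes SpamAssassin's sa-learn tool is
installed on the Linux box and accepts --spam/--ham together with
--mbox; the function names and file names are hypothetical, and getting
the messages out of TB and over to the server is left out.

```python
import mailbox
import subprocess
from email import message_from_string


def write_mbox(raw_messages, path):
    """Append raw messages (original headers intact) to an mbox file.

    Returns the number of messages written. No file locking is done, so
    nothing else should be writing to the same file at the same time.
    """
    box = mailbox.mbox(str(path))
    count = 0
    try:
        for raw in raw_messages:
            box.add(message_from_string(raw))
            count += 1
        box.flush()
    finally:
        box.close()
    return count


def sa_learn_command(kind, path):
    """Build the sa-learn command line for a 'spam' or 'ham' mbox."""
    if kind not in ("spam", "ham"):
        raise ValueError("kind must be 'spam' or 'ham'")
    return ["sa-learn", "--" + kind, "--mbox", str(path)]


def train(kind, path):
    """Run sa-learn on the mbox; must run where SpamAssassin is installed."""
    subprocess.run(sa_learn_command(kind, path), check=True)
```

A nightly job on the server could then call train("spam", "spam.mbox")
and train("ham", "ham.mbox") on whatever was collected that day.
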

