1. update bogofilter's wordlists with every incoming message, using the -u option. if i understand it, -u will first classify the spam, then update bogofilter's wordlist. that seems like asking for trouble. if you filter to /dev/null based on bogofilter's output, how do you correct mistakes? and it seems like mistakes here will cause more mistakes in the future.
i assume you do this with:
:0fw
| bogofilter -f -p -u -l -e -v
also, shouldn't there be a "c" in the procmail colon line? how does mail get past this recipe? isn't it considered "delivered" when an email matches a recipe unless you use ":0c"?
A procmail recipe tagged with "f" is a filtering recipe. Procmail pipes the message through the specified program, then continues on using the filtered version of the message. It's not a delivering recipe, so "c" isn't needed.
I seeded bogofilter just like you did. I use maildirs for my email, so every message is in a separate file. I built a list of every message less than a year old, divided the messages into spam and non-spam, and piped each set into bogofilter.
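As a rough sketch of that seeding step, something like the following could work. The function name, the BOGOFILTER override, and the folder paths in the example are assumptions for illustration, not what was actually run. It only relies on bogofilter's -s (register as spam) and -n (register as non-spam) options reading one message on stdin.

```shell
#!/bin/sh
# Hypothetical helper: register every message file in a maildir folder
# that was modified within the last 365 days.
#   $1 = maildir folder (its cur/ and new/ subdirectories are scanned)
#   $2 = bogofilter registration flag: -s for spam, -n for non-spam
# BOGOFILTER may be set to another command, e.g. for a dry run.
seed_from_maildir() {
    dir=$1
    flag=$2
    find "$dir/cur" "$dir/new" -type f -mtime -365 |
    while IFS= read -r msg; do
        # One message per invocation, fed on stdin
        ${BOGOFILTER:-bogofilter} "$flag" < "$msg"
    done
}

# Example calls (folder names are assumptions):
# seed_from_maildir "$HOME/Maildir/.Spam" -s
# seed_from_maildir "$HOME/Maildir" -n
```

Running bogofilter once per message is slow for a large archive, but it avoids depending on any bulk-mode options that your version may or may not have.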
Incoming mail is piped through this set of rules:
:0 fw
| /usr/bin/bogofilter -u -2 -p -e

# Spam? Save it in the spam folder
:0
* ^X-Bogosity: (yes|spam)
$SPAM

It's a good idea to collect your spam rather than deleting it. You might want to delete your wordlist one day and build a new one; you'll need a collection of current spam to do that. More important, any time bogofilter makes a mistake you need to correct it, whether it was a false positive or false negative. I can't remember the last time I found non-spam in my spam folder, but it does happen from time to time.
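To make "correct it" concrete, here is a minimal sketch for fixing a single message by hand. The function name and the BOGOFILTER override are made up for the example. It assumes the message was already registered by -u when it arrived, so the correction has to undo that registration (-S or -N) and add the right one (-n or -s).

```shell
#!/bin/sh
# Hypothetical helper: re-train bogofilter on one misclassified message.
#   $1 = message file
#   $2 = what the message really is: "spam" or "ham"
correct_message() {
    case $2 in
        spam) flags=-Ns ;;  # undo the non-spam registration, register as spam
        ham)  flags=-Sn ;;  # undo the spam registration, register as non-spam
        *)    echo "usage: correct_message FILE spam|ham" >&2
              return 2 ;;
    esac
    ${BOGOFILTER:-bogofilter} "$flags" < "$1"
}

# Example (file name is an assumption):
# correct_message ./missed-spam.eml spam
```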
You'll need to find a method of feeding mail back into bogofilter that works for you. I copy the mail into a special mailbox that's swept by a cron job several times per day. These messages are fed back into procmail using a special set of rules:
# Messages labelled spam. Tell bogofilter it's not, and save to INBOX
:0
* ^X-Bogosity: (Spam|Yes)
{
    :0 c
    | /usr/bin/bogofilter -Sn

    :0
    $DEFAULT
}

# Messages not labelled spam.
:0 E
{
    :0 c
    * ^X-Bogosity: (ham|no)
    | /usr/bin/bogofilter -Ns

    :0
    $SPAM
}

Note I'm not using bogofilter as a filter this time. Without -p (passthrough mode) it won't output a new copy of the message with the corrected spam header.
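The cron sweep itself could be sketched roughly as below. This is an illustration under assumptions, not the actual script: the function name, the PROCMAIL override, and both paths in the example are hypothetical. It also assumes the correction rules live in their own rcfile, passed to procmail as its rcfile argument, and that this rcfile defines SPAM.

```shell
#!/bin/sh
# Hypothetical sweep: feed each message in the correction maildir to
# procmail with the correction rcfile, then delete it from the maildir.
#   $1 = maildir holding the messages to be corrected
#   $2 = procmail rcfile containing the correction recipes
# PROCMAIL may be set to another command, e.g. for testing.
sweep_corrections() {
    box=$1
    rcfile=$2
    find "$box/cur" "$box/new" -type f |
    while IFS= read -r msg; do
        # Remove the message only if procmail reported success
        if ${PROCMAIL:-procmail} "$rcfile" < "$msg"; then
            rm -f "$msg"
        fi
    done
}

# Example call from a script run by cron (paths are assumptions):
# sweep_corrections "$HOME/Maildir/.Fixme" "$HOME/.procmailrc-fixme"
```

Leaving a message in place when procmail fails means the next sweep retries it, at the cost of a possible double registration if the failure happened after bogofilter had already run.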
--
"We actually do 100,000 pages or more a day in Bork"
-- Marissa Mayer, Google
Kenneth Herron [EMAIL PROTECTED] 916-366-7338
