On Wed, 26 Sep 2007, John Calvert wrote:

> I have decided to restart this whole process... setting the bayes
> database back to its initial state & deleting auto-whitelist file.
> 
> Is it good to use a bayes starter DB ?  If so, where can I get a
> good one.

It's not generally a good idea to use *somebody else's* data for your
starter DB - the nature of their email traffic is not likely to be
similar to yours.

This is why it's a good idea to keep the messages you use to train
your bayes, if you're doing manual training - so that you can correct
training errors, and retrain from scratch if necessary. Of course,
that doesn't scale too well if you have large numbers of users and are
autolearning...
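
For what it's worth, a from-scratch retrain off a saved corpus could be
sketched roughly like this. This is only an illustration, not a tested
tool: the function names and the two corpus directories are made up for
the example, and it assumes one message per file (an mbox file would
need sa-learn's --mbox option as well). It defaults to a dry run that
just prints the commands.

```python
import subprocess


def retrain_commands(spam_dir, ham_dir):
    """Build the sa-learn invocations for a from-scratch retrain.

    spam_dir and ham_dir are hypothetical directories holding the saved
    training messages, one message per file.
    """
    return [
        ["sa-learn", "--clear"],                # wipe the existing bayes DB
        ["sa-learn", "--spam", str(spam_dir)],  # relearn the saved spam
        ["sa-learn", "--ham", str(ham_dir)],    # relearn the saved ham
    ]


def retrain(spam_dir, ham_dir, dry_run=True):
    """Print the commands (dry run) or actually run them in order."""
    for cmd in retrain_commands(spam_dir, ham_dir):
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops the sequence if any sa-learn call fails
            subprocess.run(cmd, check=True)
```

Run it as the user who owns the bayes database, and look over the dry
run output before passing dry_run=False.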

If your users retrieve their email from your server using IMAP, here's
one thing you can do: set up SpamAssassin-SPAM and SpamAssassin-HAM
mail folders in each user's mailbox. Have them *move* missed spam to
the SpamAssassin-SPAM folder, and *copy* false positives (messages SA
flagged as spam that aren't) to the SpamAssassin-HAM folder. They can
(and ideally *should*) also copy some ordinary legitimate messages to
their SpamAssassin-HAM folder so that SA can get an idea of what "ham"
looks like.

You can then train off those folders, and retrain as needed. To manage
the training work, you can rotate those folders on a schedule - e.g. on
October 1, everybody's SpamAssassin-HAM becomes
SpamAssassin-HAM-200709 and a fresh, empty SpamAssassin-HAM takes its
place, and likewise for SpamAssassin-SPAM.
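
The naming part of that rotation might look something like the sketch
below. It only works out the new folder name; the function name is
invented for the example, and how the folder actually gets renamed
depends on your mail store, so that part is left out.

```python
from datetime import date, timedelta


def rotated_name(folder, today):
    """Return the archive name for `folder` when rotated on `today`.

    The suffix is the month that just ended, as YYYYMM, since that is
    the month the folder's contents were collected in.
    """
    # Step back from the first of the current month to land in the
    # previous month; this also handles the January -> December case.
    last_day_prev_month = today.replace(day=1) - timedelta(days=1)
    return f"{folder}-{last_day_prev_month:%Y%m}"


print(rotated_name("SpamAssassin-HAM", date(2007, 10, 1)))
# prints SpamAssassin-HAM-200709
```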

I have some scripting for that sort of thing here:

  http://www.impsec.org/~jhardin/antispam/


--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 [EMAIL PROTECTED]    FALaholic #11174     pgpk -a [EMAIL PROTECTED]
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
  Pelley: Will you pledge not to test a nuclear weapon?
  Ahmadeinejad: CIA! Secret prison in Europe! Abu Ghraib!
                   -- Mahmoud Ahmadeinejad clumsily dodges a question
                                    (60 minutes interview, 9/20/2007)
-----------------------------------------------------------------------
 242 days until the Mars Phoenix lander arrives at Mars
