Careful. If I read the user's initial message correctly, what she calls a false negative most of us would call a false positive, i.e. a ham message identified as spam or potential spam. As Kenny points out, a few false negatives are a common annoyance. But false positives can be a more serious problem, since their presence forces you to slog through rivers of spam looking for good messages you might otherwise miss.
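To make the terminology concrete, here is a minimal sketch, assuming spam is treated as the "positive" class. The function names `classify_outcome` and `imbalance_ratio` are made up for illustration and are not part of SpamBayes, and the "unsure" (potential spam) bucket is left out to keep it short. The counts are the ones from the original message.

```python
def classify_outcome(actual: str, predicted: str) -> str:
    """Name a filtering outcome, treating spam as the 'positive' class.

    Hypothetical helper for illustration only; not a SpamBayes function.
    """
    labels = ("ham", "spam")
    if actual not in labels or predicted not in labels:
        raise ValueError("labels must be 'ham' or 'spam'")
    if actual == "ham" and predicted == "spam":
        return "false positive"   # good mail flagged as spam
    if actual == "spam" and predicted == "ham":
        return "false negative"   # spam that slipped into the inbox
    return "true positive" if actual == "spam" else "true negative"


def imbalance_ratio(n_spam: int, n_ham: int) -> float:
    """Ratio of the larger training count to the smaller one."""
    if min(n_spam, n_ham) <= 0:
        raise ValueError("both counts must be positive")
    return max(n_spam, n_ham) / min(n_spam, n_ham)


if __name__ == "__main__":
    print(classify_outcome("ham", "spam"))
    print(classify_outcome("spam", "ham"))
    # 702 spam and 110 ham, as reported in the original message
    print(round(imbalance_ratio(702, 110), 1))
```

With the reported counts the ratio works out to roughly 6.4 to 1, which is the figure being rounded to "7 to 1" in the quoted reply.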
Bob

> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Kenny Pitt
> Sent: Tuesday, October 11, 2005 10:08 AM
> To: mgleich
> Cc: [email protected]
> Subject: Re: [Spambayes] rebuild database?
>
> On 10/10/05, mgleich <[EMAIL PROTECTED]> wrote:
> > I've just realized that although my database is 536kb and that is not
> > so large, it is composed of 702 spam and 110 ham. I gather this is
> > extremely unbalanced and may explain why I'm getting false negatives.
>
> Actually, 7 to 1 is really not an unusually high imbalance.
> We've seen reports from people who have 100 to 1 or higher imbalances.
>
> If you are getting false positives then imbalance is the most
> common cause. A few false negatives are not uncommon, though,
> because spam is constantly changing. If a relatively high
> percentage of your spam is coming in as false negatives, then
> you might have an imbalance problem. The best way to tell for
> sure is to see the spam clues for one of the false negatives,
> which you can generate from the SpamBayes menu.
>
> > Do I need to begin from scratch? If so, do I just delete the db file
> > and will Spambayes just create a new one?
>
> For a 7 to 1 imbalance, I would usually say there is no need
> to begin from scratch. However, SpamBayes learns quickly so
> it shouldn't hurt to start over and see what happens. Since
> you know the size of your DB, you've obviously located the
> file. You will probably see two files with the *.db
> extension, one is the training data and the other contains
> information about the messages that have been processed. Just
> close Outlook, delete these 2 files, then restart Outlook and
> SpamBayes should recreate the databases.
