> My question is there available a collection of recent SPAM emails
> available that I can access to train SpamBayes?

You can get spam from <http://spamarchive.org> and the SpamAssassin Public Archive <http://spamassassin.apache.org/publiccorpus/>. Generally this isn't a good idea, though. You should try to train only on spam that you receive yourself, as that will look most like any future spam that you get. I agree with Jesse that doing train-on-error (including unsures) is the best way to go. <http://entrian.com/sbwiki/TrainingIdeas> has much more about this.

> The problems started when it would classify obvious "SPAM" as
> "HAM" but when you went to correct it, the email was not in the
> review screen,

If mail doesn't appear in the review page (and you haven't changed any of the caching options, and the message arrived less than 7 days ago), then it most likely *wasn't* classified as ham; it probably failed to be classified at all. Check the message for an "X-Spambayes-Exception" header. If there is one, something went wrong while trying to classify the message. If you can't figure out what the problem is (the FAQ may help), then ask here, including the content of the exception header.

> With 1.04 I already have over 100 SPAM emails that it has
> received and still all it says is that it's "unsure" it
> almost never categorizes anything as "SPAM" anymore.

See FAQ 4.7: <http://spambayes.org/faq.html#why-did-spambayes-mark-this-obvious-spam-unsure>. The most likely cause is a database imbalance, but we need to see the clues to know for sure.

=Tony.Meyer

-- 
Please always include the list (spambayes at python.org) in your replies (reply-all), and please don't send me personal mail about SpamBayes. http://www.massey.ac.nz/~tameyer/writing/reply_all.html explains this.
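
To make the train-on-error idea mentioned above concrete, here is a rough sketch of the loop. It does not use the SpamBayes code at all: ToyClassifier, classify, train_on_error and the cutoff values are hypothetical stand-ins invented for illustration. The point is only the control flow: a message is trained on when the verdict is wrong or unsure, and skipped when the verdict is already right.

```python
from collections import Counter


class ToyClassifier:
    """Hypothetical stand-in for a real classifier; NOT the SpamBayes API."""

    def __init__(self):
        self.spam_tokens = Counter()
        self.ham_tokens = Counter()

    def score(self, text):
        # Fraction of known token occurrences that came from spam;
        # 0.5 (no opinion) when none of the tokens have been seen.
        tokens = text.lower().split()
        spam_hits = sum(self.spam_tokens[t] for t in tokens)
        ham_hits = sum(self.ham_tokens[t] for t in tokens)
        total = spam_hits + ham_hits
        return 0.5 if total == 0 else spam_hits / total

    def train(self, text, is_spam):
        target = self.spam_tokens if is_spam else self.ham_tokens
        target.update(text.lower().split())


def classify(score, ham_cutoff=0.2, spam_cutoff=0.9):
    # Cutoff values are assumptions chosen for this sketch.
    if score >= spam_cutoff:
        return "spam"
    if score <= ham_cutoff:
        return "ham"
    return "unsure"


def train_on_error(classifier, messages):
    """Train only on messages that were misclassified or scored unsure."""
    trained = 0
    for text, is_spam in messages:
        verdict = classify(classifier.score(text))
        expected = "spam" if is_spam else "ham"
        if verdict != expected:  # covers both mistakes and unsures
            classifier.train(text, is_spam)
            trained += 1
    return trained


# Made-up sample mail, for illustration only.
sample_mail = [
    ("cheap pills now", True),
    ("meeting at noon", False),
    ("cheap pills now", True),
]
clf = ToyClassifier()
trained_count = train_on_error(clf, sample_mail)
print(trained_count)
```

Because only the messages the classifier got wrong (or was unsure about) are trained, the database tends to stay small and stays focused on the mail you actually receive.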

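
For the mail that never showed up in the review page: if you have the raw message saved somewhere, a few lines of Python will tell you whether it carries the exception header. This is just a sketch using the standard library email module; the function name and the sample message are made up for illustration, and only the header name comes from the reply above.

```python
import email
from typing import Optional

# Header name as given in the reply above.
EXCEPTION_HEADER = "X-Spambayes-Exception"


def spambayes_exception(raw_message: str) -> Optional[str]:
    """Return the exception header's value, or None if the header is absent."""
    msg = email.message_from_string(raw_message)
    value = msg.get(EXCEPTION_HEADER)  # header lookup is case-insensitive
    return value.strip() if value is not None else None


# Hypothetical message, invented for this example.
sample = (
    "From: someone@example.com\n"
    "Subject: test\n"
    "X-Spambayes-Exception: (hypothetical traceback text)\n"
    "\n"
    "body\n"
)
print(spambayes_exception(sample))
```

If the function returns something other than None, that text is what to include when asking on the list.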