On Thursday, February 02, 2006 10:35 PM -0600, Bob Posert wrote:

> Back in
> http://mail.python.org/pipermail/spambayes/2006-January/018702.html
> , Tim Peters and I had a dialog about training on unusual ham -
> monthly messages from http://www.boldtype.com. I just got another
> one and it scored 50% on the spam scale. The clues follow - I'd
> really appreciate any help. Thanks, Bob
>
> Combined Score: 50% (0.5)
> Internal ham score (*H*): 1
> Internal spam score (*S*): 1
>
> # ham trained on: 1229
> # spam trained on: 20331

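For what it's worth, the 50% follows directly from the two internal scores quoted above. As I understand the chi-squared combining that SpamBayes uses by default, the final score is (S - H + 1) / 2, so when the ham and spam indicators are both 1 they cancel out and the message lands exactly in the middle. Here is a minimal sketch of that arithmetic; the function name is mine for illustration and is not part of the SpamBayes API.

```python
def combined_score(ham_score: float, spam_score: float) -> float:
    """Fold the internal *H* and *S* indicators into a single 0..1 score.

    Illustrative helper only (hypothetical name), assuming the
    (S - H + 1) / 2 combining rule described above.
    """
    return (spam_score - ham_score + 1.0) / 2.0


# The values from the quoted clue report: *H* = 1 and *S* = 1.
print(combined_score(ham_score=1.0, spam_score=1.0))  # 0.5
```

In other words, the classifier saw strong evidence in both directions, which is why the message comes out as unsure rather than as ham or spam.
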
Something else worth mentioning is the sheer size of the training set: more than 21,000 messages in total. I'm not aware of much evidence that a large training set harms accuracy, but most people get very good results with a few hundred to a few thousand trained messages, and some have reported good results with on the order of 50 of each type. If nothing else, training on this many messages makes the database very large.

--
Seth Goodman
