The most common training problem we generally encounter is too much spam in
the database.  Martin v. Loewis and I implemented a SpamBayes trainer for
bugs.python.org which seems to be exhibiting the reverse.  I'm just
training now through the web interface (looks just like the SpamBayes POP3
proxy training interface).  I see about 25% unsures and the rest hams.  A
couple of messages scored as spam, but they were mistakes.

I'm only training a couple of unsures each day as ham and discarding the
rest.  Still, I've got a database with essentially all hams.  What might I
expect in the way of accuracy?

Skip

_______________________________________________
spambayes-dev mailing list
spambayes-dev@python.org
http://mail.python.org/mailman/listinfo/spambayes-dev
