> I was checking out the state of spam filtering on my servers
> today and noticed in the logs a lot of the following errors:
>
> Traceback (most recent call last):
> [...]
> assert spamcount <= nspam
> AssertionError
> [...]

What this means is that one or more tokens have been seen in more spam messages than the total number of spam messages you have trained on (obviously impossible). This error is pretty uncommon these days. The most likely way for it to occur is if writing the database was somehow interrupted (but the database itself didn't get corrupted). Or, since you mention upgrading, maybe this was caused by an old bug that has since been fixed (depending on how old the version you were using is).

> I even tried to upgrade to the latest 1.0.4 version and it's still
> happening.

Once it's happened, it will continue to happen (for any message that has the bad tokens in it) until the database is fixed. Hopefully upgrading will prevent it happening again, though.

There are two ways to fix this problem:

 * Remove the existing database and retrain from scratch (recommended, since there might be other problems with the database, which this would fix).

 * Convert the database to CSV (with the sb_dbexpimp.py script), open it in a text editor or spreadsheet, and change the initial two numbers to be greater than or equal to the largest numbers in the ham/spam columns (that should make more sense once you're looking at the file).

=Tony.Meyer

-- 
Please always include the list (spambayes at python.org) in your replies (reply-all), and please don't send me personal mail about SpamBayes.
http://www.massey.ac.nz/~tameyer/writing/reply_all.html explains this.
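
P.S. If the exported file is too big to fix comfortably by hand, the second option could be scripted along these lines. This is only a rough sketch, not part of SpamBayes: it assumes the exported CSV has a first row holding the two totals (ham, then spam) and that every later row is token, ham count, spam count. Check that against your own file before trusting it. The names repair_counts and repair_file are made up for illustration.

```python
import csv


def repair_counts(rows):
    """Raise the two totals in the first row so no token count exceeds them.

    ASSUMPTION: rows[0] is [nham, nspam] and every later row is
    [token, hamcount, spamcount]. Verify this against your export.
    """
    rows = [list(r) for r in rows]
    nham, nspam = int(rows[0][0]), int(rows[0][1])
    for row in rows[1:]:
        # Each total must be at least as large as the biggest per-token count.
        nham = max(nham, int(row[1]))
        nspam = max(nspam, int(row[2]))
    rows[0] = [str(nham), str(nspam)]
    return rows


def repair_file(src, dst):
    """Read the exported CSV from src and write a repaired copy to dst."""
    with open(src, newline="") as f:
        rows = [r for r in csv.reader(f) if r]  # skip blank lines
    with open(dst, "w", newline="") as f:
        csv.writer(f).writerows(repair_counts(rows))
```

You would run repair_file on the exported file and then import the repaired copy with sb_dbexpimp.py. Note that this only makes the totals consistent with the token counts. It doesn't find any other damage, which is why retraining from scratch is still the recommended fix.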
