> Aargh! I just painstakingly rebuilt my database by creating
> as accurate a spam and ham files as I could.

What did you use to rebuild? It seems likely that the problem occurred then.

> [EMAIL PROTECTED] ~]$ sb_dbexpimp.py -e -p .hammiedb -f hammiedb.csv
> Traceback (most recent call last):
[...]
>     tempbayes = pickle.load(fp)
> EOFError
>
> Suggestions?

Is .hammiedb a pickle or a bsddb database? It looks like it's a bsddb database, but you're telling sb_dbexpimp.py that it's a pickle. Try this:

    sb_dbexpimp.py -e -d .hammiedb -f hammiedb.csv

> Anyway, this corruption of tokens sounds like a bug to me.

It's probably not corruption of the token counts, but of the total number of messages trained (at least that was the case ages back, when this problem was common). It shouldn't be possible for training that completes successfully to cause this problem - if it does, then yes, it's a bug (and if you can figure out a way for that to happen, please open a bug on the SourceForge tracker and we'll address it).

=Tony.Meyer

-- 
Please always include the list (spambayes at python.org) in your replies (reply-all), and please don't send me personal mail about SpamBayes.
http://www.massey.ac.nz/~tameyer/writing/reply_all.html explains this.
_______________________________________________
[email protected]
http://mail.python.org/mailman/listinfo/spambayes
Check the FAQ before asking: http://spambayes.sf.net/faq.html
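
If it's unclear whether a database file is a pickle or a bsddb database, one rough way to check is to look for a Berkeley DB magic number in the file header before choosing between -p and -d. This is only a sketch and not part of SpamBayes: the helper names looks_like_bsddb and suggest_export_flag are hypothetical, and it assumes the file uses the Berkeley DB hash or btree format, with the magic number at byte offset 12 (or at offset 0 for very old files). Anything without a recognised magic number is simply assumed to be a pickle.

```python
import struct

# Berkeley DB magic numbers for the hash and btree access methods.
BDB_MAGICS = {0x00061561, 0x00053162}


def looks_like_bsddb(path):
    """Return True if the file header carries a Berkeley DB magic number."""
    with open(path, "rb") as fp:
        header = fp.read(16)
    if len(header) < 16:
        # Too short to hold a Berkeley DB metadata header.
        return False
    # Newer files keep the magic at offset 12, very old ones at offset 0.
    # The byte order depends on the machine that created the file, so
    # both little-endian and big-endian readings are tried.
    for chunk in (header[12:16], header[0:4]):
        for fmt in ("<I", ">I"):
            (magic,) = struct.unpack(fmt, chunk)
            if magic in BDB_MAGICS:
                return True
    return False


def suggest_export_flag(path):
    """Guess the sb_dbexpimp.py storage flag: -d for dbm, -p for pickle."""
    return "-d" if looks_like_bsddb(path) else "-p"
```

For example, suggest_export_flag(".hammiedb") would return "-d" for a bsddb file. A "-p" result only means that no Berkeley DB header was found; it does not prove that the file is a valid pickle.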
