> Sorry, Skip - I don't. And I was surprised just now to see that we
> apparently never checked test data files into the Sourceforge source tree
> either!
>
> But it shouldn't matter. SB learns pretty quickly, and it would be better to
> use _current_ examples of spam and ham anyway (their characteristics change
> over time).

Sure, but constructing a suitable ham/spam corpus from scratch is a non-trivial task, as you no doubt remember. I could start with the collection on mail.python.org, but I suspect I would let a personal email or three leak through into what's ostensibly a public database. (SpamBayes has been doing a pretty good job over the years at its original assigned task.)

I am looking to ensure that a Py3 port of SpamBayes works the same as the Py2 code.

Skip
_______________________________________________
spambayes-dev mailing list
spambayes-dev@python.org
https://mail.python.org/mailman/listinfo/spambayes-dev