Sorry, Xyliu; I actually don't have it split out in a way that
would be easy to send you.
However, NIST publishes their corpus; the most recent one is 93,000
messages with adjudicated statuses. You might want to get that and
use that. As another advantage, it's a published corpus, so any
results you get are directly comparable with other people's.
- Bill Yerazunis
_______________________________________________
http://mail.python.org/mailman/listinfo/spambayes
Check the FAQ before asking: http://spambayes.sf.net/faq.html