kudret> # ham trained on: 2559
kudret> # spam trained on: 48

Looks very imbalanced. We usually see the imbalance in the other direction (lots of spam, little ham), but a ratio this far out of whack in either direction can cause problems. I suggest you clear your database out completely and start from scratch. Train a couple of hams, then a couple of spams. Rescore everything. Train on a couple of mistakes or unsures of each type. Rescore the rest.

kudret> How is that possible 2 similar token list, and one of them gets
kudret> %45, the other is %0 ?

The many hammy tokens in the second message outweigh the few spammy tokens. In the first message the relative numbers of hammy and spammy tokens are more balanced, so the overall score is nearer to the middle.

Skip

_______________________________________________
SpamBayes@python.org
Info/Unsubscribe: http://mail.python.org/mailman/listinfo/spambayes
Check the FAQ before asking: http://spambayes.sf.net/faq.html
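
To make the point about hammy tokens outweighing spammy ones concrete, here is a rough sketch of chi-squared combining of per-token spam probabilities. It is a simplified illustration, not the actual SpamBayes code: the function names (chi2q, combine) and the token probabilities are made up for the example.

```python
import math

def chi2q(x2, dof):
    """Survival function of the chi-squared distribution for even dof."""
    m = x2 / 2.0
    term = total = math.exp(-m)
    for i in range(1, dof // 2):
        term *= m / i
        total += term
    return min(total, 1.0)

def combine(probs):
    """Combine per-token spam probabilities into a score in [0, 1].

    probs: hypothetical per-token spam probabilities, each strictly
    between 0 and 1 (illustrative values, not real training data).
    """
    n = len(probs)
    # Evidence for spam: small when tokens look hammy (p near 0).
    s = 1.0 - chi2q(-2.0 * sum(math.log(1.0 - p) for p in probs), 2 * n)
    # Evidence for ham: small when tokens look spammy (p near 1).
    h = 1.0 - chi2q(-2.0 * sum(math.log(p) for p in probs), 2 * n)
    return (s - h + 1.0) / 2.0

# Three spammy tokens against three hammy ones: score lands near the middle.
balanced = [0.99] * 3 + [0.01] * 3
# The same three spammy tokens swamped by twenty hammy ones: score near 0.
mostly_ham = [0.99] * 3 + [0.01] * 20

print("balanced:   %.2f" % combine(balanced))
print("mostly ham: %.2f" % combine(mostly_ham))
```

With these invented numbers the two messages share the same spammy tokens, yet the second one scores close to 0 because the ham evidence is so much stronger.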