    kudret> # ham trained on: 2559
    kudret> # spam trained on: 48

That looks very imbalanced.  We usually see the imbalance in the other
direction (lots of spam, little ham), but a ratio this far out of whack in
either direction can cause problems.  I suggest you clear your database out
completely and start from scratch.  Train on a couple of hams, then a couple
of spams.  Rescore everything.  Train on a couple of mistakes or unsures of
each type.  Rescore the rest.

    kudret> How is that possible 2 similar token list, and one of them gets
    kudret> %45, the other is %0 ?

The many hammy tokens in the second message outweigh its few spammy tokens.
In the first message the numbers of hammy and spammy tokens are more
balanced, so the overall score lands nearer the middle.
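
To make that concrete, here is a rough Python sketch of chi-squared
combining, the general technique SpamBayes uses to turn per-token spam
probabilities into one score.  It is a simplified illustration, not the
actual classifier code: the function names and the token probabilities
(0.01 for a hammy token, 0.99 for a spammy one) are made up for the example
and are not taken from your messages.  It also assumes every probability is
strictly between 0 and 1.

```python
import math


def chi2q(x2, dof):
    """Chance that a chi-squared variable with an even `dof` exceeds x2."""
    assert dof % 2 == 0, "this closed form only works for even dof"
    m = x2 / 2.0
    term = total = math.exp(-m)
    for i in range(1, dof // 2):
        term *= m / i
        total += term
    # Rounding can push the sum a hair above 1.0.
    return min(total, 1.0)


def combined_score(probs):
    """Combine per-token spam probabilities into a score in [0, 1]."""
    n = len(probs)
    if n == 0:
        return 0.5  # no evidence either way
    # Work with sums of logs so long token lists do not underflow.
    ln_p = sum(math.log(p) for p in probs)
    ln_not_p = sum(math.log(1.0 - p) for p in probs)
    spam_evidence = 1.0 - chi2q(-2.0 * ln_not_p, 2 * n)
    ham_evidence = 1.0 - chi2q(-2.0 * ln_p, 2 * n)
    return (spam_evidence - ham_evidence + 1.0) / 2.0


# Hypothetical token lists: same kinds of tokens, different proportions.
balanced = [0.01] * 3 + [0.99] * 3    # 3 hammy, 3 spammy
lopsided = [0.01] * 20 + [0.99] * 2   # 20 hammy, 2 spammy

print("balanced:", combined_score(balanced))
print("lopsided:", combined_score(lopsided))
```

With equal numbers of hammy and spammy tokens the two kinds of evidence
cancel and the score sits in the middle; when hammy tokens heavily
outnumber spammy ones the score drops toward 0, even though the same
spammy tokens are still present.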

Skip
_______________________________________________
SpamBayes@python.org
http://mail.python.org/mailman/listinfo/spambayes
Info/Unsubscribe: http://mail.python.org/mailman/listinfo/spambayes
Check the FAQ before asking: http://spambayes.sf.net/faq.html