smontanaro wrote:
>
>     kudret> # ham trained on: 2559
>     kudret> # spam trained on: 48
>
> Looks very imbalanced.  We usually see the imbalance in the other direction
> (lots of spam, few ham), but this far out of whack in either direction might
> present problems.  I suggest you clear your database out completely and
> start from scratch.  Train a couple hams, then a couple spams.  Rescore
> everything.  Train on a couple mistakes or unsures of each type.  Rescore
> the rest.
>
>     kudret> How is that possible 2 similar token list, and one of them gets
>     kudret> %45, the other is %0 ?
>
> So many hammy tokens in the second message outweigh the few spammy tokens.
> In the first message the relative number of hammy and spammy tokens is more
> balanced, thus the overall score is nearer to the middle.
>
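To make the quoted explanation concrete, here is a minimal sketch of chi-squared score combining, which, as far as I know, is the general approach SpamBayes takes. It is not the SpamBayes code itself: the names `chi2q` and `combine_score` and all of the token probabilities are made up for illustration.

```python
import math


def chi2q(x2, v):
    """Probability that a chi-squared variable with v (even) degrees of
    freedom is at least x2."""
    assert v > 0 and v % 2 == 0
    m = x2 / 2.0
    total = term = math.exp(-m)
    for i in range(1, v // 2):
        term *= m / i
        total += term
    return min(total, 1.0)


def combine_score(probs):
    """Combine per-token spam probabilities into one score in [0, 1].

    Hypothetical helper: 0.0 means strongly ham, 1.0 strongly spam,
    and values near 0.5 mean the evidence is mixed.
    """
    n = len(probs)
    if n == 0:
        return 0.5
    ln_spam = sum(math.log(1.0 - p) for p in probs)
    ln_ham = sum(math.log(p) for p in probs)
    s = 1.0 - chi2q(-2.0 * ln_spam, 2 * n)  # spam evidence
    h = 1.0 - chi2q(-2.0 * ln_ham, 2 * n)   # ham evidence
    return (s - h + 1.0) / 2.0


# Illustrative token probabilities only (not taken from the reports).
mostly_hammy = [0.05] * 20 + [0.95] * 3  # many hammy, few spammy tokens
balanced = [0.1] * 3 + [0.9] * 3         # equal hammy and spammy tokens

print("mostly hammy:", combine_score(mostly_hammy))
print("balanced:    ", combine_score(balanced))
```

With these made-up inputs, the message dominated by hammy tokens scores very close to 0, while the balanced one lands near the middle, even though both contain the same kind of spammy tokens.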
I'm not keeping my spam emails (I don't see any point in doing so); I
permanently delete all of them. This imbalance arose after I cleared the
database and started training from scratch, so it is certainly not the
problem here.

Besides, my 2nd question was "how can I get a spammer list?" In other words,
instead of waiting for weeks to build a spam database, there has to be a
table somewhere that I can download to save a lot of time.

If you look at my 2 sample reports, neither has any problem in the original
email part. I forgot to mention this, but at the bottom you will see the AVG
virus check report. This part somehow shows some spam tokens. First, it
should not be considered spam. 2nd, both reports look the same, so the
results should not be so different. And 3rd, like I said at the beginning,
if my friend sends me 2 emails on the same day, they go to the junk folder
when I check my email. This does not make any sense.

-- 
View this message in context: http://www.nabble.com/Training-problem-tf4365445.html#a12446072
Sent from the Spambayes - General mailing list archive at Nabble.com.