On Thu, October 12, 2006 19:29, Calvin Veltman said:
> Hi,
>
> I've been getting lots of supposed scoops on cheap stocks (19 cents) which
> are supposed to skyrocket tomorrow. Spambayes had not been able to filter
> these out although it has done wonders with Viagra, cheap drugs, etc. So I
> decided to reconfigure the software by accumulating hundreds of Emails with
> this cheap stock junk included. After retraining, it has still not been able
> to detect the junk stock emails. Any suggestions?

I am a native Dutch speaker, so 99% of the email I get in English is spam,
while 99.999% of my Dutch email is ham. That could also explain why spambayes
catches the junk stock mail for me.

Are they text-only, HTML, or image mails? I have seen a lot of cheap stock
spam lately with inline images and no text at all. There isn't much to train
on there, since spambayes can't read images.

I have explored the possibility of OCR-ing every incoming image so that
spambayes can generate more trainable tokens, but it doesn't seem worth the
effort or the overhead (see the mailing list archive from a couple of months
ago).

If you're good with Linux shell scripting, procmail, or Python, you can try
some experiments yourself.

--
Amedee Van Gasse

_______________________________________________
http://mail.python.org/mailman/listinfo/spambayes
Check the FAQ before asking: http://spambayes.sf.net/faq.html
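
For anyone who wants to try the OCR experiment mentioned above, here is a minimal sketch in Python, not a tested spambayes integration. It uses only the standard library `email` package to find image parts and attach whatever text an OCR step returns as an extra text/plain part, so that a filter downstream has some tokens to train on. The function name `add_ocr_text` and the `ocr` callable (bytes in, string out) are assumptions made for illustration; you would have to plug in a real OCR engine yourself.

```python
from email import message_from_bytes
from email.policy import default


def add_ocr_text(raw_message, ocr):
    """Return raw_message (bytes) with OCR text of its images attached.

    `ocr` is a hypothetical callable taking image bytes and returning a
    string; it stands in for whatever OCR engine you choose to use.
    """
    msg = message_from_bytes(raw_message, policy=default)

    # Collect the text first, so the message is not modified while walking it.
    texts = []
    for part in msg.walk():
        if part.get_content_maintype() != "image":
            continue
        data = part.get_payload(decode=True)  # undo base64 etc.
        if not data:
            continue
        text = ocr(data).strip()
        if text:
            texts.append(text)

    # Only touch the message when the OCR step actually produced something.
    if texts:
        msg.add_attachment("\n".join(texts), subtype="plain",
                           filename="ocr.txt")
    return msg.as_bytes()
```

A script like this could read the message on standard input and write the result to standard output, which is the shape a procmail filter expects. The overhead concern still applies: the OCR call runs for every image in every incoming mail.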
