I'm new to all of this, and I'm not sure whether training with sa-learn is having any effect: this SPAM still scores the same, and Bayes thinks it's less than 1% likely to be SPAM (BAYES_00). I run a small vanity domain for friends and family, so there isn't exactly a ton of training going on, but I'm sure I'm doing it right, since Bayes mostly gives 95-99% for actual SPAM and 0-5% for HAM. I only train on mail I've personally verified as HAM or SPAM, and in fact these e-mails are the only actual SPAM for which I get a 1% probability.

I've attached an example below. There is an HTML component as well, but other than the markup it is identical. My thinking is that there should be some way to write a rule that checks words against a dictionary, but that sounds like an expensive filter process-wise. This poor user gets about 10 of these mails a day.
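To make the idea concrete, here is a rough Python sketch of a check that might be cheaper than a dictionary lookup: instead of validating each word, it only measures how many tokens are very short, since this kind of mail chops words into fragments. The function names, the 3-character cutoff, the 0.6 threshold, and the 20-token minimum are all made-up values for illustration, not anything SpamAssassin provides, and they would need tuning against real HAM before being trusted.

```python
import re

def fragment_ratio(text, max_len=3):
    """Return the fraction of alphabetic tokens that are max_len chars or shorter."""
    tokens = re.findall(r"[A-Za-z]+", text)
    if not tokens:
        return 0.0
    short = sum(1 for t in tokens if len(t) <= max_len)
    return short / len(tokens)

def looks_fragmented(text, threshold=0.6, min_tokens=20):
    """Heuristic guess: enough tokens, and most of them are tiny fragments.

    threshold and min_tokens are illustrative assumptions, not tuned values.
    """
    tokens = re.findall(r"[A-Za-z]+", text)
    return len(tokens) >= min_tokens and fragment_ratio(text) >= threshold
```

This runs in a single pass over the body with no word list to load, which is why it should be cheap, but it is only a heuristic: short messages or text full of abbreviations could trip it, so it would make more sense as one small score contribution than as a verdict on its own.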

---------BODY----------------
http://groups.yahoo.com/group/ayazpahlmu/message/Chat/220686/

Ulti mate ly Ab ou t Per ce nt Of Ind ivi dual Re turn s Qual ifie d Fo r e fu nd s Last Ye ar Tot alin g Abou t Bil lion Th e Re fu nds Aver ge d Ab out The Sa me Am ou nt.




