Is there any way to remove individual words or word pairs from the Bayesian
database? If you look at the analysis results below, taken from a job
application email that I received, you will see that none of the words
should really be indicating spam. In fact, for any email that comes in
addressed to our public email address, info_at_jollyfarmer_dot_com, the
probability of that address pair alone being bad is over 98% (see the
first line of the analysis).
The text of the email was:
"At this opportunity, I would to apply for this position or any other
suitable with my qualification
I hope, I am the right person who you are looking for and will be
appreciate if you could allow me to present myself in my mobile phone:
xxx-xxx-xxxx
Thank you
Regards"
Bad Words               Bad Prob    Good Words    Good Prob
info jollyfarmer.com    0.9857
appreciate if           0.9662
                                    03 07         0.0339
right person            0.9593
worker ssub             0.9572
ssub worker             0.9572
am the                  0.9312
to present              0.9180
rcpt info               0.9119
with my                 0.8932
this position           0.8900
person who              0.8858
mobile phone            0.8854
other suitable          0.8285
or any                  0.8245
this opportunity        0.8202
you regards             0.7987
to apply                0.7960
allow me                0.7842
in my                   0.7824
                                    19 2011       0.2232
apply for               0.7530
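
To show how strongly a list like this leans toward spam, here is a minimal sketch that combines the pair probabilities from the analysis above into one score. It assumes the common Graham-style naive Bayes combination, P = prod(p) / (prod(p) + prod(1 - p)); whether ASSP uses exactly this formula is an assumption, and `combine` is a hypothetical helper written for this illustration, not an ASSP function.

```python
import math

# Pair probabilities copied from the analysis above, in the same order.
pair_probs = [
    0.9857, 0.9662, 0.0339, 0.9593, 0.9572, 0.9572, 0.9312, 0.9180,
    0.9119, 0.8932, 0.8900, 0.8858, 0.8854, 0.8285, 0.8245, 0.8202,
    0.7987, 0.7960, 0.7842, 0.7824, 0.2232, 0.7530,
]


def combine(probs):
    """Combine per-pair probabilities into one score (hypothetical helper).

    Assumed formula: P = prod(p) / (prod(p) + prod(1 - p)).
    """
    # Clamp so that log() never sees exactly 0 or 1.
    clamped = [min(max(p, 1e-9), 1.0 - 1e-9) for p in probs]
    # Sum logs instead of multiplying, so long lists do not underflow.
    log_bad = sum(math.log(p) for p in clamped)
    log_good = sum(math.log(1.0 - p) for p in clamped)
    diff = log_good - log_bad
    if diff > 700.0:
        # exp() would overflow; the score is effectively zero.
        return 0.0
    return 1.0 / (1.0 + math.exp(diff))


score = combine(pair_probs)
# Same message with the "info jollyfarmer.com" pair left out.
score_without_address = combine(pair_probs[1:])

print(f"combined score:            {score:.6f}")
print(f"without the address pair:  {score_without_address:.6f}")
```

Under this assumed formula, leaving out a single pair such as the address pair changes the combined score very little, because so many of the other pairs are also rated bad. That is why being able to remove or correct individual entries in the database matters here.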
Thanks
_______________________________________________
Assp-user mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/assp-user