Using the Bayesian Analysis mailet

Marc Chamberlin Sat, 20 Sep 2008 00:13:18 -0700

I was recently advised on this group to enable the BayesianAnalysis mailet on the James server to help control some spam that is being created and sent to some of the mailing lists that I sponsor. I have tried to understand and follow the documentation on the James wiki site, but so far I have not been able to get it set up and running properly. A couple of points in particular have me confused -


1. on the wiki page it says -

"It is a good idea to activate SMTP AUTH and replace thisdomain.com with a domain not listed as a server in <servernames> in config.xml: this way only authenticated users can feed the corpus. An example of addresses to use could be "[MAILTO] [EMAIL PROTECTED]" and "[MAILTO] [EMAIL PROTECTED]"."

My server is already set up to use SMTP AUTH. I have a single domain name that I purchased from Network Solutions; let's call it mydomain.com. I have listed mydomain.com in the <servernames> section of the James config.xml file. I want this server to service both internal and external users. So what exactly is this suggestion asking me to do? Do I need to purchase another domain name in order to run the BayesianAnalysis mailet? That does not make sense to me...

I presume (guess) that I can use a more qualified address, which is what I did and which seems, at first glance, to have worked. Here I preceded my domain with the name of the machine on which I am running the James server.

<mailet match="[EMAIL PROTECTED]" class="BayesianAnalysisFeeder">
           <repositoryPath> db://maildb </repositoryPath>
           <feedType>spam</feedType>
           <maxSize>200000</maxSize>
</mailet>

2. I am not sure I fully understand the concept of having both a spam and a ham feedback to the Bayesian analyzer. Spam I can understand: it is used to teach the analyzer what spam is. But why have a ham feedback? Do the users also have to teach the analyzer what good email is? That seems like an extraordinary burden to place on them.

3. That last question is related to this next one, because I haven't got this working yet, so I don't yet know fully what to expect when I do. I set up my config.xml with as little change as possible: I simply uncommented the two Bayesian analysis feeder mailets and modified the RecipientIs parameters as described above, then uncommented the four Bayesian analysis mailets and left them as is. The log files showed that a bunch of tables got created in the MySQL database OK. Next I went to the Junk folder in my email client and fed a few pieces of already collected spam back to the Bayesian analysis feeder address as mail attachments, just to get it started. James seemed quite happy to accept them and nothing bounced, so I figured it was working. I left the mail server running to see how it would behave, and the trouble is that James ate EVERYTHING that came in from outside senders, although internal users could still send email to each other OK. So we seem to have lost a whole lot of email today, and I had to turn the Bayesian analyzer off.

So what did I do wrong? It doesn't seem to have worked too well 'out of the box'! The documentation seems to be unclear on some of this or just plain missing; the only thing I could find was on the wiki pages, with nothing in the main documentation. Another question: what does James do with the email that it filters out with the Bayesian analyzer? I looked in the spam folder in the mail database and nothing was there, nor did the postmaster address receive anything. Do these emails simply go to /dev/null? Is there some kind of summary email sent to the users so that they can verify or retrieve email if necessary? (Perhaps that is the purpose of the ham feedback? I am guessing.....)

Hopefully someone will help walk me out of these woods; I am kinda lost. Thanks in advance...


  Marc...
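
On the ham question in point 2, here is a minimal sketch of how a Graham-style Bayesian filter typically scores a single token, assuming the usual ratio of spam frequency to ham frequency. It is not the James BayesianAnalysis code, and the class and method names are hypothetical. It only illustrates that the ratio needs counts from both corpora, which is the usual reason such filters ask for ham as well as spam.

```java
public class TokenProbabilitySketch {

    /**
     * Hypothetical helper: estimated probability that a message containing
     * a given token is spam, from per-corpus counts.
     *
     * spamCount / hamCount:       messages in each corpus containing the token
     * spamMessages / hamMessages: total messages in each corpus
     */
    static double spamProbability(int spamCount, int hamCount,
                                  int spamMessages, int hamMessages) {
        // Assumption: with an empty corpus on either side no ratio can be
        // formed, so this sketch falls back to a neutral score.
        if (spamMessages == 0 || hamMessages == 0) {
            return 0.5;
        }
        double spamFreq = Math.min(1.0, (double) spamCount / spamMessages);
        double hamFreq = Math.min(1.0, (double) hamCount / hamMessages);
        // A token never seen in either corpus carries no evidence.
        if (spamFreq + hamFreq == 0.0) {
            return 0.5;
        }
        return spamFreq / (spamFreq + hamFreq);
    }

    public static void main(String[] args) {
        // Token seen only in spam: scored as strongly spammy.
        System.out.println(spamProbability(10, 0, 100, 100));
        // Token equally common in spam and ham: neutral.
        System.out.println(spamProbability(10, 10, 100, 100));
        // Token mostly seen in ham: scored as unlikely to be spam.
        System.out.println(spamProbability(1, 9, 100, 100));
    }
}
```

In this sketch, a token that appears in the spam corpus but has no ham counts to balance it scores at the top of the range, so ordinary words are only pulled back toward neutral once ham has been fed in too.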







