I was recently advised on this list to enable the Bayesian Analysis mailet on my James server to help control spam that is being sent to some of the mailing lists I sponsor. I have tried to understand and follow the documentation on the James wiki, but so far I have not been able to get it set up and running properly. A couple of points in particular have me confused:

1. On the wiki page it says:

"It is a good idea to activate SMTP AUTH and replace thisdomain.com with a domain not listed as a server in <servernames> in config.xml: this way only authenticated users can feed the corpus. An example of addresses to use could be "[MAILTO] [EMAIL PROTECTED]" and "[MAILTO] [EMAIL PROTECTED]". "

My server is already set up to use SMTP AUTH. I have a single domain name that I purchased from Network Solutions; let's call it mydomain.com. I have listed mydomain.com in the <servernames> section of the James config.xml file. I want this server to serve both internal and external users. So what exactly is this suggestion asking me to do? Do I need to purchase another domain name in order to run the Bayesian Analysis mailet? That does not make sense to me...

I presume (guess) that I can use a more fully qualified host name, which is what I did, and at first glance it seems to have worked. Here I prefixed my domain with the name of the machine on which I am running the James server:

<mailet match="[EMAIL PROTECTED]" class="BayesianAnalysisFeeder">
    <repositoryPath> db://maildb </repositoryPath>
    <feedType>spam</feedType>
    <maxSize>200000</maxSize>
</mailet>
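
The ham feeder follows the same pattern as the spam feeder above. As a rough sketch only: the host name myhost and the mailbox name not.spam are placeholders I made up for illustration, and I am assuming the feed type value for good mail is "ham".

```xml
<!-- Sketch of the ham counterpart to the spam feeder above. -->
<!-- Assumption: "myhost" and "not.spam" are hypothetical placeholders, -->
<!-- not the real machine or mailbox names. -->
<mailet match="RecipientIs=not.spam@myhost.mydomain.com" class="BayesianAnalysisFeeder">
    <repositoryPath> db://maildb </repositoryPath>
    <!-- Assumption: "ham" marks messages fed here as good mail. -->
    <feedType>ham</feedType>
    <maxSize>200000</maxSize>
</mailet>
```

With both in place, mail forwarded to one address would train the spam corpus and mail forwarded to the other would train the ham corpus, at least as I understand the intent.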

2. I am not sure I fully understand the concept of having both a spam and a ham feedback address for the Bayesian analyzer. Spam I can understand: it is used to teach the analyzer what is spam. But why have a ham feedback address? Do the users have to teach the analyzer what is good email as well? That seems like an extraordinary burden to place on them.

3. That last question is related to this next one because I haven't got this working yet, so I don't yet know fully what to expect when I do. I set up my config.xml with as little change as possible: I simply uncommented the two Bayesian analysis feeder mailets and modified the RecipientIs parameters as described above. Then I uncommented the four Bayesian analysis mailets and left them as is. The log files showed that a number of tables were created in the MySQL database without errors. Next I went to the Junk folder in my email client and fed a few pieces of already collected spam back to the Bayesian analysis feeder address as mail attachments, just to get it started. James seemed quite happy to accept them and nothing bounced, so I figured it was working. I left the mail server running to see how it would behave. The trouble is that James ate EVERYTHING that came in from outside senders, although internal users could still send email to each other. So we seem to have lost a whole lot of email today, and I had to turn the Bayesian analyzer off.

So what did I do wrong? It doesn't seem to have worked too well 'out of the box'! The documentation seems to be unclear on some of this or just plain missing; the only thing I could find was on the wiki pages, with nothing in the main documentation. Another question: what does James do with the email that the Bayesian analyzer filters out? I looked in the spam folder in the mail database and nothing was there, nor did the postmaster address receive anything. Do these emails simply go to /dev/null? Is there some kind of summary email sent to the users so that they can verify and retrieve email if necessary? (Perhaps that is the purpose of the ham feedback? I am guessing...)

Hopefully someone can help walk me out of these woods; I am kinda lost. Thanks in advance...

  Marc...







