Hi Marc,

> Occasionally I find an email that was not really spam, and I have to do two things with it: send it back to James's nospam mailet, and forward it on to the user who should have gotten it. The latter step is a bit of a pain because I usually have to clean up the email (or forward it as is).

> Is there an easy way to send the original version of an email to the user when it was accidentally marked as spam? I hope I don't have to keep double-checking all this spam; I get thousands daily... How do you and others handle the spam that the Bayesian filter catches? Suggestions welcome; this could get rather tedious for me...

It will get better quite quickly until you reach the 'maintenance' level. At that point your workload will diminish to the point where you only have to mark single digits of emails as spam a day. My spam database has analysed over 16,000 examples of spam in the last 3 years and I still get flurries where a new flavour of spam slips through. I guess there are only so many ways to spell that word beginning with 'V'!

After a while I stopped forwarding false positives completely. This sounds bad, but like you, I didn't fancy manually forwarding messages to users for the rest of my life. One thing which I found helps enormously is to set up the whitelist manager in James. This ensures that when your users send a message to someone, that person's address is added to a 'whitelist'. If an email is received whose From address matches an entry in the whitelist, it is let through without spam checking. As with the Bayesian filter, your users can manually add or remove addresses from the whitelist by sending special emails to the whitelist manager email address.

Another alternative you could try until your filter is working well enough is to simply raise the spam score threshold to a really high figure like 95%. This will make the filter more lenient and reduce the false positives at the expense of letting more spam through. Be aware, though, that spam scores tend to cluster at the extremes: most are either very small or very large, and only relatively few fall between 1% and 99%.

If you used my config.xml settings you should find that all emails passing through your server have a header like this: -

 X-MessageIsSpamProbability: 3.455076140932531E-22


You can use this to satisfy your curiosity about any email's score.
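
If you want to check a score without reading the headers by eye, something along these lines should work on a message saved to a file. This is only a rough sketch: the function names spam_score_of and score_exceeds are made up for illustration and are not part of James, and it assumes the header value is a fraction between 0 and 1, so that a 95% threshold corresponds to 0.95.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; neither is part of James.

# Print the value of the first X-MessageIsSpamProbability header
# found in a message saved to the file given as $1.
spam_score_of() {
    awk '{ sub(/\r$/, "") }          # tolerate CRLF line endings
         /^$/ { exit }               # headers end at the first blank line
         tolower($1) == "x-messageisspamprobability:" { print $2; exit }' "$1"
}

# Exit 0 if score $1 is greater than threshold $2, else exit 1.
# awk converts values such as 3.455076140932531E-22 to numbers.
score_exceeds() {
    awk -v s="$1" -v t="$2" 'BEGIN { exit !((s + 0) > (t + 0)) }'
}

# Usage (message.eml is a placeholder file name):
#   score=$(spam_score_of message.eml)
#   if score_exceeds "$score" 0.95; then echo "above threshold"; fi
```

The comparison is done in awk rather than with the shell's own test operators because the shell cannot compare floating-point values, let alone ones written in scientific notation.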

Another thing I like to do sometimes to gauge how effectively the spam filter is working is to search the daily mailet log file and count the number of lines containing '%;'. This is because every analysis result is recorded like the following: -

29/03/08 17:59:41 INFO James.Mailet: BayesianAnalysis: X-MessageIsSpamProbability: 100%; From: [EMAIL PROTECTED]; Recipient(s): [EMAIL PROTECTED]

So a command like the following, which matches only the '100%;' results: -

 egrep '100%;' mailet-2008-09-27-17-14.log | wc -l

will give you an indication of how many emails have been rejected that day. If you do that over several months it gives you an idea how fast spam levels are rising!
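
To look at the trend over a longer period, the same count can be run over a batch of log files. The sketch below is only an illustration: count_rejected_per_log is a made-up name, and it assumes the logs follow the line format shown above, with one '100%;' entry per rejected message.

```shell
#!/bin/sh
# Hypothetical helper: print "<log file> <count>" for each file given,
# where count is the number of lines containing '100%;'.
count_rejected_per_log() {
    for log in "$@"; do
        # Skip arguments that are not regular files (e.g. an unmatched glob).
        [ -f "$log" ] || continue
        printf '%s %s\n' "$log" "$(grep -c '100%;' "$log")"
    done
}

# Usage (the glob is a placeholder for wherever your logs live):
#   count_rejected_per_log mailet-2008-09-*.log
```

grep -c prints the number of matching lines directly, so there is no need to pipe through wc.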

Regards,
David Legg
