Hi Marc,
Occasionally I do find an email that was not really spam and I have to
do two things with it, send it back to James's nospam mailet, and
forward it on to the user who should have gotten it. This latter step
is a bit of a pain because I usually have to clean up the email (or
forward it as is).
Is there an easy way to send the original version of an email to the
user which was accidentally marked as spam? I hope I don't have to
keep double checking all this spam, I get thousands daily... How do
you and others handle the spam that the Bayesian filter catches?
Suggestions welcomed, this could get rather tedious for me...
It will get better quite quickly until you reach the 'maintenance'
level. At that point your workload will diminish to the point where you
only have to mark single digits of emails as spam a day. My spam
database has analysed over 16,000 examples of spam in the last 3 years
and I still get flurries where a new flavour of spam slips through. I
guess there are only so many ways to spell that word beginning with 'V'!
After a while I stopped forwarding false positives completely. This
sounds bad, but like you, I didn't fancy manually forwarding messages to
users for the rest of my life. One thing which I found helps
enormously is to set up the whitelist manager in James. This ensures
that when your users send a message to someone their address is added to
a 'whitelist'. If an email is received where the from address matches
someone in the whitelist then it is let through without spam checking.
Like the Bayesian filter your users can manually add or remove addresses
from the whitelist by sending special emails to the whitelist manager
email address.
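To make the whitelist idea concrete, here is a rough sketch of the logic in Python. It is not how James implements it; the class and method names are made up for illustration, and I am assuming each local user has their own list of addresses.

```python
from collections import defaultdict


class Whitelist:
    """Hypothetical per-user whitelist; not the James implementation."""

    def __init__(self):
        # Maps a local user's address to the set of addresses they trust.
        self._entries = defaultdict(set)

    def record_outgoing(self, local_user, recipient):
        # Called when a local user sends mail: remember who they wrote to.
        self._entries[local_user.lower()].add(recipient.lower())

    def remove(self, local_user, address):
        # Manual removal, e.g. triggered by a special email to the manager.
        self._entries[local_user.lower()].discard(address.lower())

    def skips_spam_check(self, local_user, from_address):
        # Incoming mail from a whitelisted sender bypasses the spam filter.
        return from_address.lower() in self._entries.get(local_user.lower(), set())
```

The point is simply that anyone your users have already written to is trusted, so replies from those people never reach the Bayesian filter at all.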
Another alternative you could try until your filter is working well
enough is to simply raise the spam score threshold to a really high
figure like 95%. This will make the filter more lenient and reduce the
false positives at the expense of letting more spam through. Be aware
though that the spam score tends to fluctuate wildly. You will tend to
find that most scores are either very very small or very very large and
only relatively few will have a score between 1% and 99%.
If you used my config.xml settings you should find that all emails
passing through your server have a header like this: -
X-MessageIsSpamProbability: 3.455076140932531E-22
You can use this to satisfy your curiosity about any email's score.
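If you want to check scores outside of James, something like the following Python sketch would read the header from a raw message and compare it with a threshold. I am assuming the header value is a probability between 0 and 1 (as the example above suggests) and that 0.95 stands in for the 95% threshold mentioned earlier; the function names are just illustrative.

```python
from email import message_from_string

HEADER = "X-MessageIsSpamProbability"


def spam_probability(raw_message):
    """Return the score from the header as a float, or None if it is absent."""
    msg = message_from_string(raw_message)
    value = msg.get(HEADER)
    if value is None:
        return None
    return float(value.strip())


def is_spam(raw_message, threshold=0.95):
    """True if the message carries a score at or above the threshold (assumed rule)."""
    score = spam_probability(raw_message)
    return score is not None and score >= threshold
```

With the example header above the score is so close to zero that the message would pass at any sensible threshold.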
Another thing I like to do sometimes to gauge how effectively the spam
filter is working is to search the daily mailet log file and count the
number of lines containing '%;'. This is because every analysis result
is recorded like the following: -
29/03/08 17:59:41 INFO James.Mailet: BayesianAnalysis:
X-MessageIsSpamProbability: 100%; From: [EMAIL PROTECTED];
Recipient(s): [EMAIL PROTECTED]
So a command like: -
egrep '100%;' mailet-2008-09-27-17-14.log | wc -l
will give you an indication of how many emails have been rejected that
day. If you do that over several months it gives you an idea how fast
spam levels are rising!
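If you would rather see both numbers at once (messages analysed and messages scored 100%), a small Python sketch along these lines should do it. I am assuming each analysis result sits on a single line in the log file (the example above is only wrapped by the mail client) and that the function name is just a placeholder.

```python
def summarise(lines):
    """Count analysed messages and those scored 100% in mailet log lines."""
    analysed = 0
    rejected = 0
    for line in lines:
        # Every analysis result contains a percentage followed by ';'.
        if "%;" in line:
            analysed += 1
            if "X-MessageIsSpamProbability: 100%;" in line:
                rejected += 1
    return analysed, rejected


# Example use with a log file:
#   with open("mailet-2008-09-27-17-14.log") as log:
#       analysed, rejected = summarise(log)
```

Dividing the second number by the first gives a rough daily spam ratio, which is easier to compare across months than the raw count.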
Regards,
David Legg