> On the front page of today's press there is a story of a spam
> merchant who lives in CHC, basically lauding his rise to riches.

Hopefully The Press will follow it up with plenty of coverage of the
results of these rotters' actions.

> has anyone got Bayesian filtering going on their mail server

spamassassin can be hooked into MTAs, though I only use it for
client-side filtering. It is supposed to be *the* state-of-the-art spam
filter.

Btw I think all this Bayesian stuff is overrated. It's not the silver
bullet. Read the instructions for the Bayesian filter in spamassassin
first. They say "must be trained to be effective", "must be trained with
thousands of hand-sorted emails, both spam and non-spam", and "training
with a huge number of spam and only a few non-spam is not unlikely to
have the opposite effect", i.e. it makes spam detection rates worse.
What do you do with emails which contain some header rubbish and an
empty body? Do they trip up the training, and if so, how? Should you
run mailing list emails through the spam filter (their headers show
signs of automated mass mailouts)?
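
To make the training concern concrete, here is a minimal sketch of a
naive Bayes scorer in Python. It is NOT how spamassassin implements its
Bayes rules; the class name TinyBayes, the tokeniser and the add-one
smoothing are all assumptions made up for illustration. It only shows
why the ratio of spam to non-spam in the training set matters.

```python
import math
import re
from collections import Counter

# Hypothetical tokeniser: lower-case words, digits, "$" and "'".
TOKEN = re.compile(r"[a-z0-9$']+")


def tokens(text):
    """Return the set of distinct tokens in a message."""
    return set(TOKEN.findall(text.lower()))


class TinyBayes:
    """Toy naive Bayes scorer; only tokens present in a message count."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.messages = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Record one hand-sorted message; label is 'spam' or 'ham'."""
        self.messages[label] += 1
        self.counts[label].update(tokens(text))

    def spam_probability(self, text):
        """Estimate P(spam | text) from the training seen so far."""
        n_spam = self.messages["spam"]
        n_ham = self.messages["ham"]
        # The prior comes straight from how much of each class was trained.
        log_odds = math.log((n_spam + 1) / (n_ham + 1))
        for tok in tokens(text):
            # Add-one smoothing so unseen tokens never give zero.
            p_spam = (self.counts["spam"][tok] + 1) / (n_spam + 2)
            p_ham = (self.counts["ham"][tok] + 1) / (n_ham + 2)
            log_odds += math.log(p_spam / p_ham)
        # Clamp to keep math.exp() from overflowing on extreme scores.
        log_odds = max(-700.0, min(700.0, log_odds))
        return 1.0 / (1.0 + math.exp(-log_odds))
```

With this sketch, a message with an empty body has no tokens at all, so
its score is just the prior. Train it on 50 spams and a single non-spam
and every empty message comes out as "almost certainly spam", which is
the sort of skew the instructions warn about.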

You could of course try and find a better Bayesian filter. Let us know
when you have. I haven't searched myself, but I'm sure Google turns up
plenty of discussions on why none of the techniques works all the time.
This was also the conclusion of the recent article(s) in New Scientist.
(My personal take is that something will only happen once the Americans
change their laws such that you can sue spammers' balls off, or else
once huge corporates get past some irritation threshold *and* find a
means to combat the problem effectively.)

Anyone who claims a 100% recognition rate either doesn't know what
(s)he is talking about, is concealing some relevant details, or doesn't
know what a spam problem is. I've run a deluge of spam through
the latest stable spamassassin (2.55) for the past few weeks, and it is
obvious that some of the missed spams are very difficult to detect with
any automated method, or are specifically crafted to score low on
spamassassin. You would have that problem with any well-known spam
filter, and the not well-known ones are probably not worth using. You
also need to be careful with your regexps, or else you end up running a
DoS on your own mail server; my guess is that this caution does reduce
the recognition rate. There are Nigerian spams which score very low on
spamassassin (I've got at least one somewhere).
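
For the regexp point, a small Python sketch of what "running a DoS on
your mail server" can look like. Both patterns are made up for
illustration and are not spamassassin rules; the helper name
timed_match is also an assumption. A nested quantifier backtracks
exponentially on input that almost matches, while the plain pattern
fails at once.

```python
import re
import time


def timed_match(pattern, text):
    """Return (match object or None, seconds spent) for re.match()."""
    start = time.perf_counter()
    result = re.match(pattern, text)
    return result, time.perf_counter() - start


# 20 letters followed by one character that makes the match fail.
# Each extra "a" roughly doubles the work for the nested pattern,
# so keep this number small when trying it out.
almost_matching = "a" * 20 + "b"

# Nested quantifier: the engine tries every way of splitting the run
# of "a"s into groups before it can give up.
slow_result, slow_secs = timed_match(r"(a+)+$", almost_matching)

# Same intent without nesting: fails after a handful of steps.
fast_result, fast_secs = timed_match(r"a+$", almost_matching)

print(f"nested: {slow_secs:.4f}s  plain: {fast_secs:.6f}s")
```

Neither pattern matches, but the nested one burns far more CPU getting
there, and a spammer who knows the rule can send input crafted to hit
it. Avoiding such patterns is the kind of care that limits what a rule
can express.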

Volker

-- 
Volker Kuhlmann                 is possibly list0570 with the domain in header
http://volker.dnsalias.net/             Please do not CC list postings to me.
