Hello,

Be careful with the character-set matching rules. I was using some of them
and got a high rate of false positives (FPs), mainly because of the koi8-r
charset: scoring against it meant I was also scoring against perfectly
legitimate English-language technical newsletters.

Cheers,
Mike


-----Original Message-----
From: Ned Slider [mailto:n...@unixmail.co.uk] 
Sent: Thursday, 15 January 2009 2:04 p.m.
To: users@spamassassin.apache.org
Subject: Re: Russian spam

Francis Russell wrote:
> Anyone know of any good rule-sets to block this sort of spam?
> 
> http://www.unchartedbackwaters.co.uk/files/russian_spam.txt
> 
> I find that Pyzor and Razor completely miss it as well as the DNS
> blacklists (although I believe this one has a relay in one of the
> Spamhaus ones now). I'm aware of the language whitelisting feature but
> presumably there is a better way than just assuming everything in
> language x is spam?
> 
> Francis
> 

If you want something that's language specific, checking for koi8-r can 
be quite effective, but if you do receive legitimate Russian mail then 
it may lead to FPs. Anyway, here's a rule to check the subject that 
would hit your example:

header          LOCAL_CHARSET_SUBJECT   Subject:raw =~ /\=\?(koi8-r|windows-1251|iso-2022-jp|gb2312)\?/i

There are a few other foreign character sets thrown in there that I also 
reject - edit to suit your needs.
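
As a sketch of what "edit to suit" might look like: a variant restricted to
the two Cyrillic charsets, for sites that do receive legitimate Japanese or
Chinese mail. The rule name LOCAL_CHARSET_SUBJECT_RU is made up for this
illustration and is not from the thread. Keep the whole rule on one line:

header          LOCAL_CHARSET_SUBJECT_RU   Subject:raw =~ /=\?(koi8-r|windows-1251)\?/i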

Looking at the rest of the mail, I have a few other custom rules that 
fire on your example:


header          LOCAL_THEBAT_MUA        X-Mailer =~ /^The Bat!/

uri             LOCAL_URI_RU            m{https?://.{1,40}\.ru\b}
uri             LOCAL_URI_CHAT_RU       m{https?://.{1,40}\.chat\.ru\b}

I score against The Bat MUA, and also against any [dot] ru domains, plus 
an additional (additive) score for [dot] chat [dot] ru URIs. I have no 
legitimate use for these in emails (I also have a similar rule for 
Chinese domains that's very popular!)

So I have 4 or 5 custom rules that all score against your example and 
add a little to the score taking it well over the spam threshold.
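
For illustration only, the additive scoring could be wired up in local.cf
along the following lines. The score values and describe texts are
placeholder assumptions, not figures given in this thread, so tune them
against your own mail. A [dot] chat [dot] ru URI matches both URI rules, so
both scores apply to it:

describe        LOCAL_CHARSET_SUBJECT   Subject uses an unwanted charset
score           LOCAL_CHARSET_SUBJECT   1.5

describe        LOCAL_THEBAT_MUA        Sent with The Bat! MUA
score           LOCAL_THEBAT_MUA        1.0

describe        LOCAL_URI_RU            URI in a .ru domain
score           LOCAL_URI_RU            1.0

describe        LOCAL_URI_CHAT_RU       URI under chat.ru
score           LOCAL_URI_CHAT_RU       1.0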



