Re: No longer just embedded =9D characters in blackmail emails.

Savvas Karagiannidis Thu, 21 Mar 2019 08:29:17 -0700

Hi all,

I'd like to thank you, Bill, for looking into this. I was a bit disappointed by the way the issue was first handled on Bugzilla.

I agree that the server's locale is information worth considering, but I don't think it solves the issue. I also agree that this test is effective at catching the type of spam it was intended for: I found a number of spam messages caught by it while investigating the issue.

What should be considered is the message's language. All messages that were false positives had one of the following MIME content types (the messages were actually in Greek):


Content-Type: text/[plain|html]; charset="windows-1253" or
Content-Type: text/[plain|html]; charset="iso-8859-7"

while all messages that were actual spam and were properly detected had:

Content-Type: text/[plain|html]; charset="utf-8"
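
To make the distinction above concrete, here is a minimal Python sketch that reads the declared charset from a message and checks it against the two Greek charsets seen on the false positives. It uses only the standard `email` module; the names `GREEK_LEGACY_CHARSETS`, `declared_charset` and `is_greek_legacy` are made up for illustration and are not part of SpamAssassin.

```python
from email import message_from_string

# Charsets observed on the Greek false positives described above.
GREEK_LEGACY_CHARSETS = {"windows-1253", "iso-8859-7"}


def declared_charset(raw_message: str) -> str:
    """Return the lower-cased charset of the first text part, or '' if none is declared."""
    msg = message_from_string(raw_message)
    for part in msg.walk():
        if part.get_content_maintype() == "text":
            return (part.get_content_charset() or "").lower()
    return ""


def is_greek_legacy(raw_message: str) -> bool:
    """True if the message declares one of the legacy Greek charsets."""
    return declared_charset(raw_message) in GREEK_LEGACY_CHARSETS
```

This only inspects what the headers declare; it says nothing about whether the body really is Greek, so it is a rough triage aid at best.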

I'm afraid I cannot provide any sample of the false positives at the moment.

Hope the above helps. SpamAssassin is a great project and we are trying to help improve it.


--

Savvas Karagiannidis

On 21/3/2019 16:52, John Wilcock wrote:

On 21/03/2019 at 14:52, John Wilcock wrote:
On 20/03/2019 at 20:19, Bill Cole wrote:
I've added these lines to the block that defines MIXED_ES, which may help some sites:
     lang pl  score MIXED_ES  0.01
     lang cz  score MIXED_ES  0.01
     lang sk  score MIXED_ES  0.01
     lang hr  score MIXED_ES  0.01
     lang el  score MIXED_ES  0.01

Those should get into the default rules channel within a few days.
All very well, except [...]
Also, there are *lots* of other languages that legitimately use E-like characters that should be added to the list (e.g. there's a Cyrillic "е", so you can add ru, bg, uk, be, bs, sr, kk, ky, mn, tg and others, for a start). You'll be fighting a losing battle there...
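
As a small illustration of the look-alike problem, the Python sketch below uses the standard `unicodedata` module to show that the Cyrillic "е" is a different code point from the Latin "e", and to tell a word written entirely in one script from one that mixes scripts. The helpers `script_of` and `scripts_in` are hypothetical names for this example, and this is not a description of how MIXED_ES is implemented.

```python
import unicodedata


def script_of(ch: str) -> str:
    """Rough script label: the first word of the Unicode character name."""
    return unicodedata.name(ch, "UNKNOWN").split()[0]


def scripts_in(word: str) -> set:
    """Set of script labels used by the alphabetic characters of a word."""
    return {script_of(ch) for ch in word if ch.isalpha()}


latin_e = "e"          # U+0065 LATIN SMALL LETTER E
cyrillic_e = "\u0435"  # U+0435 CYRILLIC SMALL LETTER IE, looks identical

# A legitimate Russian word uses a single script; an obfuscated
# "message" with one Cyrillic letter swapped in uses two.
russian_word = "\u043f\u0440\u0438\u0432\u0435\u0442"
obfuscated = "m" + cyrillic_e + "ssage"
```

Here `scripts_in(russian_word)` contains only `CYRILLIC`, while `scripts_in(obfuscated)` contains both `LATIN` and `CYRILLIC`, which is why the mere presence of an E-like character does not by itself separate the two cases.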
