Re: No longer just embedded =9D characters in blackmail emails.

2019-03-22 Thread Savvas Karagiannidis
On 21/3/2019 18:23, John Hardin wrote: On Thu, 21 Mar 2019, Savvas Karagiannidis wrote: What should be considered is the message's language. All messages that were false positives had the following mime encoding (messages were actually in greek): Content-Type: text/[plain|html];

Re: No longer just embedded =9D characters in blackmail emails.

2019-03-21 Thread Fedor Piecka
Hello Bill I can show a few messages triggering the rule in our case but only for you to see the use of accented characters in Czech language. I'm unable to grant you a permission to upload them to masscheck corpus or to any other public/semipublic database. The messages contain no classified

Re: No longer just embedded =9D characters in blackmail emails.

2019-03-21 Thread @lbutlr
On 21 Mar 2019, at 14:27, John Hardin wrote: > On Thu, 21 Mar 2019, Martin Gregorie wrote: >> On Thu, 2019-03-21 at 12:20 -0700, John Hardin wrote: >>> >>> ...wrong thread? :) >> Unfortunately so. For some reason my mail reader's editor (I use >> Evolution) locked up on my first attempt to

Re: No longer just embedded =9D characters in blackmail emails.

2019-03-21 Thread John Hardin
On Thu, 21 Mar 2019, Martin Gregorie wrote: On Thu, 2019-03-21 at 12:20 -0700, John Hardin wrote: ...wrong thread? :) Unfortunately so. For some reason my mail reader's editor (I use Evolution) locked up on my first attempt to reply and when I got it to respond again it sent the stupid

Re: No longer just embedded =9D characters in blackmail emails.

2019-03-21 Thread Martin Gregorie
On Thu, 2019-03-21 at 12:20 -0700, John Hardin wrote: > > ...wrong thread? :) > Unfortunately so. For some reason my mail reader's editor (I use Evolution) locked up on my first attempt to reply and when I got it to respond again it sent the stupid message containing one blank line. Then I

Re: No longer just embedded =9D characters in blackmail emails.

2019-03-21 Thread John Hardin
On Thu, 21 Mar 2019, Martin Gregorie wrote: On Thu, 2019-03-21 at 09:23 -0700, John Hardin wrote: On Thu, 21 Mar 2019, Savvas Karagiannidis wrote: What should be considered is the message's language. All messages that were false positives had the following mime encoding (messages were

Re: No longer just embedded =9D characters in blackmail emails.

2019-03-21 Thread RW
On Thu, 6 Dec 2018 09:15:59 -0800 (PST) John Hardin wrote: > On Wed, 5 Dec 2018, Grant Taylor wrote: > > Would __UNICODE_TEST_FR run / consume resources even if __LANG_FR > > evaluates to false? > > Yes, all the subrules get evaluated. There's no shortcutting because > a subrule may be used

Re: No longer just embedded =9D characters in blackmail emails.

2019-03-21 Thread Martin Gregorie
On Thu, 2019-03-21 at 09:23 -0700, John Hardin wrote: > On Thu, 21 Mar 2019, Savvas Karagiannidis wrote: > > > What should be considered is the message's language. All messages > > that were > > false positives had the following mime encoding (messages were > > actually in > > greek): > > > >

Re: No longer just embedded =9D characters in blackmail emails.

2019-03-21 Thread John Hardin
On Thu, 21 Mar 2019, Savvas Karagiannidis wrote: What should be considered is the message's language. All messages that were false positives had the following mime encoding (messages were actually in greek): Content-Type: text/[plain|html]; charset="windows-1253" or Content-Type:
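The idea of taking the declared charset into account can be sketched outside SpamAssassin. The following is a minimal Python illustration, not a rule or plugin from this thread: the function name and the exemption set are hypothetical, and only the windows-1253 charset quoted above is listed.

```python
from email import message_from_string

# Hypothetical exemption list: only windows-1253 is named in the thread.
GREEK_CHARSETS = {"windows-1253"}


def declares_greek_charset(raw_message):
    """Return True if any text/* part declares a charset from GREEK_CHARSETS."""
    msg = message_from_string(raw_message)
    for part in msg.walk():
        if part.get_content_maintype() != "text":
            continue
        # get_content_charset() returns the charset lowercased, or None.
        if part.get_content_charset() in GREEK_CHARSETS:
            return True
    return False


sample = 'Content-Type: text/plain; charset="windows-1253"\n\nbody\n'
print(declares_greek_charset(sample))
```

A declared charset is only a hint about the language, so a check like this could at most lower a score rather than prove the message is legitimate Greek.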

Re: No longer just embedded =9D characters in blackmail emails.

2019-03-21 Thread Bill Cole
On 21 Mar 2019, at 10:52, John Wilcock wrote: Le 21/03/2019 à 14:52, John Wilcock a écrit : Le 20/03/2019 à 20:19, Bill Cole a écrit : I've added these lines to the block that defines MIXED_ES which may help some sites: lang pl  score MIXED_ES  0.01 lang cz  score MIXED_ES  0.01   

Re: No longer just embedded =9D characters in blackmail emails.

2019-03-21 Thread Savvas Karagiannidis
Hi all, I'd like to thank you Bill for looking into this. I was a bit disappointed by the way the issue was handled at first on bugzilla. I must agree that the server's locale could be information to be considered but I don't think it solves the issue. I agree that this test is effective on

Re: No longer just embedded =9D characters in blackmail emails.

2019-03-21 Thread John Wilcock
Le 21/03/2019 à 14:52, John Wilcock a écrit : Le 20/03/2019 à 20:19, Bill Cole a écrit : I've added these lines to the block that defines MIXED_ES which may help some sites: lang pl  score MIXED_ES  0.01 lang cz  score MIXED_ES  0.01 lang sk  score MIXED_ES  0.01 lang hr 

Re: No longer just embedded =9D characters in blackmail emails.

2019-03-21 Thread John Wilcock
Le 20/03/2019 à 20:19, Bill Cole a écrit : I've added these lines to the block that defines MIXED_ES which may help some sites:     lang pl  score MIXED_ES  0.01     lang cz  score MIXED_ES  0.01     lang sk  score MIXED_ES  0.01     lang hr  score MIXED_ES  0.01     lang el  score

Re: No longer just embedded =9D characters in blackmail emails.

2019-03-21 Thread Bill Cole
On 20 Mar 2019, at 18:26, Benny Pedersen wrote: Bill Cole skrev den 2019-03-20 20:19: lang pl score MIXED_ES 0.01 lang cz score MIXED_ES 0.01 lang sk score MIXED_ES 0.01 lang hr score MIXED_ES 0.01 lang el score MIXED_ES 0.01 Those should get into the default

Re: No longer just embedded =9D characters in blackmail emails.

2019-03-20 Thread Benny Pedersen
Bill Cole skrev den 2019-03-20 20:19: lang pl score MIXED_ES 0.01 lang cz score MIXED_ES 0.01 lang sk score MIXED_ES 0.01 lang hr score MIXED_ES 0.01 lang el score MIXED_ES 0.01 Those should get into the default rules channel within a few days. is lang supporting

Re: No longer just embedded =9D characters in blackmail emails.

2019-03-20 Thread Bill Cole
On 20 Mar 2019, at 9:04, piecka wrote: Hello We've encountered a high false positive rate with MIXED_ES rule for emails written in Czech language. Czech naturally uses all of the e,ě and é. The situation is similar for Slovak language, which includes e and é. It seems the same with Greek

Re: No longer just embedded =9D characters in blackmail emails.

2019-03-20 Thread Grant Taylor
On 3/20/19 7:04 AM, piecka wrote: We've encountered a high false positive rate with MIXED_ES rule for emails written in Czech language … Slovak … Greek … Do the MIME headers have any indication of the language? Can you create a __test rule that is then used in a meta rule with MIXED_ES?

Re: No longer just embedded =9D characters in blackmail emails.

2019-03-20 Thread Marcin Mirosław
W dniu 20.03.2019 o 15:27, Dominic Raferd pisze: > On Wed, 20 Mar 2019 at 13:14, piecka wrote: >> >> Hello >> >> We've encountered a high false positive rate with MIXED_ES rule for emails >> written in Czech language. Czech naturally uses all of the e,ě and é. >> >> The situation is similar for

Re: No longer just embedded =9D characters in blackmail emails.

2019-03-20 Thread Dominic Raferd
On Wed, 20 Mar 2019 at 13:14, piecka wrote: > > Hello > > We've encountered a high false positive rate with MIXED_ES rule for emails > written in Czech language. Czech naturally uses all of the e,ě and é. > > The situation is similar for Slovak language, which includes e and é. > > It seems the

Re: No longer just embedded =9D characters in blackmail emails.

2019-03-20 Thread piecka
Hello We've encountered a high false positive rate with MIXED_ES rule for emails written in Czech language. Czech naturally uses all of the e,ě and é. The situation is similar for Slovak language, which includes e and é. It seems the same with Greek

Re: No longer just embedded =9D characters in blackmail emails.

2018-12-06 Thread John Hardin
On Wed, 5 Dec 2018, Grant Taylor wrote: On 12/5/18 5:43 PM, John Hardin wrote: Potentially, but it's hard to use something like that in regular rule REs. That sort of smarts would probably need to be in a plugin. Maybe (from my naive point of view) if not probably (from your more

Re: No longer just embedded =9D characters in blackmail emails.

2018-12-05 Thread Bill Cole
On 5 Dec 2018, at 22:29, Grant Taylor wrote: > On 12/5/18 7:55 PM, Bill Cole wrote: >> Yes. There is no automatic 'shortcircuiting' of rules. > > Okay. > > You say "automatic". Is there a "non-automatic" way? :-) perldoc Mail::SpamAssassin::Plugin::Shortcircuit -- Bill Cole

Re: No longer just embedded =9D characters in blackmail emails.

2018-12-05 Thread Grant Taylor
On 12/5/18 7:55 PM, Bill Cole wrote: Yes. There is no automatic 'shortcircuiting' of rules. Okay. You say "automatic". Is there a "non-automatic" way? :-) -- Grant. . . . unix || die

Re: No longer just embedded =9D characters in blackmail emails.

2018-12-05 Thread Bill Cole
On 5 Dec 2018, at 20:37, Grant Taylor wrote: > On 12/5/18 5:43 PM, John Hardin wrote: >> Potentially, but it's hard to use something like that in regular rule REs. >> That sort of smarts would probably need to be in a plugin. > > Maybe (from my naive point of view) if not probably (from your

Re: No longer just embedded =9D characters in blackmail emails.

2018-12-05 Thread Grant Taylor
On 12/5/18 5:43 PM, John Hardin wrote: Potentially, but it's hard to use something like that in regular rule REs. That sort of smarts would probably need to be in a plugin. Maybe (from my naive point of view) if not probably (from your more experienced point of view). I would think that it

Re: No longer just embedded =9D characters in blackmail emails.

2018-12-05 Thread John Hardin
On Wed, 5 Dec 2018, Grant Taylor wrote: On 12/05/2018 03:27 PM, John Hardin wrote: Take a look at replace_rules in the repo (both standard and sandboxes). Thank you for the reference. replace_rules look very intriguing. Link - Mail::SpamAssassin::Plugin::ReplaceTags - tags for SpamAssassin

Re: No longer just embedded =9D characters in blackmail emails.

2018-12-05 Thread Grant Taylor
On 12/05/2018 03:27 PM, John Hardin wrote: Take a look at replace_rules in the repo (both standard and sandboxes). Thank you for the reference. replace_rules look very intriguing. Link - Mail::SpamAssassin::Plugin::ReplaceTags - tags for SpamAssassin rules -

Re: No longer just embedded =9D characters in blackmail emails.

2018-12-05 Thread John Hardin
On Wed, 5 Dec 2018, Bill Cole wrote: On 5 Dec 2018, at 16:45, John Hardin wrote: Those aren't zero-width, those are just standard Unicode obfuscations of regular ASCII text. Not precisely. In this case they seem to all be Cyrillic characters which happen to look like Latin characters that

Re: No longer just embedded =9D characters in blackmail emails.

2018-12-05 Thread Bill Cole
On 5 Dec 2018, at 16:45, John Hardin wrote: Those aren't zero-width, those are just standard Unicode obfuscations of regular ASCII text. Not precisely. In this case they seem to all be Cyrillic characters which happen to look like Latin characters that have ASCII encodings. It's not
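The distinction drawn here (Cyrillic look-alikes inside otherwise Latin words) can be illustrated with a short Python sketch. This is an assumption-laden illustration, not how any SpamAssassin rule is implemented: the helper names are hypothetical, and the script is inferred from the first word of each character's Unicode name.

```python
import re
import unicodedata


def scripts_in_word(word):
    """Collect the script prefix of each letter's Unicode name."""
    scripts = set()
    for ch in word:
        if ch.isalpha():
            # Names start with the script, e.g. "CYRILLIC SMALL LETTER A".
            scripts.add(unicodedata.name(ch, "UNKNOWN").split(" ")[0])
    return scripts


def mixed_script_words(text):
    """Return words that contain both Latin and Cyrillic letters."""
    return [w for w in re.findall(r"\w+", text)
            if {"LATIN", "CYRILLIC"} <= scripts_in_word(w)]


# "nightm\u0430re" uses CYRILLIC SMALL LETTER A in place of Latin "a".
print(mixed_script_words("nightm\u0430re"))
# Accented Czech letters are still Latin script, so nothing is flagged.
print(mixed_script_words("n\u011bco je \u0161patn\u011b"))
```

Because accented Latin letters such as ě and é keep the LATIN prefix, a per-word script check of this kind would not fire on ordinary Czech or Slovak text.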

Re: No longer just embedded =9D characters in blackmail emails.

2018-12-05 Thread John Hardin
On Wed, 5 Dec 2018, Grant Taylor wrote: On 12/05/2018 02:45 PM, John Hardin wrote: I've added a "too many [ascii][unicode][ascii]" rule based on that but I suspect it will be pretty FP-prone and will be pretty large if we want to avoid whack-a-mole syndrome. For this, normalize + bayes is

Re: No longer just embedded =9D characters in blackmail emails.

2018-12-05 Thread Kevin A. McGrail
On 12/5/2018 4:50 PM, Grant Taylor wrote: > On 12/05/2018 02:45 PM, John Hardin wrote: >> I've added a "too many [ascii][unicode][ascii]" rule based on that >> but I suspect it will be pretty FP-prone and will be pretty large if >> we want to avoid whack-a-mole syndrome. For this, normalize +

Re: No longer just embedded =9D characters in blackmail emails.

2018-12-05 Thread Grant Taylor
On 12/05/2018 02:45 PM, John Hardin wrote: I've added a "too many [ascii][unicode][ascii]" rule based on that but I suspect it will be pretty FP-prone and will be pretty large if we want to avoid whack-a-mole syndrome. For this, normalize + bayes is probably the best bet. Is it possible to
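To see why a "too many [ascii][unicode][ascii]" test is likely to be FP-prone, here is a rough Python sketch of that pattern. The regular expression and the function name are assumptions made for illustration; the actual rule added to the rule set is not shown in this thread.

```python
import re

# A non-ASCII character with an ASCII letter directly on each side.
# Lookarounds are used so that neighbouring matches can share a letter.
SANDWICH = re.compile(r"(?<=[A-Za-z])[^\x00-\x7f](?=[A-Za-z])")


def sandwich_count(text):
    """Count non-ASCII characters sandwiched between ASCII letters."""
    return len(SANDWICH.findall(text))


# Obfuscated English: one Cyrillic letter inside an ASCII word.
print(sandwich_count("nightm\u0430re"))
# Legitimate Czech: "ě" between "n" and "c" matches just the same.
print(sandwich_count("n\u011bco"))
```

Both strings produce a match, so the pattern alone cannot tell obfuscation from a language that mixes plain and accented letters; a threshold or a language check would have to do that work.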

Re: No longer just embedded =9D characters in blackmail emails.

2018-12-05 Thread John Hardin
On Wed, 5 Dec 2018, Mark London wrote: No longer just embedded =9D characters. From: =?utf-8?B?bmlnaHRt0LByZQ==?= To: Subject: You are my victim. Date: Tue, 4 Dec 2018 15:56:36 -0800 MIME-Version: 1.0 Content-Type: multipart/alternative; boundary="a0d0993ce53319101c19af03d5311b0976b26b"
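The encoded From display name quoted above can be decoded with Python's standard library to see what it hides. This is only an inspection aid; the variable names are arbitrary.

```python
from email.header import decode_header
import unicodedata

raw = "=?utf-8?B?bmlnaHRt0LByZQ==?="

# decode_header() yields (bytes, charset) pairs for encoded words.
payload, charset = decode_header(raw)[0]
name = payload.decode(charset)

# List every non-ASCII character together with its Unicode name.
non_ascii = [(ch, unicodedata.name(ch)) for ch in name if ord(ch) > 127]

print(name)
print(non_ascii)
```

The decoded name reads like "nightmare", but the sixth-from-last letter position holds U+0430 CYRILLIC SMALL LETTER A rather than an ASCII "a", which is why a test for zero-width or =9D characters does not match it.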

Re: No longer just embedded =9D characters in blackmail emails.

2018-12-05 Thread Bill Cole
On 5 Dec 2018, at 11:45, Mark London wrote: The __UNICODE_OBFU_ZW rule is not being triggered on this email. Maybe it needs updating? - Mark FWIW, I just added a "MIXED_ES" rule to my sandbox which does catch on anything with a suspiciously large number of characters that are visually like

Re: No longer just embedded =9D characters in blackmail emails.

2018-12-05 Thread John Hardin
On Wed, 5 Dec 2018, Mark London wrote: The __UNICODE_OBFU_ZW rule is not being triggered on this email. Maybe it needs updating? - Mark Will do, I don't have a zero response time as much as I wish I did... :) -- John Hardin KA7OHZ http://www.impsec.org/~jhardin/

Re: No longer just embedded =9D characters in blackmail emails.

2018-12-05 Thread Mark London
The __UNICODE_OBFU_ZW rule is not being triggered on this email. Maybe it needs updating? - Mark On 12/5/2018 11:19 AM, Mark London wrote: No longer just embedded =9D characters. From: =?utf-8?B?bmlnaHRt0LByZQ==?= To: Subject: You are my victim. Date: Tue, 4 Dec 2018 15:56:36 -0800