On 12/05/2018 02:45 PM, John Hardin wrote:
I've added a "too many [ascii][unicode][ascii]" rule based on that, but I suspect it will be pretty FP-prone and will get quite large if we want to avoid whack-a-mole syndrome. For this, normalize + bayes is probably the best bet.
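
Just to make sure I understand the shape of the rule you describe, here's a rough Python sketch (not rule syntax, and the character classes are just my guess at what you mean) of counting those [ascii][unicode][ascii] runs:

import re

# A non-ASCII character sandwiched between two ASCII letters, e.g. a
# Cyrillic 'a' dropped into an otherwise ASCII word.
MIXED_RUN = re.compile(r'[A-Za-z][^\x00-\x7F][A-Za-z]')

def mixed_run_count(text):
    """How many ascii/unicode/ascii sandwiches appear in the text."""
    return len(MIXED_RUN.findall(text))

# "Please" and "paypal" spelled with Cyrillic lookalikes -> 2 runs
print(mixed_run_count("Pl\u0435ase verify your p\u0430ypal account"))

I'd guess a real rule would score on the count relative to message length rather than a fixed number of hits.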

Is it possible to detect when a Unicode code point is being used in place of an ASCII / ANSI character specifically to avoid pattern detection? I.e., the multiple Unicode code points that represent, or otherwise stand in for, an ASCII / ANSI "a"?
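
To illustrate what I'm asking about, here is a toy Python sketch that folds a handful of known lookalikes down to ASCII and flags text that changes under the folding. The mapping is just a few entries picked by hand; the real inventory would be something like Unicode's confusables.txt, which is far larger:

import unicodedata

# Tiny hand-picked subset of lookalike mappings, purely for illustration.
CONFUSABLES = {
    '\u0430': 'a',  # CYRILLIC SMALL LETTER A
    '\u0435': 'e',  # CYRILLIC SMALL LETTER IE
    '\u043e': 'o',  # CYRILLIC SMALL LETTER O
    '\u0440': 'p',  # CYRILLIC SMALL LETTER ER
}

def skeleton(text):
    """NFKC-normalize, then fold the known lookalikes down to ASCII."""
    text = unicodedata.normalize('NFKC', text)
    return ''.join(CONFUSABLES.get(ch, ch) for ch in text)

def looks_disguised(text):
    """True if normalization or the lookalike map changes the text."""
    return skeleton(text) != text

print(looks_disguised("p\u0430ypal"))   # True: Cyrillic 'a' standing in for ASCII 'a'
print(looks_disguised("paypal"))        # False

Pattern rules could then run against skeleton(text) instead of the raw body, which I take to be roughly the "normalize" part of the normalize + bayes suggestion above.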

Or is keeping up with this list tantamount to whack-a-mole?

I would think that too high a percentage of Unicode, where bog-standard ASCII / ANSI would suffice, would be an indication in and of itself. I'm not seeing how legitimate (non-spam) email would trigger a false positive if the percentage were tuned correctly.
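
Something along these lines is what I have in mind, with the threshold as the obvious knob to tune (0.30 here is just a placeholder):

def non_ascii_letter_ratio(text):
    """Fraction of alphabetic characters that fall outside plain ASCII."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    return sum(1 for ch in letters if ord(ch) > 127) / len(letters)

THRESHOLD = 0.30  # placeholder; would need tuning against real ham/spam corpora

def too_much_unicode(text):
    return non_ascii_letter_ratio(text) > THRESHOLD

# Cyrillic lookalikes scattered through an otherwise English phrase -> True
print(too_much_unicode("V\u0456\u0430gr\u0430 f\u043er fr\u0435\u0435"))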



--
Grant. . . .
unix || die
