On Sun, Feb 14, 2021 at 4:45 PM John Hardin <[email protected]> wrote:
>
> On Sun, 14 Feb 2021, Ricky Boone wrote:
>
> > What are the community's thoughts on handling spam/phishing that utilize
> > homoglyphs to obfuscate the brands they're targeting? Are there any
> > plugins that are in development that might assist with catching these?
>
> Take a look at the definition of the FUZZY rules.
>
> There's no general plugin for this currently. That would be a bit
> difficult to do on-the-fly without getting (potentially lots of) FPs on
> non-English words.
>
> At the moment it's:
>
> 1) notice that some word is being obfuscated
> 2) add a FUZZY rule for that word
> 3) tune it for FPs (may hit legitimate words in non-English, exclude them)

Good to know. I'll look at the FUZZY rules as a model for future rules.

> The problem is such obfuscations may not be common enough in the masscheck
> corpora for the rules to be promoted, scored and published.

Understood. There may be better rules that could be built using context beyond the individual words and phrases. If there is interest in the original messages, I can make sanitized versions available.

> > For example, here are some phrases that I've been monitoring from reported
> > messages:
> >
> > * that Âmåzon has received
> > * Äpple Watch
> > * Ãρρle iPad
> > * Aρρle iPad
> > * PäyPäl Credit
> > * PαyPαl Credit
> > * Spãce Gray
> > * to Over Støck Inc on
> > * subscribed for Nõrtõn Yearly
> > * subscribed for Nõrtøn Yearly
> > * the Nõrtõn Freedom Protection
> >
> > Existing rules (mainline SpamAssassin channel, KAM, etc.) don't seem to
> > flag much, if anything substantial, on the messages I've seen with this
> > behavior. I've trained bayes on each, and created a custom set of rules to
> > try to catch various patterns used in the messages.
>
> I've added FUZZY rules for amazon, apple, microsoft, facebook, paypal and
> norton to my sandbox, they are likely going to be fairly common.
>
> How often do you see (over)stock and space obfuscated?

So far, four times and once, respectively. The latter, in context, was describing a version of an Apple iPad, so full product names must have been used as the input to whatever homoglyph-generating process the spammers were using.
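
For illustration only, here is a minimal Python sketch of the kind of check discussed above: fold lookalike characters back to plain Latin letters and flag words that only match a brand name after folding. This is not how the FUZZY rules or any SpamAssassin plugin work; the names `LOOKALIKES`, `BRANDS`, `fold` and `obfuscated_brands` are hypothetical, and the lookalike table covers only the characters seen in the sample phrases.

```python
import unicodedata

# Hypothetical table: characters from the samples that Unicode
# decomposition does not reduce to a Latin letter on its own.
LOOKALIKES = {"ρ": "p", "α": "a", "ø": "o", "Ø": "O"}

# Hypothetical brand list, taken from the names mentioned in the thread.
BRANDS = {"amazon", "apple", "microsoft", "facebook", "paypal", "norton"}


def fold(word):
    """Map known lookalikes to Latin letters and drop combining accents."""
    mapped = "".join(LOOKALIKES.get(ch, ch) for ch in word)
    decomposed = unicodedata.normalize("NFKD", mapped)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def obfuscated_brands(text):
    """Return (original, folded) pairs for words that match a brand
    only after folding, i.e. the original was not plain ASCII."""
    hits = []
    for word in text.split():
        token = word.strip(".,;:!?()\"'")
        folded = fold(token)
        if folded != token and folded.lower() in BRANDS:
            hits.append((token, folded))
    return hits


if __name__ == "__main__":
    for phrase in ("that Âmåzon has received", "Aρρle iPad", "Apple iPad"):
        print(phrase, "->", obfuscated_brands(phrase))
```

As noted in the thread, an approach like this would still need tuning for false positives, since a legitimate non-English word could fold to the same letters as a brand name.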
