Re: googleapis hosted phish
On 15 Nov 2018, at 7:52, RW wrote:

> On Thu, 15 Nov 2018 01:22:00 -0500
> Bill Cole wrote:
>
> > On 14 Nov 2018, at 20:11, Alex wrote:
> >
> > > Where is it getting these long hostname strings from?
> >
> > There's a bunch of garbage HTML using invisible text (font-size: 0)
> > between tiny bits of visible text to break Bayes and/or specific word
> > detection.
>
> That particular example is actually html in a text/plain mime section.

The mess in the text/plain part is a result of a botched rendering/tag-stripping of the insane text/html part, but yes: the specific misidentified domain name is in the plain part and is a result of a line-breaking artifact inside the rendered HTML.

-- 
Bill Cole
Re: googleapis hosted phish
On Thu, 15 Nov 2018 01:22:00 -0500
Bill Cole wrote:

> On 14 Nov 2018, at 20:11, Alex wrote:
>
> > Where is it getting these long hostname strings from?
>
> There's a bunch of garbage HTML using invisible text (font-size: 0)
> between tiny bits of visible text to break Bayes and/or specific word
> detection.

That particular example is actually html in a text/plain mime section.
Re: googleapis hosted phish
On 14 Nov 2018, at 20:11, Alex wrote:

> Where is it getting these long hostname strings from?

There's a bunch of garbage HTML using invisible text (font-size: 0) between tiny bits of visible text to break Bayes and/or specific word detection. The overly-thirsty "URI" parser strings this junk together and is seeing .az\b somewhere in it, and picks it up as a domain name. It's noisy in debug output but in this case harmless because what it is seeing includes a hostname that's too long to be a DNS label.

FWIW, that junk can be detected with rawbody rules looking for idiosyncratic HTML. I don't publish my local rules which do that sort of thing because they are very useful but very evadable and I suspect that if the precise rules were broadcast, they'd stop being useful in a matter of days. Instead, it would be really good if everyone maintaining their own local rules would take that hint and devise an invisible forest of slightly different rules to catch HTML structures with no legitimate purpose, making it impossible for spammers to get around a single rule published in the default channel or KAM.cf or anything else known to be under spammers' watch.

(CAVEAT: For some reason, a lot of opt-in political bulk mail also catches on such rules.)

> Should we be rethinking whether googleapis.com should be in the DNSBL skip list?

I think it may deserve a special rule all its own (with extensive FP shielding) but I suspect that you will never see it in a URIDNSBL that is safe to use, so it would do no good to keep resolving storage.googleapis.com and other such names with short-TTL CNAME records pointing to shorter-TTL A records on a frequent basis only to determine that it will never get listed OR that you're using a URIDNSBL which intends to generate widespread collateral damage.

Of course, I could be wrong. You could test how wrong I might be with this:

  clear_uridnsbl_skip_domain googleapis.com

-- 
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Available For Hire: https://linkedin.com/in/billcole
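
For readers who want to experiment with the "idiosyncratic HTML" idea before writing their own rawbody rules, here is a minimal Python sketch of one such check: counting inline styles that set font-size to exactly zero. It is not one of Bill's unpublished rules. The pattern, the function names and the threshold are all assumptions made up for illustration, and per the caveat above anything like this needs testing against legitimate bulk mail before it gets a score.

```python
import re

# Hypothetical pattern (an assumption, not a published rule): a CSS
# declaration that sets font-size to exactly zero, with an optional unit.
# The lookahead requires a terminator so that values such as 0.5em or 10px
# are not counted.
ZERO_FONT = re.compile(
    r"""font-size\s*:\s*0+(?:\.0+)?\s*(?:px|pt|em|rem|%)?\s*(?=[;"'}!]|$)""",
    re.IGNORECASE,
)


def zero_font_count(raw_html: str) -> int:
    """Count zero-size font declarations in the raw (undecoded) HTML."""
    return len(ZERO_FONT.findall(raw_html))


def looks_obfuscated(raw_html: str, threshold: int = 3) -> bool:
    """Flag a part whose zero-size declarations reach an arbitrary threshold."""
    return zero_font_count(raw_html) >= threshold
```

The same regular expression could in principle be adapted into a local rawbody rule; varying the details locally is the point of the suggestion above.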
Re: googleapis hosted phish
Hi,

> Anyone have any further ideas for blocking these? Google really should
> be doing better to prevent these.
>
> https://pastebin.com/XumEjHc1

I ran this through with debug, and it's producing some weird messages I don't understand:

Nov 14 20:05:43.654 [28187] dbg: uridnsbl: domain googleapis.com in skip list, host storage.googleapis.com
Nov 14 20:05:43.654 [28187] dbg: check: tagrun - tag URIHOSTS is now ready, value: ARY:[z2eqcxaz2eqcxaz2eqcxaz2eqcxaz2eqcxaz2eqcxaz2eqcxaz2eqcxaz2eqcxaz2eqcx.az,example.com]
Nov 14 20:05:43.654 [28187] dbg: check: tagrun - tag URIDOMAINS is now ready, value: ARY:[z2eqcxaz2eqcxaz2eqcxaz2eqcxaz2eqcxaz2eqcxaz2eqcxaz2eqcxaz2eqcxaz2eqcx.az,example.com]
Nov 14 20:05:43.654 [28187] dbg: uridnsbl: considering host=z2eqcxaz2eqcxaz2eqcxaz2eqcxaz2eqcxaz2eqcxaz2eqcxaz2eqcxaz2eqcxaz2eqcx.az, domain=z2eqcxaz2eqcxaz2eqcxaz2eqcxaz2eqcxaz2eqcxaz2eqcxaz2eqcxaz2eqcxaz2eqcx.az

Where is it getting these long hostname strings from?

Should we be rethinking whether googleapis.com should be in the DNSBL skip list?
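
Bill Cole's reply in this thread notes that the junk hostname is harmless because it is too long to be a DNS label. A minimal Python sketch of that check follows, assuming the 63-octet per-label limit from RFC 1035 and ASCII-only names. The function name is hypothetical, and the sample hostname is built on the same repeating pattern as the debug output rather than copied from it.

```python
MAX_LABEL_OCTETS = 63  # per-label limit from RFC 1035


def oversized_labels(hostname: str) -> list:
    """Return the labels of an ASCII hostname that exceed the DNS label limit."""
    labels = hostname.rstrip(".").split(".")
    return [label for label in labels if len(label) > MAX_LABEL_OCTETS]


# Assumption: a stand-in with the same repeating shape as the logged junk.
junk_label = "z2eqcx" + "az2eqcx" * 9
junk_host = junk_label + ".az"

print(len(junk_label))                              # length of the single long label
print(oversized_labels(junk_host))                  # the long label is reported
print(oversized_labels("storage.googleapis.com"))   # no oversized labels
```

A name with any oversized label cannot be sent as a valid DNS query, so a blocklist lookup on it can never return a listing.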