On 2/20/2018 9:42 PM, Rob McEwen wrote:
Google might easily start putting captchas in the way or otherwise consider such lookups to be abusive and/or mistake them for malicious bots...

This prediction turned out to be 100% true. Even though others have mentioned that they were able to do high-volume lookups with no problems... and granted, I wasn't implementing a multi-server or multi-IP lookup strategy... I don't think I was doing nearly as many lookups as others have claimed to. I took a batch of 55,000 spams collected over the past 4 weeks, where those spams were maliciously using the Google shortener to get their spam delivered by hiding their spammy domain names from spam filters. I started checking those by looking up the redirect from Google's redirector, but without actually visiting the site the redirector pointed to. Please note that I was doing the lookups one at a time, not starting the next lookup until the last one had completed. After ONLY about 1,400 lookups, ALL of my subsequent lookups started hitting captchas. See attached screenshot. Also, other than not sending from multiple IPs, I was doing everything correctly to make my script look/act like a regular browser.
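For anyone wanting to replicate the lookup step described above, here is a minimal Python sketch: it sends a request to the shortener but refuses to follow the redirect, reading only the Location header. The resolve_short_url helper, the User-Agent string, and the optional opener parameter are my own illustration, not the actual script I used.

```python
import urllib.request
import urllib.error

class NoRedirect(urllib.request.HTTPRedirectHandler):
    # Refuse to follow redirects; we only want the Location header.
    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None

def resolve_short_url(url, opener=None):
    """Return the redirect target of a shortened URL without visiting it."""
    opener = opener or urllib.request.build_opener(NoRedirect)
    req = urllib.request.Request(
        url, method="HEAD",
        headers={"User-Agent": "Mozilla/5.0"})  # look like a regular browser
    try:
        resp = opener.open(req, timeout=10)
        # No redirect happened (e.g. captcha page served as 200).
        return resp.headers.get("Location")
    except urllib.error.HTTPError as e:
        # With NoRedirect installed, urllib surfaces the 3xx as an HTTPError;
        # its headers still carry the redirect target.
        return e.headers.get("Location")
```

A 200 response instead of a 3xx (returning None here) is one signal that a captcha interstitial has replaced the normal redirect.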

I'll try spreading the lookups across multiple IPs to try to avoid rate limits... However, this is still cause for concern for high-volume lookups in large production systems... those may have to be implemented more carefully if they're going to do these kinds of lookups!

Just because small or medium production systems are able to do this... or because somebody went out of their way to get more sophisticated with it and made it work... doesn't mean it's going to work in large production systems that are trying to use "canned" software or plugins. This is a particular challenge for anti-spam blacklists because they typically process a very high volume of spam. Hopefully the randomness of the ones I process as they come in will spread the lookups out enough to avoid rate limiting?
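If the incoming stream isn't naturally spread out, one simple hedge is to add a randomized pause between lookups so they don't arrive at a machine-regular cadence. A minimal sketch (the delay values here are guesses to be tuned against wherever the captcha threshold actually sits, not known-safe numbers):

```python
import random
import time

def paced(lookups, base_delay=2.0, jitter=1.5):
    """Yield lookup items with a randomized pause between each one,
    so the requests arrive at an irregular, human-ish rate."""
    for i, item in enumerate(lookups):
        if i:  # no pause before the very first lookup
            time.sleep(base_delay + random.uniform(0, jitter))
        yield item

# usage: for short_url in paced(short_urls): resolve and record the redirect
```

This only slows a single source IP down; combining it with the multi-IP spreading mentioned above would be the more robust approach for truly high volumes.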

It was my hope to start processing these live with my own DNSBL engine, so that I could start blacklisting the domains they redirect to... in those cases where they weren't already blacklisted... Now I'm going to have to constantly make sure I'm not hitting this captcha, along with implementing some other strategies to hopefully prevent that.

But this brings up a whole other issue... one that is more of a policy or legal question: is Google basically making a statement that automated lookups are not welcome? Or that they are considered abusive?

(btw, I could have collected orders of magnitude more than 55,000 of THESE types of spams, but this was merely what was left over from an after-the-fact search of my archives, after many otherwise-redundant spams had already been purged from my system.)

PS - Once I gather this information, I will submit more details about the results of this testing. But what is shocking right now is that less than four tenths of 1% of these redirect URLs have been terminated, even though they are on average two weeks old, with some almost a month old.

--
Rob McEwen
https://www.invaluement.com
+1 (478) 475-9032