On 20 April 2015 at 12:28, Ashutosh Mishra <[email protected]> wrote:
> y I am annoyed by the hitting of Spam Bot mainly Ahref bot.
>

The Ahrefs bot (if it's the legitimate one, of course!) definitely obeys robots.txt:
https://ahrefs.com/robot

Looking at http://www.myhotelcar.com/robots.txt, there is nothing blocking that particular bot, but there are a number of other oddities.

The crawl delay will only apply to the * group, which is disallowed from any crawling, meaning it has no effect.

The 'directories' rules will only apply to the group they are placed in (so they only affect MJ12bot/v1.4.5, which is blocked completely by the first rule).

> I don't find any way to stop them as I didn't see .HTaccess file in Google
> App Engine
>

Not as such. You would need to handle any such directives directly in code, i.e. your Java handlers could check the User-Agent and do 'stuff' selectively.

There is also https://cloud.google.com/appengine/docs/java/config/dos but its utility for this is limited (unless you can identify specific IP addresses or ranges to block).
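
As a rough illustration of the "check the User-Agent in code" idea, here is a minimal sketch of the matching logic in plain Java. The class name `BotBlocker`, the method `isBlocked`, and the block list are all made up for this example. The tokens `ahrefsbot` and `mj12bot` are assumptions about what those crawlers send, so check them against your own request logs first.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

final class BotBlocker {
    // Hypothetical block list: lower-case substrings to look for in the
    // User-Agent header. Adjust to whatever actually shows up in your logs.
    private static final List<String> BLOCKED_AGENTS =
            Arrays.asList("ahrefsbot", "mj12bot");

    /** Returns true when the User-Agent header contains a blocked token. */
    static boolean isBlocked(String userAgent) {
        if (userAgent == null) {
            return false; // no header at all: this sketch lets the request through
        }
        String ua = userAgent.toLowerCase(Locale.ROOT);
        for (String token : BLOCKED_AGENTS) {
            if (ua.contains(token)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // prints "true"
        System.out.println(isBlocked(
                "Mozilla/5.0 (compatible; AhrefsBot/5.0; +http://ahrefs.com/robot/)"));
        // prints "false"
        System.out.println(isBlocked("Mozilla/5.0 (Windows NT 6.1) Chrome/41.0"));
    }
}
```

In a servlet or servlet filter you would presumably call `isBlocked(request.getHeader("User-Agent"))` and, when it returns true, answer with something like `response.sendError(HttpServletResponse.SC_FORBIDDEN)` instead of doing the real work. Bear in mind that the User-Agent header is trivially spoofed, so this only stops bots that identify themselves honestly.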
