On Mon, Apr 20, 2015 at 11:32 PM, Ashutosh Mishra <
[email protected]> wrote:
>
> I have also searched a lot, and I found that AhrefsBot doesn't obey the
> robots.txt rules.
> Many people have suggested that I block it via an .htaccess file, but I
> don't want to go that way because I couldn't find an .htaccess file in
> Google App Engine hosting. So please suggest a way to filter out these
> spam bots.
>


The .htaccess file isn't supported in App Engine.

If this is the real AhrefsBot, it should honor robots.txt. I looked at
your robots.txt file: I see rules disallowing Baidu and Yandex, plus a
wildcard disallow, but nothing specifically for AhrefsBot. Try adding the
following to your robots.txt file:

User-agent: AhrefsBot
Disallow: /

According to the AhrefsBot robot page, you can also email them directly to
ask them to stop; see https://ahrefs.com/robot
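
If you want to sanity-check the rule before deploying it, here is a minimal
sketch using Python's standard urllib.robotparser module. The path
/hotels/example and the name SomeOtherBot are made up for illustration. This
only confirms that the rule parses the way you expect; whether a crawler
actually obeys it is up to the crawler.

```python
from urllib.robotparser import RobotFileParser

# The rule suggested above, as it would appear in robots.txt.
rules = """\
User-agent: AhrefsBot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# "/hotels/example" is a hypothetical path, not taken from the real site.
ahrefs_allowed = parser.can_fetch("AhrefsBot", "/hotels/example")
other_allowed = parser.can_fetch("SomeOtherBot", "/hotels/example")

print("AhrefsBot allowed:", ahrefs_allowed)
print("Other bots allowed:", other_allowed)
```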


On Mon, Apr 20, 2015 at 11:36 PM, Ashutosh Mishra <
[email protected]> wrote:

> I think you have identified the issue correctly: they are regularly
> hitting a particular set of pages, the dynamically generated hotel pages.
> You are also correct about the RSS and sitemap feeds.
> So please tell me how to overcome this issue, as these spam bots,
> especially AhrefsBot, are unnecessarily consuming a lot of my server
> bandwidth. I want a good solution so that I don't face any spam bot
> hurdles in the future.
>


This happens to a lot of websites with a large set of dynamically generated
pages.

Honestly, the best solution would be to sign up for Cloudflare (
https://www.cloudflare.com/google ) and use their tools to help filter
incoming traffic. You can also do what Barry suggested earlier and start
blocking the IPs that AhrefsBot is using.

If you're willing to do some coding, you can write a filter into your
application that checks the User-Agent header and returns a 429 HTTP status
code (Too Many Requests) if traffic is too high:
http://tools.ietf.org/html/rfc6585#page-3
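
As a rough idea of what such a filter could look like, here is a sketch
written as generic WSGI middleware in Python. The class name BotThrottle, the
limit of 10 requests per 60 seconds, and the token list are all assumptions
made for illustration; none of them come from App Engine or from Ahrefs. The
counter lives in memory, so on App Engine each instance would keep its own
count, and a shared store would be needed for a strict global limit.

```python
import time
from collections import deque

# Hypothetical settings, chosen for illustration only.
BLOCKED_AGENT_TOKENS = ("ahrefsbot",)
MAX_REQUESTS = 10
WINDOW_SECONDS = 60


class BotThrottle:
    """WSGI middleware: answer 429 once a matching crawler exceeds the limit."""

    def __init__(self, app, clock=time.time):
        self.app = app
        self.clock = clock
        self.hits = {}  # token -> deque of recent request timestamps

    def __call__(self, environ, start_response):
        agent = environ.get("HTTP_USER_AGENT", "").lower()
        for token in BLOCKED_AGENT_TOKENS:
            if token in agent:
                now = self.clock()
                window = self.hits.setdefault(token, deque())
                # Forget requests that are older than the window.
                while window and now - window[0] >= WINDOW_SECONDS:
                    window.popleft()
                if len(window) >= MAX_REQUESTS:
                    start_response("429 Too Many Requests", [
                        ("Content-Type", "text/plain"),
                        ("Retry-After", str(WINDOW_SECONDS)),
                    ])
                    return [b"Too many requests\n"]
                window.append(now)
                break
        # Everyone else, and crawlers under the limit, pass through.
        return self.app(environ, start_response)
```

You would wrap your existing WSGI application object with it, for example
app = BotThrottle(app). Requests from other user agents are never counted, so
regular visitors are unaffected.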



-----------------
-Vinny P
Technology & Media Consultant
Chicago, IL

App Engine Code Samples: http://www.learntogoogleit.com
