On Thu, Nov 7, 2013 at 10:38 AM, Barry Hunter <[email protected]> wrote:
> Well yes, as a webmaster you do have to decide if you are willing to 'pay' for
> crawlers wanting to access your site.
> It's a tradeoff: the 'cost' versus the potential benefit of having your
> site included in their search engine.
> You could opt to proactively manage crawler activity, in the first case
> with robots.txt,
> ... there's no magic 'fix', you have to decide on your own priorities.
>

+1. If you're having trouble with a robot that respects robots.txt, you can ban the robot or declare a long crawl delay (the number of seconds between page retrievals): http://en.wikipedia.org/wiki/Robots_exclusion_standard#Crawl-delay_directive

If the robot doesn't respect robots.txt, you can ban its IP via the blacklist or serve up a cached page.

On Thu, Nov 7, 2013 at 10:19 AM, Martin Descours <[email protected]> wrote:
> It seems however that a machine was crawling the website. I was overquota
> in a few hours.
>

Look in your request logs. Does this robot declare a custom User Agent? You can try to find the owner of the robot and send them a complaint.

-----------------
-Vinny P
Technology & Media Advisor
Chicago, IL

App Engine Code Samples: http://www.learntogoogleit.com
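
For the robots.txt route, here is a minimal sketch of what banning one robot and slowing down the rest might look like, checked with Python's standard urllib.robotparser. The robot name "BadBot" and the 30-second delay are made-up values for illustration, not anything taken from the logs in this thread. Keep in mind that Crawl-delay is a non-standard directive and not every crawler honors it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: "BadBot" and the 30-second delay are
# illustrative assumptions, not values from this thread.
ROBOTS_TXT = """\
User-agent: BadBot
Disallow: /

User-agent: *
Crawl-delay: 30
"""


def load_rules(text):
    """Parse robots.txt content from a string instead of fetching a URL."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rules = load_rules(ROBOTS_TXT)

# The banned robot may not fetch anything.
print(rules.can_fetch("BadBot", "/index.html"))    # False
# Any other robot may fetch pages...
print(rules.can_fetch("OtherBot", "/index.html"))  # True
# ...but is asked to wait 30 seconds between retrievals.
print(rules.crawl_delay("OtherBot"))               # 30
```

Checking the file this way before deploying it only confirms that a well-behaved parser reads the rules as intended. A robot that ignores robots.txt still has to be handled by IP, as described above.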
