If you are using Google Webmaster Tools, you may be aware that you can limit the crawl rate there. However, that feature is not enabled for all sites.
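If the setting is not available for your site, one possible fallback is to throttle the bot in the app itself. The sketch below is only an illustration, not something taken from App Engine or from Google's tooling: `CrawlBudget` and `is_googlebot` are hypothetical names, the limits are made up, and it answers with 503 plus `Retry-After` rather than the 307 mentioned in the quoted message, on the assumption that a "temporarily unavailable" status is the clearer signal for a crawler to back off. Serving 503 for long stretches may itself affect indexing, so treat it as a short-term measure.

```python
import math
import time


class CrawlBudget:
    """Fixed-window request budget for one crawler (hypothetical helper)."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.window_start = None
        self.count = 0

    def check(self, now=None):
        """Return (status, headers) for one crawler request at time `now`."""
        now = time.time() if now is None else now
        # Open a new window if none exists or the current one has expired.
        if self.window_start is None or now - self.window_start >= self.window_seconds:
            self.window_start = now
            self.count = 0
        if self.count < self.max_requests:
            self.count += 1
            return 200, {}
        # Over budget: tell the client when the current window ends.
        remaining = self.window_start + self.window_seconds - now
        return 503, {"Retry-After": str(max(1, math.ceil(remaining)))}


def is_googlebot(user_agent):
    # Substring match only; the User-Agent header can be spoofed.
    return "googlebot" in (user_agent or "").lower()
```

In a request handler you would call `budget.check()` only when `is_googlebot(...)` is true for the request's User-Agent header, and send the returned status and headers instead of the page when the status is not 200. Note that the counter lives in one process's memory, so with several instances each one would keep its own budget.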

WeatherPhilip wrote:
> Within the last 24 hours or so, the googlebox has gone mad. It admits
> to downloading 822MB from my (free) appengine instance. I suspect that
> it ran it out of quota at that point. Today, by 7:30 AM, it had used
> 680MB of my 1GB quota.
>
> You can't change the crawl rate in webmaster tools, and I don't want
> to prevent the bot from grabbing the documents (but at a reasonable
> rate).
>
> I understand that the googlebot ignores the rate controls in the
> robots.txt.
>
> Any ideas? [I've thought of changing the robots.txt dynamically so
> that the googlebot can only whack my site at the end of the day, but I
> fear that it might start removing documents from the index] [I've
> thought of returning a 307 error most of the time to make it go away
> temporarily, but I'm nervous that all these requests will come back to
> haunt me in the future!]
>
> Help!!
>
> Philip
