Within the last 24 hours or so, the Googlebot has gone mad. It admits to downloading 822MB from my (free) App Engine instance, and I suspect that it ran the app out of quota at that point. Today, by 7:30 AM, it had already used 680MB of my 1GB quota.
You can't change the crawl rate in Webmaster Tools, and I don't want to stop the bot from fetching the documents; I just want it to fetch them at a reasonable rate. I understand that the Googlebot ignores the rate controls in robots.txt. Any ideas?

[I've thought of changing robots.txt dynamically so that the Googlebot can only whack my site at the end of the day, but I fear that it might start removing documents from the index.]

[I've thought of returning a 307 (temporary redirect) response most of the time to make it go away temporarily, but I'm nervous that all these requests will come back to haunt me in the future!]

Help!!

Philip
