Re: [google-appengine] Is it normal to pay for crawlers (overquota)

Vinny P Thu, 07 Nov 2013 18:33:52 -0800

On Thu, Nov 7, 2013 at 10:38 AM, Barry Hunter <[email protected]>
 wrote:


> Well yes, as a webmaster you do have to decide if you are willing to 'pay'
> for crawlers wanting to access your site.
> It's a tradeoff: the 'cost' versus the potential benefit of having your
> site included in their search engine.
> You could opt to proactively manage crawler activity, in the first case
> with robots.txt,
> ... there's no magic 'fix'; you have to decide on your own priorities.
>
>

+1.

If you're having trouble with a robot that respects robots.txt, you can ban
the robot or declare a long crawl delay (the number of seconds between page
retrievals):
http://en.wikipedia.org/wiki/Robots_exclusion_standard#Crawl-delay_directive
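
To make the two options above concrete, here is a minimal sketch of a
robots.txt that bans one robot and slows down another, checked with Python's
standard urllib.robotparser. The names "BadBot" and "SlowBot" and the 60
second delay are made up for illustration. Keep in mind that Crawl-delay is a
non-standard extension, so a given crawler may or may not honor it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content. "BadBot" and "SlowBot" are invented
# crawler names, not real robots.
ROBOTS_TXT = """\
User-agent: BadBot
Disallow: /

User-agent: SlowBot
Crawl-delay: 60

User-agent: *
Disallow:
"""


def load_rules(text):
    """Parse robots.txt text the way a well-behaved crawler would."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rules = load_rules(ROBOTS_TXT)

# BadBot is banned from the whole site.
print(rules.can_fetch("BadBot", "/index.html"))
# SlowBot is asked to wait this many seconds between page retrievals.
print(rules.crawl_delay("SlowBot"))
# Any other robot falls through to the permissive "*" entry.
print(rules.can_fetch("SomeOtherBot", "/index.html"))
```

Running a check like this before deploying is a cheap way to confirm the file
says what you think it says.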

If the robot doesn't respect robots.txt, you can ban its IP address via
App Engine's DoS protection blacklist or serve it a cached page.
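
If you would rather refuse the robot inside the application, a rough sketch of
the same idea as WSGI middleware might look like the following. This is not
the App Engine blacklist itself, only an app-level stand-in; the helper names
and the blocked addresses (taken from the reserved documentation ranges) are
assumptions. It uses the Python 3 ipaddress module.

```python
import ipaddress

# Hypothetical networks to refuse. These are reserved documentation ranges,
# not real offenders; replace them with addresses from your own logs.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.7/32"),
]


def is_blocked(remote_addr):
    """Return True if remote_addr falls inside any blocked network."""
    try:
        addr = ipaddress.ip_address(remote_addr)
    except ValueError:
        # Missing or malformed address: do not block.
        return False
    return any(addr in net for net in BLOCKED_NETWORKS)


def block_crawlers(app):
    """Wrap a WSGI app so blocked clients get a cheap 403 response."""
    def middleware(environ, start_response):
        if is_blocked(environ.get("REMOTE_ADDR", "")):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden\n"]
        return app(environ, start_response)
    return middleware
```

Note that a check like this still runs inside your application, so each
refused request costs a little, whereas a platform-level blacklist turns the
request away before it reaches your code.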


On Thu, Nov 7, 2013 at 10:19 AM, Martin Descours <[email protected]>
 wrote:

> It seems, however, that a machine was crawling the website. I was over
> quota within a few hours.
>


Look in your request logs. Does this robot declare a custom User Agent? You
can try to find the owner of the robot and send them a complaint.
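
As a starting point for that log check, here is a small sketch that tallies
requests per User Agent. It assumes you have exported the request logs as
text in which the user agent is the last double-quoted field on each line (as
in the common "combined" format); your export may carry extra fields, in
which case the pattern needs adjusting. The sample lines and the bot name are
invented.

```python
import re
from collections import Counter

# Assumption: the user agent is the last double-quoted field on the line.
UA_PATTERN = re.compile(r'"([^"]*)"\s*$')


def count_user_agents(lines):
    """Count how many requests each user agent string made."""
    counts = Counter()
    for line in lines:
        match = UA_PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Invented sample log lines for illustration only.
SAMPLE = [
    '203.0.113.9 - - [07/Nov/2013:10:00:01 -0800] "GET /a HTTP/1.1" 200 512 "-" "ExampleBot/1.0 (+http://example.com/bot)"',
    '203.0.113.9 - - [07/Nov/2013:10:00:02 -0800] "GET /b HTTP/1.1" 200 498 "-" "ExampleBot/1.0 (+http://example.com/bot)"',
    '192.0.2.44 - - [07/Nov/2013:10:00:05 -0800] "GET / HTTP/1.1" 200 1024 "-" "Mozilla/5.0"',
]

for agent, hits in count_user_agents(SAMPLE).most_common():
    print(hits, agent)
```

A user agent that dominates the tally, especially one that includes a contact
URL, is usually the quickest lead on who operates the robot.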


-----------------
-Vinny P
Technology & Media Advisor
Chicago, IL

App Engine Code Samples: http://www.learntogoogleit.com

