On Thu, 2006-09-28 at 08:02 +1200, Volker Kuhlmann wrote:
> > few searches relating to this sort of thing, but you need to log in to 
> > read the forums.
> 
> Against google's terms and might get them kicked off the index. (Does
> google cache webmasterworld...?)

Yes, Google have webmasterworld (again, just an example; there are lots
of sites like this).  This isn't against Google's terms: they almost
encourage people to make their sites 'spider friendly', and having
'referrer gates' for spiders is part of that.

Do you use webmasterworld?  If not, search for "fake googlebot stats
website" on Google.  The third hit down is webmasterworld; click on it and
you'll be asked for a username/password when you try to read a thread.
The meta text on Google, though, contains an excerpt from the forum text.

> Might as well dump all traffic from googlebot that isn't coming from
> google's IP range (which is a size C for the bot). Why bother with
> imposters?

That's what a lot of people are doing now.  There are some published
reverse DNS blackholes for this, and at least one site listing them for
manual entry into a .htaccess file.
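
As an alternative to maintaining a list by hand, an impostor can be
spotted with a forward-confirmed reverse DNS check.  This is a different
technique from the published blackholes mentioned above, and only a
minimal sketch: the function name, the injectable resolver parameters and
the assumption that genuine Googlebot hosts reverse-resolve under
googlebot.com or google.com are mine, not something from this thread.

```python
import socket

# Assumption: genuine Googlebot hosts reverse-resolve under these domains.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_real_googlebot(ip, reverse=socket.gethostbyaddr,
                      forward=socket.gethostbyname_ex):
    """Return True if ip maps to a Google hostname that maps back to ip.

    The resolvers are parameters (hypothetical design choice) so the
    check can be exercised without touching the network.  IPv4 only.
    """
    try:
        hostname = reverse(ip)[0]          # PTR lookup
    except OSError:
        return False                       # no reverse record at all
    if not hostname.lower().endswith(GOOGLE_SUFFIXES):
        return False                       # claims Googlebot, wrong domain
    try:
        addresses = forward(hostname)[2]   # forward lookup of that name
    except OSError:
        return False
    return ip in addresses                 # must round-trip to the same IP
```

A request whose User-Agent says Googlebot but for which this returns
False could then be dropped or served the login page like anyone else.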

> Anyone else finding that google DoSes servers on a regular basis?
> Downloading the same pdf as fast as they can. If it's only 30 times it's
> lucky, bringing the net plan over quota is not unheard of.

I've never seen Googlebot do this, but Inktomi Slurp racked up 1GB of
traffic to one of my sites in 3 days.  I put an exclude in the
robots.txt for it.
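
For reference, an exclude of that kind might look like the rules embedded
below.  This is a sketch rather than my actual file: the "Slurp"
user-agent token and the example path are assumptions, and Python's
standard robots.txt parser is used only to show how a well-behaved
crawler would interpret the rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: shut out Slurp entirely, allow everyone else.
ROBOTS_TXT = """\
User-agent: Slurp
Disallow: /

User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# The path is made up for illustration.
print("Slurp allowed:    ", parser.can_fetch("Slurp", "/files/manual.pdf"))
print("Googlebot allowed:", parser.can_fetch("Googlebot", "/files/manual.pdf"))
```

Note that robots.txt is only advisory; it helps with crawlers that
choose to honour it and does nothing against impostors.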

Cheers, Chris Hellyar.
