Can't you use robots.txt (or a modern equivalent, if there is anything
newer?) to stop the mass indexing? You could point the crawlers at the pages
you want indexed and tell them to skip images and so on.
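
Something along these lines might be a starting point. It is only a sketch: the
/images/ path and the 30 second delay are made-up values, and Crawl-delay is a
non-standard extension that, as far as I know, Yahoo and MSN honour but Google
ignores. The Python wrapper just feeds the rules to the standard library's
urllib.robotparser so you can check what each bot would be allowed to fetch
before putting the file live.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: throttle Yahoo Slurp and MSNbot, block MJ12bot
# completely, and keep every crawler out of an assumed /images/ directory.
ROBOTS_TXT = """\
User-agent: Slurp
Crawl-delay: 30
Disallow: /images/

User-agent: msnbot
Crawl-delay: 30
Disallow: /images/

User-agent: MJ12bot
Disallow: /

User-agent: *
Disallow: /images/
"""

SITE = "http://www.gelato.unsw.edu.au"


def build_parser(text=ROBOTS_TXT):
    """Parse robots.txt rules from a string instead of fetching them."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


if __name__ == "__main__":
    parser = build_parser()
    for agent in ("Googlebot", "Slurp", "msnbot", "MJ12bot"):
        print(
            agent,
            "index:", parser.can_fetch(agent, SITE + "/index.html"),
            "images:", parser.can_fetch(agent, SITE + "/images/logo.png"),
            "delay:", parser.crawl_delay(agent),
        )
```

Only the text inside ROBOTS_TXT goes into the robots.txt file at the top of
the web root. It is advisory, so a crawler that ignores it would still have to
be blocked or rate-limited on the server side.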

On Thu, Oct 23, 2008 at 10:45 AM, Peter Chubb <[EMAIL PROTECTED]> wrote:

>
> Hi,
>  I'm a little cheesed off.  In the last three months, people have
> downloaded 9G per month from our website; search engines have
> downloaded 21G per month.  Only Google generated significant traffic
> through search engine hits (and it downloaded less than the others,
> too --- around 2G per month, as opposed to 10G for Yahoo, and 4G for
> MSNbot).  In other words, search engine indexing traffic was double
> the actual traffic from www.gelato.unsw.edu.au.
>
> Is there any good reason why I shouldn't block (or at least
> significantly slow down) MSNbot, MJ12BOT, and Yahoo
> Slurp! ???  Yahoo is particularly bad, crawling and downloading about
> twice what the others do, and yet generating 1% of the hits that
> Google generated for us.
>
>
> --
> Dr Peter Chubb  http://www.gelato.unsw.edu.au  peterc AT
> gelato.unsw.edu.au
> http://www.ertos.nicta.com.au           ERTOS within National ICT
> Australia
> --
> SLUG - Sydney Linux User's Group Mailing List - http://slug.org.au/
> Subscription info and FAQs: http://slug.org.au/faq/mailinglists.html
>
