Ah, thanks for pointing this out!
I did run the robots.txt validator, and it said I had everything done 
correctly, but apparently I don't.
Thanks again!  Fixed it - http://simpy.com/robots.txt

Otis

----- Original Message ----
From: Walter Underwood <[EMAIL PROTECTED]>
To: "Internet robots, spiders, web-walkers, etc." <robots@mccmedia.com>
Sent: Sunday, March 26, 2006 11:56:20 AM
Subject: Re: [Robots] Googlebot, msnbot, and robots.txt refresh

--On March 26, 2006 7:25:42 AM -0800 [EMAIL PROTECTED] wrote:
> 
> Googlebot and msnbot are supposed to obey robots.txt, but they are ignoring
> my robots.txt ( http://simpy.com/robots.txt ), that contains:
> 
> User-agent: *
> Disallow: /simpy/
>
> User-agent: Googlebot
> Disallow: /rss/

You need to fix your robots.txt. Googlebot is doing the right thing.

Your robots.txt file tells Googlebot to stay away from /rss/, but it
does not say anything about /simpy/ (for Googlebot). Here is the spec
text about the meaning of "User-agent: *".

   If the value is '*', the record describes the default access policy
   for any robot that has not matched any of the other records.

In other words, the Disallow lines following a "User-agent:" line are
the entire policy for that robot. A robot does not merge every
matching record; it picks exactly one. So under "User-agent: Googlebot"
you must list every disallow that applies to that bot.

If you want all robots to stay out of /simpy/, you must add that as a
Disallow line to every block.
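For example, a corrected file for your site might look something like
this (a sketch based on the paths above):

   User-agent: Googlebot
   Disallow: /simpy/
   Disallow: /rss/

   User-agent: *
   Disallow: /simpy/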

Think of it like a switch statement. "User-agent: *" is the default label.
It wouldn't hurt to put that last in the file, just in case some lazy bot
takes the first match.
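Here is a rough sketch of that selection logic in Python, assuming the
file has already been parsed into (user-agent, disallow-list) records;
the names are illustrative, not any crawler's actual code:

   def select_record(records, agent):
       # records: list of (user_agent, disallows) tuples from robots.txt
       # agent:   this robot's own name, e.g. "Googlebot"
       default = None
       for user_agent, disallows in records:
           if user_agent == "*":
               default = disallows   # remember the default label
           elif user_agent.lower() == agent.lower():
               return disallows      # one matching record wins; no merging
       return default                # fall through to "User-agent: *"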

wunder
--
Walter Underwood
Principal Software Architect, Autonomy
_______________________________________________
Robots mailing list
Robots@mccmedia.com
http://www.mccmedia.com/mailman/listinfo/robots


