Re: Disallows in robots.txt
> So:
>
> I was looking at a robots.txt file and it had a series of disallow
> instructions for various user agents, and then at the bottom was a full
> disallow:
[...]
> Wouldn't this just disallow everyone from everything?

No, it would disallow everyone but a ... d (with the specified restrictions).

From the spec:

"The robot must obey the first record in /robots.txt that contains a User-Agent line whose value contains the name token of the robot as a substring. The name comparisons are case-insensitive. If no such record exists, it should obey the first record with a User-agent line with a "*" value, if present. If no record satisfied either condition, or no records are present at all, access is unlimited."

Regards,
Martin
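The selection rule quoted from the spec can be sketched in a few lines of Python. This is only an illustration of the quoted paragraph, not a complete robots.txt parser: the helper names (parse_records, select_record, is_allowed) are made up for this example, only User-agent and Disallow lines are handled, and Disallow values are treated as plain path prefixes.

```python
def parse_records(text):
    """Split robots.txt text into records of (agent names, disallowed prefixes)."""
    records = []
    agents, rules = [], []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            # A blank line ends the current record.
            if agents:
                records.append((agents, rules))
            agents, rules = [], []
            continue
        field, _, value = line.partition(":")
        field, value = field.strip().lower(), value.strip()
        if field == "user-agent":
            if rules:
                # A User-agent line after rules starts a new record,
                # even when the blank separator line is missing.
                records.append((agents, rules))
                agents, rules = [], []
            agents.append(value)
        elif field == "disallow" and agents:
            rules.append(value)
    if agents:
        records.append((agents, rules))
    return records


def select_record(records, robot_name):
    """Pick the record a robot must obey, following the quoted rule."""
    name = robot_name.lower()
    # First choice: the first record whose User-agent value contains the
    # robot's name token as a substring (case-insensitive).
    for agents, rules in records:
        if any(a != "*" and name in a.lower() for a in agents):
            return rules
    # Fallback: the first record with a "*" value.
    for agents, rules in records:
        if "*" in agents:
            return rules
    return None  # no record applies: access is unlimited


def is_allowed(records, robot_name, path):
    rules = select_record(records, robot_name)
    if rules is None:
        return True
    # An empty Disallow value restricts nothing.
    return not any(rule and path.startswith(rule) for rule in rules)


EXAMPLE = """\
User-agent: aa
Disallow: /cgi-bin
Disallow: /stuff
Disallow: /x.html

User-agent: *
Disallow: /
"""

records = parse_records(EXAMPLE)
print(is_allowed(records, "aa", "/index.html"))   # True
print(is_allowed(records, "aa", "/cgi-bin/foo"))  # False
print(is_allowed(records, "zz", "/index.html"))   # False
```

The robot "aa" never reaches the "*" record because its own record matches first, so the final "Disallow: /" only applies to robots that are not named anywhere else in the file.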
Re: Disallows in robots.txt
Bert Van Kets:
>It even overrides the other disallows.

No override... The Robots Exclusion Standard says of User-agent: "(If the value is '*', the record describes the default access policy for any robot that) has not [yet] matched any of the other records."
(http://info.webcrawler.com/mak/projects/robots/norobots.html)

Cheers,
Tuomas
Re: Disallows in robots.txt
Jonathan Knoll:
>User-agent: aa
>Disallow: /cgi-bin
>Disallow: /stuff
>Disallow: /x.html
[...]
>User-agent: *
>Disallow: /

>Wouldn't this just disallow everyone from everything?

No, the file is perfectly OK... The "*" has a special meaning in the Standard: "every other User-agent (not mentioned yet)". So here aa to dd have limited access, and all >other< bots are banned.

Cheers,
Tuomas
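For anyone who wants to check this against an existing parser, here is a small sketch using Python's standard urllib.robotparser module. The robot name "otherbot" and the test paths are made up for the example, only two of the records from the file are included, and the blank line between records is my assumption about the original layout.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: aa
Disallow: /cgi-bin
Disallow: /stuff
Disallow: /x.html

User-agent: *
Disallow: /
"""


def make_parser(text):
    """Build a parser from robots.txt text without fetching anything."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = make_parser(ROBOTS_TXT)

# "aa" is named in the file, so only its own three restrictions apply.
print(parser.can_fetch("aa", "/index.html"))        # True
print(parser.can_fetch("aa", "/cgi-bin/search"))    # False

# "otherbot" is not named, so it falls through to the "*" record.
print(parser.can_fetch("otherbot", "/index.html"))  # False
```

This is one parser's behaviour, not a statement about every crawler, but it agrees with the reading above: the "*" record is the default for bots not mentioned elsewhere, not an override.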
Re: Disallows in robots.txt
It even overrides the other disallows. So in this robots.txt nothing is indexed!

Bert

>>>>>>>>>>>>>>>>>> Original Message <<<<<<<<<<<<<<<<<<

On 23/03/2000, 22:19:57, Jonathan Knoll <[EMAIL PROTECTED]> wrote regarding Disallows in robots.txt:

> So:
> I was looking at a robots.txt file and it had a series of disallow
> instructions for various user agents, and then at the bottom was a full
> disallow:
> -
> User-agent: aa
> Disallow: /cgi-bin
> Disallow: /stuff
> Disallow: /x.html
> User-agent: bb
> Disallow: /cgi-bin
> Disallow: /stuff
> Disallow: /x.html
> User-agent: cc
> Disallow: /cgi-bin
> Disallow: /stuff
> Disallow: /x.html
> User-agent: dd
> Disallow: /cgi-bin
> Disallow: /stuff
> Disallow: /x.html
> User-agent: *
> Disallow: /
> --
> Wouldn't this just disallow everyone from everything?
> Jonathan S. Knoll
> Catalog Analyst
> Aeneid Corporation - provider of EoCenter
> Try EoCenter at http://www.eocenter.com
> 282 Second Street, 4th Floor
> San Francisco, CA 94105
> 415.538.8555 ext. 284
Disallows in robots.txt
So:

I was looking at a robots.txt file and it had a series of disallow instructions for various user agents, and then at the bottom was a full disallow:

-
User-agent: aa
Disallow: /cgi-bin
Disallow: /stuff
Disallow: /x.html
User-agent: bb
Disallow: /cgi-bin
Disallow: /stuff
Disallow: /x.html
User-agent: cc
Disallow: /cgi-bin
Disallow: /stuff
Disallow: /x.html
User-agent: dd
Disallow: /cgi-bin
Disallow: /stuff
Disallow: /x.html
User-agent: *
Disallow: /
--

Wouldn't this just disallow everyone from everything?

Jonathan S. Knoll
Catalog Analyst
Aeneid Corporation - provider of EoCenter
Try EoCenter at http://www.eocenter.com
282 Second Street, 4th Floor
San Francisco, CA 94105
415.538.8555 ext. 284