>> Jonathan Knoll:
>> >> User-agent: *
>> >> Disallow: /cgi-bin
>> >> Disallow: /site
>>
>> Klaus Johannes Rusch:
>> > /cgi-bin/test.cgi
>> > /siteindex.html
>> > would be excluded.
>>
(Me:)
>> But what about these paths (in the same root dir):
>>
>>    /foo/cgi-bin/test.cgi
>>    /bar/user1/cgi-bin/test.cgi
>>    /bar/user2/cgi-bin/test.cgi
>>
>> Does the wildcard function recognize specified strings elsewhere (later)
>> than in the immediate beginning of a path?
>
>Martin Beet:
>The draft specification is quite clear on this: the strings are compared
>octet by octet until the Allow / Disallow string ends, in which case this
>rule matches, or until a mismatch is found. From the spec:
>
>" The matching process compares every octet in the path portion of
>   the URL and the path from the record. [...]  The match
>   evaluates positively if and only if the end of the path from the
>   record is reached before a difference in octets is encountered."

Thanks, Martin!

To briefly paraphrase this:
A robot compares the URL path with the Disallow string only from the
beginning of the path, and only up to the length of the Disallow string.
Thus a Disallow string cannot function as a *free* wildcard element
("Disallow: /foo" would apply to "/foo/bar" but not to "/bar/foo").
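
For illustration, here is a minimal sketch of that prefix rule in Python.
It is my own reading of the quoted passage, not code from the draft: the
function name is_disallowed is hypothetical, and the sketch ignores Allow
lines and any percent-encoding handling. It also assumes that an empty
Disallow value excludes nothing.

```python
def is_disallowed(url_path, disallow_rules):
    """Return True if url_path begins with any non-empty Disallow string.

    The comparison is anchored at the start of the path, so a rule
    never matches text that appears later in the URL.
    """
    # An empty rule is skipped (assumption: empty Disallow excludes nothing).
    return any(rule and url_path.startswith(rule) for rule in disallow_rules)


# Rules from the robots.txt quoted at the top of this thread.
rules = ["/cgi-bin", "/site"]

for path in ["/cgi-bin/test.cgi",
             "/siteindex.html",
             "/foo/cgi-bin/test.cgi",
             "/bar/user2/cgi-bin/test.cgi"]:
    print(path, is_disallowed(path, rules))
```

Under this reading the first two paths are excluded and the last two are
not, because "/cgi-bin" only matches at the very beginning of the path.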

Regards, Tuomas
