I think you've answered me already, but this question was supposed to 
be directed at excluding all engines except htdig (?).

So there is no other way (the last response threw me a bit) to do that 
except robots.txt; is that correct?
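For the archives: if the goal is to shut out every crawler except htdig, a per-agent robots.txt can do it. This is only a sketch; it assumes htdig announces itself with the user-agent name "htdig" (the default value of its robotstxt_name attribute), and it still has to live at the server's document root:

```
# Allow htdig everywhere (an empty Disallow permits everything)
User-agent: htdig
Disallow:

# Block every other robot from the whole site
User-agent: *
Disallow: /
```

Robots read the first record whose User-agent matches them, so the specific "htdig" record takes precedence over the "*" record for htdig itself.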

TR

At 12:55 PM -0600 12/6/01, Gilles Detillieux wrote:
>According to Malcolm Austen:
>>  On Sun, 2 Dec 2001 [EMAIL PROTECTED] wrote:
>>  + >No, it goes in the root of the server, e.g.:
>>  + >
>>  + >http://www.foo.com/robots.txt
>>  +
>>  + what i should have said is:
>>  +
>>  + the urls to the students' sites are formed, e.g.,:
>>  + http://slis.lis.sco.edu/~H765-87
>>  +
>>  + the actual paths on the server to their home dirs is:
>>  + /usr2/foo/foo/foo/~H765-87/public_html/index.html
>>  +
>>  + and each student is incremented, e.g., ~H765-87, -88, -89....
>>  +
>>  + can i put a robots.txt file in each ~homedir or in each public_html dir?
>>
>>  No. The original answer above was exact and correct. It must go in the
>>  _server_ document root directory. There can only be one robots.txt on the
>>  server and it _must_ be addressable as http://slis.lis.sco.edu/robots.txt
>
>When you don't have access to the server's DocumentRoot directory, you
>have to find some other means of excluding documents, such as the
>exclude_urls attribute, or meta robots tags right in the HTML files.

_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
To unsubscribe, send a message to <[EMAIL PROTECTED]> with a 
subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html