On Sun, 28 Mar 1999, p0222 wrote:

> How can I tell htdig to *ignore* the robots.txt-files, on the whole web or
> on specified servers ?
> That's my problem:
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^EXLCUDE LIST ?!?
> How can i turn this exlcude list *OFF* ?!?

No, not quite. First off, you cannot turn off the robots.txt parsing. It's
a standard and if you have a problem with a server's robots.txt file, you
should really take it up with the webmaster.

That's not your problem. The default config file ships with the option:
exclude_urls: cgi-bin .cgi

So this option is excluding the URLs you mention. If you don't want
this, remove it. (One caveat: currently, if you make exclude_urls empty,
it will exclude *all* URLs. So instead, set it to something that cannot
occur in a URL, like !-no-url-!, and it won't exclude anything on the
servers it indexes.)
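
As a rough sketch, the workaround in your config file would look something
like this (the placeholder string is arbitrary; any string that cannot
appear in a real URL should do):

    # Effectively turn off URL exclusion without leaving the option empty
    exclude_urls: !-no-url-!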

-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/

