On Thu, 5 Oct 2000, GYGAX,OTTO (HP-Corvallis,ex1) wrote:
> My limit_urls_to key is set as you have it below (default).
> My start_url is currently set to a list of urls such as http://<server>/,
> http://<server>/arch.html, http://<server>/dir1, http://<server>/dir2,
> http://<server>/dir3, ... where arch.html is a simple web page with a href
> pointer to http://<server>/~arch, the cover page to the Mhonarc mailing tree
> that contains links to every single mailing archive page.
OK, but then ~arch won't fall within the limits as you've set them:
limit_urls_to defaults to start_url, and ~arch doesn't match any of the
patterns there. If you want to index all documents on the server, you
may want a more liberal limit_urls_to directive, e.g.
limit_urls_to: http://<server>/
> Before I extended the start_url key attr., I only had http://<server>/ and
> http://<server>/arch.html, but htdig went as far as the few links off the
> server's index.html file, missing all other directories at the root. At one
OK, that was one of my points--htdig only follows the links it sees. So if
you start indexing at http://<server>/ then it will follow the links from
index.html and nothing else. Unless you add those other directories (as you
did) to start_url, it won't even know they're there.
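
To make that concrete, here is a toy sketch in Python. It is not htdig's
code, and the link graph and URLs are made up for illustration. It just
shows that a crawler which only follows links finds only what is reachable
from its start URLs.

```python
from collections import deque

# Hypothetical link graph (page -> pages it links to); not real htdig data.
LINKS = {
    "http://server/": ["http://server/a.html", "http://server/b.html"],
    "http://server/a.html": ["http://server/"],
    "http://server/b.html": [],
    # Nothing above links to dir1, so it is only found if listed explicitly.
    "http://server/dir1/": ["http://server/dir1/page.html"],
    "http://server/dir1/page.html": [],
}

def crawl(start_urls, links):
    """Return the set of pages reachable by following links from start_urls."""
    seen = set()
    queue = deque(start_urls)
    while queue:
        url = queue.popleft()
        if url in seen:
            continue  # already visited; avoids looping on back-links
        seen.add(url)
        queue.extend(links.get(url, []))
    return seen

# Starting from the root alone misses dir1 entirely.
only_root = crawl(["http://server/"], LINKS)

# Adding dir1 as a second start URL brings its pages in.
with_dir1 = crawl(["http://server/", "http://server/dir1/"], LINKS)
```

Here only_root holds the three pages linked from the root, while with_dir1
also picks up dir1 and its page.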
--
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/