According to Dan Langille:
> On 1 Oct 2001 at 17:53, Geoff Hutchison wrote:
> > On Mon, 1 Oct 2001, Dan Langille wrote:
> > > redirect: http://www.unixathome.org/adsl/archives/2001_06/
> > >
> > > Rejected: URL not in the limits!
> >
> > Right. This is what I suspected. In your config file, the limit_urls_to
> > attribute is restricting the indexing from looking at these URLs. So it
> > would help if you could post from your configuration things like:
> >
> > limit_urls_to:
> > exclude_urls:
> > max_hop_count:
>
> limit_urls_to: ${start_url}
> exclude_urls: /cgi-bin/ .cgi /phorum/
> max_hop_count: <== not found in config file.

That's the problem. Your start_url is something like http://unixathome.org/
but the redirect gives http://www.unixathome.org/adsl/..., which doesn't
match the pattern in limit_urls_to, because limit_urls_to has simply taken
on the value of start_url. You should probably set the following in your
htdig.conf:

    limit_urls_to: http://unixathome.org/ http://www.unixathome.org/
    server_aliases: www.unixathome.org:80=unixathome.org:80

The limit_urls_to setting will allow URLs with or without the "www.", and
server_aliases will strip off the "www." so that you don't get duplicates
in the database, with and without the "www." prefix.

If you prefer, you could also set limit_urls_to as...

    limit_urls_to: ${start_url} http://www.unixathome.org/

so any subsequent additions to start_url won't be excluded by limit_urls_to.

-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930
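
To see why the redirect was rejected, here is a rough sketch in Python. It is not ht://Dig code: within_limits is a hypothetical helper that assumes limit_urls_to behaves like a plain substring match against each listed pattern, and it assumes start_url is http://unixathome.org/ as guessed above.

```python
# Hypothetical helper, not part of ht://Dig. It approximates limit_urls_to
# as "the URL must contain at least one of the listed patterns".
def within_limits(url, limit_urls_to):
    return any(pattern in url for pattern in limit_urls_to)


redirect = "http://www.unixathome.org/adsl/archives/2001_06/"

# limit_urls_to: ${start_url}, with start_url assumed to be http://unixathome.org/
original_limits = ["http://unixathome.org/"]

# limit_urls_to: http://unixathome.org/ http://www.unixathome.org/
suggested_limits = ["http://unixathome.org/", "http://www.unixathome.org/"]

print(within_limits(redirect, original_limits))   # False: "www." breaks the match
print(within_limits(redirect, suggested_limits))  # True
```

With only the start_url pattern, the "www." in the redirect target keeps it from matching, which would produce the "URL not in the limits" rejection. Adding the second pattern lets it through.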

