Hi - The problem is with the site you are trying to index. The
site is organized such that the search.shtml file is reachable
through an effectively unlimited number of distinct URLs, bounded
only by the URL length that the software involved can handle. My
first guess is that this is due to one or more symbolic links (or
some other type of redirect) that lead back into the same
directory.

Using a browser, you should find that

http://www.mrs.umn.edu/services/grants/grants/search.shtml
http://www.mrs.umn.edu/services/grants/grants/search/search.shtml
http://www.mrs.umn.edu/services/grants/grants/search/search/search.shtml
http://www.mrs.umn.edu/services/grants/grants/search/search/.../search.shtml

all lead to the same page. In fact, if you just go to

http://www.mrs.umn.edu/services/grants/grants/

and repeatedly click on the "Our search page" link, you will see
that with each click the URL changes to include another 'search'.

If you can't fix the site (or have it fixed), you might want to
try using the exclude_urls attribute to limit what htdig tries
to index.

http://www.htdig.org/attrs.html#exclude_urls
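
As a rough, untested sketch, an entry along these lines in the
htdig configuration file should keep htdig from following the
repeated path. The /search/search/ pattern is an assumption based
on the URLs above; exclude_urls rejects any URL containing one of
the listed substrings. /cgi-bin/ and .cgi are the documented
defaults, kept here because setting the attribute replaces them.

    exclude_urls: /cgi-bin/ .cgi /search/search/

With that pattern, .../grants/grants/search/search.shtml would
still be indexed alongside .../grants/grants/search.shtml, so the
page may show up twice unless a tighter pattern is chosen.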


Jim

Mark Van Overbeke's bits of Tue, 30 Oct 2001 translated to:

>I keep getting this error from my htdig runs:
>
>Not found:
>http://www.mrs.umn.edu/services/grants/grants/search/search/search/search/search/search/search/search/search/search/search/search/search/search/search/search/search/search/search/
[snip]
>
>earch/search/search/search/search/search/search.shtml Ref:

