At 11:12 AM +0200 9/10/01, Felix Kronlage wrote:
>at one of our sites, we have the problem that htdig gets stuck
>in a loop, sending queries to the same URL over and over, but appends
>more and more '?' to the query-string:
Now that's a new one. My guess is that you're using some form of
server-parsed HTML or SHTML, correct? (If not, then what server are
you using?)
That particular page didn't have any strange URLs on it, but my guess
is that another page deeper down inserts a "?" into a URL pointing
back to this page. That's a problem because every relative URL on the
page then resolves to something that looks "different" from a URL
already seen, so it gets indexed again.
I can't think of a legitimate URL with multiple "?" in a query string
like this, so it's probably something the URL parser should ignore.
In the meantime, there's an easy workaround:
exclude_urls: ??
(or add "??" to your normal exclude_urls patterns)
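
As a rough illustration of why this should break the loop: exclude_urls
is a space-separated list of patterns, and a URL containing any of them
is rejected. The Python sketch below only mimics that substring check;
it is not htdig's actual code. The function name, the example.com URLs,
and the pattern list (the usual CGI patterns plus "??") are assumptions
made up for the example.

```python
def is_excluded(url, patterns):
    """Return True if the URL contains any of the substring patterns."""
    return any(p in url for p in patterns)

# Assumed pattern list: typical CGI excludes plus the "??" workaround.
patterns = "/cgi-bin/ .cgi ??".split()

# Hypothetical URLs of the kind described in the report.
urls = [
    "http://www.example.com/page.shtml",
    "http://www.example.com/page.shtml?",
    "http://www.example.com/page.shtml??",
    "http://www.example.com/page.shtml???",
]

# Keep only the URLs that no pattern matches.
kept = [u for u in urls if not is_excluded(u, patterns)]
for u in kept:
    print(u)
```

Note that under this check a URL with a single trailing "?" still gets
through, so the page may be fetched one extra time, but anything with
two or more consecutive "?" is dropped and the loop cannot keep growing.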
Regards, and thanks for the report,
--
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/
_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
To unsubscribe, send a message to <[EMAIL PROTECTED]> with a
subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html