> One rather easy way to accomplish this would be to use the 
> search_rewrite_rules property in htdig.conf, replacing every pageN.html 
> with index.html in the search output before delivering it. However, 
> this might result in several matches in the search result that all 
> point to the same URL.

Yes, exactly.

> The other idea I had was to use the url_rewrite_rules property to 
> rewrite the URLs before entering them into the database. Would this 
> work and index all pages under one URL? Or would only one of the pages 
> (first/last/newest?) be indexed under this URL, while the others would 
> be ignored?

It would ignore everything except the URL that comes out of your 
rewrite rules, so the other pages would not be indexed at all. The 
rewriting is done before the URL is added to the list of URLs to index.

(I had to check the code to be sure of that. See 
htdig/Retriever.cc::got_href if you're curious.)
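
To make the effect concrete, here is a rough sketch in Python. It is a simplified model of "rewrite first, then queue", not the actual ht://Dig code. The rule, the example.com URLs, and the function names are all made up for illustration.

```python
import re

# Hypothetical rule for illustration: map every pageN.html to index.html.
REWRITE_RULES = [(re.compile(r"page[0-9]+\.html$"), "index.html")]


def rewrite(url):
    """Apply each (pattern, replacement) pair to the URL in turn."""
    for pattern, replacement in REWRITE_RULES:
        url = pattern.sub(replacement, url)
    return url


def queue_links(found_hrefs):
    """Rewrite each discovered link first, then queue it only if it is new."""
    seen = set()
    queue = []
    for href in found_hrefs:
        url = rewrite(href)
        if url not in seen:
            seen.add(url)
            queue.append(url)
    return queue


# Made-up links a crawler might find on a page.
links = [
    "http://example.com/docs/page1.html",
    "http://example.com/docs/page2.html",
    "http://example.com/docs/page3.html",
]
print(queue_links(links))
```

All three links collapse to the same index.html entry before anything is fetched, so in this model the pageN.html documents themselves are never retrieved.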

--
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/



_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
To unsubscribe, send a message to <[EMAIL PROTECTED]> with a 
subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html
