At 12:47 AM -0800 3/20/00, Remi Fasol wrote:
>i suppose that's because the urls each have a
>different query string (session-id=123..) and htdig
>doesn't support cookies.
>
>is there a way to have htdig ignore these session-id
>query strings and just index each page once?

You should probably check the archives; the question "can I strip 
off query strings?" comes up periodically. There may even be a 
patch. If not, you could hack htlib/URL.cc so that it ignores 
everything after the ?.
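
To make the idea concrete, here is a rough standalone sketch of that 
kind of URL cleanup. It is not the actual code in htlib/URL.cc, and 
the helper names strip_query and strip_param are made up for 
illustration. The first drops everything from the ? onward; the 
second removes only one named parameter (such as session-id) and 
keeps the rest. Fragments (#...) are not handled here.

```cpp
#include <string>

// Hypothetical helper (not part of ht://Dig): drop the whole query string.
std::string strip_query(const std::string& url) {
    std::string::size_type q = url.find('?');
    return q == std::string::npos ? url : url.substr(0, q);
}

// Hypothetical helper: remove only the parameter called `name`
// (e.g. "session-id") and keep every other key=value pair.
std::string strip_param(const std::string& url, const std::string& name) {
    std::string::size_type q = url.find('?');
    if (q == std::string::npos) return url;          // no query string at all

    std::string result = url.substr(0, q);           // part before the '?'
    std::string query = url.substr(q + 1);           // part after the '?'
    char sep = '?';                                  // separator for the next kept pair
    std::string::size_type start = 0;

    while (start <= query.size()) {
        std::string::size_type end = query.find('&', start);
        if (end == std::string::npos) end = query.size();
        std::string pair = query.substr(start, end - start);
        std::string key = pair.substr(0, pair.find('='));
        if (!pair.empty() && key != name) {          // keep everything except `name`
            result += sep;
            result += pair;
            sep = '&';
        }
        start = end + 1;
    }
    return result;
}
```

Stripping the whole query string is the simpler hack, but it also 
collapses pages that differ only by a real parameter (page=2, say), 
so removing just the session parameter may be the safer choice.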

But this raises another question--why are you spitting back different 
session IDs? This summer I was doing similar things and we were very 
careful to leave the session ID alone. It would be quite confusing 
for a browser (much less htdig) to have the session change.

Also, you might consider adding code to ignore the session ID on the 
server-side if the User-Agent is htdig. Obviously if it's easier to 
change one part of the code or another, you go that way. ;-)
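
The server-side variant could look something like the sketch below. 
It assumes the crawler's User-Agent header contains the string 
"htdig" (check what your htdig configuration actually sends), and 
the functions is_htdig_agent and make_link are hypothetical names, 
not part of any real server API. The idea is just to leave the 
session ID off generated links when the indexer is asking.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Assumption: the indexer identifies itself with "htdig" somewhere in
// its User-Agent header. Matching is case-insensitive.
bool is_htdig_agent(const std::string& user_agent) {
    std::string lower(user_agent);
    std::transform(lower.begin(), lower.end(), lower.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return lower.find("htdig") != std::string::npos;
}

// Hypothetical link builder: append the session ID for ordinary
// browsers, but emit a clean URL for the indexer.
std::string make_link(const std::string& path,
                      const std::string& session_id,
                      const std::string& user_agent) {
    if (session_id.empty() || is_htdig_agent(user_agent)) return path;
    return path + "?session-id=" + session_id;
}
```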

--
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/
