Hello all,

I am trying to index a site where a couple of pages are redirected, and I
want to keep the address that htdig sees rather than the one the
redirectservlet returns.

My start_url is http://mydomain.com/index.jsp?pageid=4006.

This page has a list of links with addresses like:
http://mydomain.com/redirectservlet/index.jsp/page_id/4006/Topic1
http://mydomain.com/redirectservlet/index.jsp/page_id/4006/Topic2
http://mydomain.com/redirectservlet/index.jsp/page_id/4006/Topic3

When one clicks on any of the above addresses and the page loads, the address
reverts to http://mydomain.com/index.jsp?pageid=4006.

So when I index the site, all my links are still
http://mydomain.com/index.jsp?pageid=4006

I tried using url_rewrite_rules to get around this and keep the addresses
that htdig sees:

url_rewrite_rules:      (.*) \\0

I have also tried variations on this, e.g.:
url_rewrite_rules:      .* \\0
url_rewrite_rules:      (.*) \\1
etc.
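
As far as I understand them, each of these rules just maps a URL to itself.
Here is a rough Python sketch of that reading. It is not htdig: Python's re
module stands in for htdig's own regex matching (which may behave
differently), and apply_rewrite is only my own helper name.

```python
import re


def apply_rewrite(url: str, pattern: str, replacement: str) -> str:
    """Apply one (pattern, replacement) pair to a URL, first match only.

    Hypothetical helper for illustration; not part of htdig.
    """
    return re.sub(pattern, replacement, url, count=1)


urls = [
    "http://mydomain.com/redirectservlet/index.jsp/page_id/4006/Topic1",
    "http://mydomain.com/index.jsp?pageid=4006",
]

# Python writes the whole-match backreference as \g<0>; the htdig config
# above writes it as \\0. Both forms below are identity rewrites.
rewritten_whole = [apply_rewrite(u, r"(.*)", r"\g<0>") for u in urls]
rewritten_group = [apply_rewrite(u, r"(.*)", r"\1") for u in urls]

for before, after in zip(urls, rewritten_whole):
    print(before, "->", after)
```

If that reading is right, these rules leave every URL exactly as it was
handed to them, whichever address that happens to be.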

Nothing seems to work.  Granted, regular expressions are new to me, but it
seems like the above pairs should work.  I was able to dump out a URL list
with the url_list attribute, and everything there looks fine.  But when I do
a search, all the topics have the start_url address.  Can anyone help?  I'm
in a real pinch.

Many thanks,

Soriana Villanueva


