Thanks Jim, I had a feeling htdig was doing this for a good reason, but it's good to know for sure. I think I'll change my CGI scripts to place a dummy value in the URL instead of hacking URL.cc. That will also bring the URLs in line with the standard.
Thanks again
Craig

On Mon, 2003-07-07 at 23:06, Jim Cole wrote:
> On Monday, July 7, 2003, at 03:52 PM, Craig Taylor wrote:
>
> > I have urls on my site that are in the format:
> > http://www.whatever.com/test.cgi//88
> >
> > When I run htdig and htmerge I get search results with the url above
> > changed to: http://www.whatever.com/test.cgi/88 missing the second
> > forward slash.
>
> The problem is that the URLs you are using are not valid. Although they
> may work fine in other situations, they do not comply with the RFC that
> defines URLs. You are trying to pass a '/' as ordinary data while the
> RFC defines the '/' as a reserved character. At least that is my
> reading. Since consecutive '/' characters are not allowed for, htdig
> collapses them into a single '/'. The reason that htdig goes out of its
> way to change URLs in this fashion is that they have the potential to
> create loops as htdig spiders through a site.
>
> My guess is that you are not going to find an easy workaround.
> Normalization of the URL path occurs very early in the process, during
> the initial parse of the URL. If you want to look at the code that does
> the normalization, and perhaps hack it to your needs, look for the
> normalizePath() method in URL.cc.
>
> Jim
>
-- 
Politics is supposed to be the second-oldest profession. I have come to
realize that it bears a very close resemblance to the first.
--Ronald Reagan
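
For anyone who finds this thread later, here is a rough sketch of the kind of slash collapsing Jim describes. It is NOT the code from htdig's normalizePath() in URL.cc; collapseSlashes() is a made-up helper, and the choice to leave the query string and fragment untouched is an assumption of this sketch, not something confirmed about htdig.

```cpp
#include <string>

// Hypothetical helper (not htdig's normalizePath()): collapse runs of '/'
// in the path part of a URL into a single '/'.
// Assumptions of this sketch: the "//" after the scheme is kept, and
// anything from the first '?' or '#' onward is copied unchanged.
std::string collapseSlashes(const std::string &url)
{
    // Skip past "scheme://" if present so those slashes are preserved.
    std::string::size_type start = 0;
    std::string::size_type scheme = url.find("://");
    if (scheme != std::string::npos)
        start = scheme + 3;

    // Only the portion before any query string or fragment is normalized.
    std::string::size_type end = url.find_first_of("?#", start);
    if (end == std::string::npos)
        end = url.size();

    std::string out = url.substr(0, start);
    bool prevSlash = false;
    for (std::string::size_type i = start; i < end; ++i) {
        char c = url[i];
        if (c == '/' && prevSlash)
            continue;               // drop the repeated slash
        out += c;
        prevSlash = (c == '/');
    }

    out += url.substr(end);         // query/fragment copied verbatim
    return out;
}
```

With this sketch, http://www.whatever.com/test.cgi//88 comes out as http://www.whatever.com/test.cgi/88, which matches the behaviour reported above. A URL with a dummy value between the slashes, for example http://www.whatever.com/test.cgi/id/88 (the "id" segment is only an illustration), has no consecutive slashes and passes through unchanged, which is why the CGI-side workaround avoids the problem.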

