I tried the htdig-3.2.0b4-20020728 snapshot and I am still having redirect problems and trouble accessing some PDFs.
First issue: I thought there was a problem getting files with applets. What I found out is that the web page it is trying to dig has so many URLs in it that htdig can't get through all of them in 30 seconds. If I take out half of the URLs, then it can. This page is the main page of our web site and really needs to be indexed. Is there some way I can extend the time-out value?

My guess is that the PDF problem is related to the time-out problem above, because when I run htdig -vvvvvvv, I can see it retrieving the raw data from the PDF file, but then the connection is dropped.

For the redirect problem, here are parts of the log from running htdig -vvvvvvv. https://myhost.com/csosbase gets redirected to https://myhost.com/csosbase/, but then htdig tries to get https://myhost.com/csosbase again instead of the redirected page.

1:1:https://myhost.com/csosbase
pushed
1:3:0:https://myhost.com/csosbase: Making HTTPS request on https://myhost.com/csosbase
Try to get through to host myhost.com (port 443)
2 - Connection already open. No need to re-open.
Connecting via TCP to (myhost.com:443)
Taking advantage of persistent connections
Request
GET /csosbase HTTP/1.1^M
Host: myhost.com^M
User-Agent: htdig^M
Authorization: Basic d2ViYWRtaW46dHUzNWRheQ==^M
^M
Header line: HTTP/1.1 302 Moved Temporarily
Header line: Server: Netscape-Enterprise/6.0
Header line: Date: Thu, 01 Aug 2002 20:28:24 GMT
Header line: Location: https://myhost.com/csosbase/
Header line: Content-length: 0
Header line: Content-type: text/html
No modification time returned: assuming now
Retrieving document /csosbase on host: myhost.com:443
Http version : HTTP/1.1
Server : HTTP/1.1
Status Code : 302
Reason : Moved Temporarily
Access Time : Thu, 01 Aug 2002 20:28:24 PST
Modification Time : Thu, 01 Aug 2002 20:27:12 PST
Content-type : text/html
Persistent connection: would be accepted
Body not retrieved
Connection stays up ... (Persistent connection)
Request time: 0 secs
Contents:
Content Type: text/html
Content Length: 0
Modification Time: 2002-08-01 20:27:12 PST
redirect
redirect: https://myhost.com/csosbase
resolving 'https://myhost.com/csosbase'
pick: myhost.com, # servers = 1
> myhost.com supports HTTP persistent connections (infinite)
htdig: Run complete
htdig: 1 server seen:
htdig: myhost.com:443 2 documents

Gilles Detillieux wrote:
> According to Rob Kremer:
>
>>I am seeing three problems when digging a SSL enabled web server.
>>
>>1. It has a lot of trouble with SOME pdf files, "connection down" message. It
>>doesn't seem to be a problem with the size of the PDF, I can get it to dig a
>>43Mb file, but then it can't dig a 127Kb file. It isn't consistent, about once
>>out of 6 tries I can get it to dig the file.
>>2. Unless a URL to a directory has a '/' after it, it will say it is
>>redirecting, but will not dig the redirected page.
>>3. Trouble indexing files with applets, "connection down" message. Running
>>rundig -vvvvv -s, I can see that it is receiving data, but then it stops, as if
>>the connection is broken.
>>
>>All of these work when digging a non-SSL enabled web server on the same system,
>>using the same htdig.conf file, only changing http to https.
>>
>>This is htdig-3.2.0b4-20020721, OpenSSL-0.9.6d, Solaris 8, web server is iPlanet
>>6.0 SP2. I am running this from a remote server.
>>
>
> There were problems with the last several snapshots, up to the one
> of July 21. It turns out it was grabbing an old branch of the tree,
> without any updates since late January. Also, there was a small fix to
> the SSL code last Saturday. See if the htdig-3.2.0b4-20020728 snapshot
> doesn't fix most or all of these problems.
>
> If the problem with redirects persists, try running with more than one
> -v, to see what URL htdig gets from the redirect.
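
For reference, the redirect handling I would expect can be sketched in a few lines of Python. This is only an illustration of the expected behavior, not htdig's actual code: resolve_redirect is a hypothetical helper name, and the two URLs are the request URL and the Location header value taken from the log.

```python
from urllib.parse import urljoin


def resolve_redirect(request_url: str, location: str) -> str:
    """Resolve a Location header value against the URL that was requested.

    Hypothetical helper for illustration; htdig's own logic may differ.
    """
    return urljoin(request_url, location)


# URL that was requested, and the Location header the server sent back.
requested = "https://myhost.com/csosbase"
location = "https://myhost.com/csosbase/"

target = resolve_redirect(requested, location)

# The next request should go to the redirect target (with the trailing
# slash), which is a different URL from the one originally requested.
print("next URL to fetch:", target)
print("differs from original:", target != requested)
```

In the log, the "redirect:" line shows the URL without the trailing slash, which is what makes me think the Location value is being lost or normalized somewhere before the next request.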
-- 
Rob Kremer
JPL Cassini SA
818-393-1283 Fax: 393-4658
Office 230-311 M/S 230-310

_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
To unsubscribe, send a message to <[EMAIL PROTECTED]> with a subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html

