According to Wang, Mary Y:
> This is my htdig.conf file
> start_url:
> http://prodbrass.web.boeing.com/mhonarchive/test-ulysses-final
> limit_urls_to: ${start_url}
> the others are the default stuff
> but when I rundig -c /usr/htdig/htdig.conf -vvvv
> I got the following...
...
> 1:1:http://prodbrass.web.boeing.com/mhonarchive/test-ulysses-final
> New server: prodbrass.web.boeing.com, 80
...
> > prodbrass.web.boeing.com with a traditional HTTP connection
> 0:2:0:http://prodbrass.web.boeing.com/mhonarchive/test-ulysses-final: Making
> HTTP request on
> http://prodbrass.web.boeing.com/mhonarchive/test-ulysses-final
> Header line: HTTP/1.1 301 Moved Permanently
> Header line: Date: Thu, 08 Aug 2002 00:38:24 GMT
> Header line: Server: Apache/1.3.20 (Unix) (Red-Hat/Linux) mod_python/2.7.6
> Python/1.5.2 mod_ssl/2.8.4 OpenSSL/0.9.6b DAV/1.0.2 PHP/4.2.2
> mod_perl/1.24_01 mod_throttle/3.1.2
> Header line: Location:
> http://prodbrass.web.boeing.com/mhonarchive/test-ulysses-final/
...
> redirect: http://prodbrass.web.boeing.com/mhonarchive/test-ulysses-final/
> resolving 'http://prodbrass.web.boeing.com/mhonarchive/test-ulysses-final/'
> pushing http://prodbrass.web.boeing.com/mhonarchive/test-ulysses-final/
> pick: prodbrass.web.boeing.com, # servers = 1
> > prodbrass.web.boeing.com with a traditional HTTP connection
> 1:3:0:http://prodbrass.web.boeing.com/mhonarchive/test-ulysses-final/:
> Making HTTP request on
> http://prodbrass.web.boeing.com/mhonarchive/test-ulysses-final/
> Header line: HTTP/1.1 200 OK
...
> Header line: Connection: close
> Header line: Transfer-Encoding: chunked
> Header line: Content-Type: text/html
> No modification time returned: assuming now
> Retrieving document /mhonarchive/test-ulysses-final/ on host:
> prodbrass.web.boeing.com:80
> Http version : HTTP/1.1
> Server : HTTP/1.1
> Status Code : 200
> Reason : OK
> Access Time : Thu, 08 Aug 2002 00:38:24 GMT
> Modification Time : Thu, 08 Aug 2002 00:38:29 GMT
> Content-type : text/html
> Transfer-encoding : chunked
> Connection : close
> Request time: 5 secs
> -----------------------------------------------------------------------
> What does it mean by "moved permanently" in the reason field? It actually
> said pushing
> http://prodbrass.web.boeing.com/mhonarchive/test-ulysses-final/, but
> it didn't index any files under that directory.

The "Moved Permanently" message is the standard description of a 301 return code, which signals a redirect. It's nothing to worry about, as htdig does handle redirects. It's also standard procedure for a web server, when given a request for a directory URL that's missing the trailing slash, to redirect the client to the corrected URL with the trailing slash. This is to prevent the client from having problems interpreting links relative to that directory.

You should have the trailing slash on your start_url to avoid the extra redirect. Its absence shouldn't be a problem, but it does cause unnecessary extra traffic.

Unfortunately, your output excerpt ends just when it gets interesting, right after htdig fetches the "test-ulysses-final/" directory listing. Was there any output after that? htdig should have parsed a bunch of hrefs at that point.

I did notice that your server is using chunked encoding. I think there were problems with reading chunked input as recently as the Feb. 3/02 snapshot of 3.2.0b4. You didn't mention which one you're running, but if your snapshot is older than Feb. 10, you may want to upgrade.

--
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/
Dept. Physiology, U. of Manitoba  Winnipeg, MB  R3E 3J7  (Canada)

_______________________________________________
htdig-general mailing list
FAQ: http://htdig.sourceforge.net/FAQ.html
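
To illustrate the point above about relative links, here is a small Python sketch (not htdig code) showing how a client would resolve the same relative href against the directory URL with and without the trailing slash. The file name msg00001.html is a made-up example; nothing in the posted output says which files the directory actually contains.

```python
from urllib.parse import urljoin

# Hypothetical relative link, as it might appear in the directory's index page.
relative_href = "msg00001.html"

without_slash = "http://prodbrass.web.boeing.com/mhonarchive/test-ulysses-final"
with_slash = without_slash + "/"

# Without the trailing slash, the last path segment is treated like a file
# name, so the relative link replaces it and lands one level too high.
wrong = urljoin(without_slash, relative_href)

# With the trailing slash, the link resolves inside the directory.
right = urljoin(with_slash, relative_href)

print(wrong)
print(right)
```

The first result points at /mhonarchive/msg00001.html, the second at /mhonarchive/test-ulysses-final/msg00001.html, which is why the server insists on the redirect before handing out the listing.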

