According to Terry Poperszky:
> On Tue, 2003-02-18 at 06:44, Terry Poperszky wrote:
> > htdig-3.1.6
> > suse 8.1
> >
> > I am having trouble with a site that is running frames and coldfusion,
> > htdig only parses the top page. The problem doesn't appear to be solely
> > frames since I can parse the http://www.htdig.org with no problem. Not
> > sure what other information you might need.
...
> This is the result of running htdig -vvv
>
> 0:1:http://sosnet.sosstaffing.com/index.cfm
> New server: sosnet.sosstaffing.com, 80
> Retrieval command for http://sosnet.sosstaffing.com/robots.txt: GET
> /robots.txt HTTP/1.0^M
> User-Agent: htdig/3.1.6 ([EMAIL PROTECTED])^M
> Host: sosnet.sosstaffing.com^M
> ^M
> Header line: HTTP/1.1 404 Object Not Found
> Header line: Server: Microsoft-IIS/5.0
> Header line: Date: Tue, 18 Feb 2003 14:00:09 GMT
> Header line: Content-Length: 4040
> Header line: Content-Type: text/html
> Header line:
> returnStatus = 1
> pushed

OK, and then what happens? All the above tells me is that htdig got a
404 error when attempting to fetch /robots.txt from your server. That's
no big surprise, as the majority of web servers don't have one.

I assume, though, that it then tries to fetch what would appear to be
your start_url, shown as "http://sosnet.sosstaffing.com/index.cfm" in
the output above. Either it's failing on that, or it gets it just fine
but rejects all other links because they don't match the start_url.
Note that the default value of limit_urls_to is the same as the
start_url. See

    http://www.htdig.org/attrs.html#limit_urls_to
    http://www.htdig.org/attrs.html#start_url
and
    http://www.htdig.org/FAQ.html#q5.27

-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/
Dept. Physiology, U. of Manitoba  Winnipeg, MB  R3E 3J7  (Canada)
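
To illustrate the limit_urls_to point, here is a rough Python sketch of why every link other than the start page could be rejected. It is a simplified model, not htdig's actual code: it assumes a URL is accepted when it contains at least one of the limit_urls_to patterns. The function name is_allowed and the URL menu.cfm are hypothetical; the real frame sources on the site are not shown in this thread.

```python
# Simplified model of the limit_urls_to check (an assumption, not htdig's code):
# a URL is kept when it contains at least one of the configured patterns.

def is_allowed(url, limit_urls_to):
    """Return True if url contains any pattern listed in limit_urls_to."""
    return any(pattern in url for pattern in limit_urls_to)

# start_url as shown in the htdig -vvv output.
start_url = "http://sosnet.sosstaffing.com/index.cfm"

# Hypothetical frame target on the same server (not taken from the thread).
frame_url = "http://sosnet.sosstaffing.com/menu.cfm"

# By default limit_urls_to takes the value of start_url, so only URLs
# containing the full ".../index.cfm" string would pass.
default_limit = [start_url]

# Widening the limit to the server root would let other pages through.
widened_limit = ["http://sosnet.sosstaffing.com/"]

print(is_allowed(frame_url, default_limit))
print(is_allowed(frame_url, widened_limit))
```

If this is what is happening, setting limit_urls_to to the server root in the htdig configuration file, rather than leaving it at the default, should let the framed pages be indexed. Whether that is the actual cause here still depends on what the rest of the -vvv output shows.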

