According to Terry Poperszky:
> On Tue, 2003-02-18 at 06:44, Terry Poperszky wrote:
> > htdig-3.1.6
> > suse 8.1
> > 
> > I am having trouble with a site that is running frames and ColdFusion:
> > htdig only parses the top page. The problem doesn't appear to be solely
> > frames since I can parse http://www.htdig.org with no problem. Not
> > sure what other information you might need.

...
> This is the result of running htdig -vvv
> 
>         0:1:http://sosnet.sosstaffing.com/index.cfm
> New server: sosnet.sosstaffing.com, 80
> Retrieval command for http://sosnet.sosstaffing.com/robots.txt: GET
> /robots.txt HTTP/1.0^M
> User-Agent: htdig/3.1.6 ([EMAIL PROTECTED])^M
> Host: sosnet.sosstaffing.com^M
> ^M
> Header line: HTTP/1.1 404 Object Not Found
> Header line: Server: Microsoft-IIS/5.0
> Header line: Date: Tue, 18 Feb 2003 14:00:09 GMT
> Header line: Content-Length: 4040
> Header line: Content-Type: text/html
> Header line:
> returnStatus = 1
>  pushed

OK, and then what happens?  All the above tells me is that htdig got
a 404 error when attempting to fetch /robots.txt from your server.
That's no big surprise, as the majority of web servers don't have one.

I assume, though, that it then tries to fetch what would appear to be
your start_url, shown as "http://sosnet.sosstaffing.com/index.cfm" in
the output above.  Either it's failing on that, or it gets it just fine,
but rejects all other links because they don't match the start_url.
Note that the default value of limit_urls_to is the same as the start_url.

See http://www.htdig.org/attrs.html#limit_urls_to
    http://www.htdig.org/attrs.html#start_url
and http://www.htdig.org/FAQ.html#q5.27
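
If the second case is what's happening, a config along these lines may
help.  This is only a sketch: I'm assuming the whole site lives under
the one host name shown in your output and that you want all of it
indexed, and I haven't tried it against your server.  The attribute
names are the ones documented at the links above; the values are
guesses based on your log.

```
# Where the dig starts (taken from the -vvv output above).
start_url:      http://sosnet.sosstaffing.com/index.cfm

# Assumed fix: limit to the host rather than to the single start page.
# Left unset, this defaults to ${start_url}, so a link such as
# /other.cfm would not contain the pattern and would be rejected.
limit_urls_to:  http://sosnet.sosstaffing.com/
```

After changing it, another run with htdig -vvv should show whether the
links found on index.cfm are now being pushed or still rejected.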

-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/
Dept. Physiology, U. of Manitoba  Winnipeg, MB  R3E 3J7  (Canada)

