> According to Katherine Porter: 
> > Geoff - actually, I did turn off persistent_connections and ignore 
> > head_before_get, and I also tried setting head_before_get. Only 
> > head_before_get: true results in a good crawl. 
> > 
> > > > We have an old web server here that's running "CERN/TSX-32 WWW server 
> > > > version 3.0". The web server only supports HTTP/1.0. Unless I turn 
> > > 
> > > That's OK. Plenty of servers still only support HTTP/1.0. 
> > > 
> > > > on "head_before_get" in my configuration file, it won't even attempt 
> > > > to pull down a single URL from the server. 
> > > 
> > > Try turning off persistent_connections and ignore the head_before_get 
> > > setting: 
> > > <http://www.htdig.org/dev/htdig-3.2/attrs.html#persistent_connections> 
> > > 
> > > My guess is that the HTTP/1.1 persistent connection code isn't properly 
> > > downgrading the connection for the HTTP/1.0 server. 
> 
> Hmm. I'd be interested in knowing if 3.1.5, or the latest 3.1.6 snapshot, 
> has any problems with this same web server. If so, I'd suspect a problem 
> with the server itself. If not, then it's likely a bug in the 3.2 
> HTTP code. 3.1.x doesn't support HTTP/1.1 or persistent connections, 
> but it doesn't do HEAD requests before GET requests either. If your 
> server needs HEAD requests, I'd think that's a bug. On the other hand, 
> more likely what's happening is a side-effect in the head_before_get 
> implementation is sidestepping a bug in the 3.2 HTTP code. 

Gilles - 3.1.5 has no problems with it at all.  The web site is www.dol.net, if you'd 
like to test it yourself.
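
For reference, a rough sketch of the 3.2 configuration fragment in question,
assuming the attribute names from the attrs.html page quoted above. The
start_url line is only illustrative, and the rest of the configuration file
is omitted:

    # htdig 3.2 configuration fragment (sketch, not the full file)
    start_url:              http://www.dol.net/

    # Turning this off alone did not give a good crawl against the
    # HTTP/1.0 server.
    persistent_connections: false

    # Only with this set to true does the crawl succeed.
    head_before_get:        true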

[.kate]

_______________________________________________
htdig-dev mailing list
[EMAIL PROTECTED]
http://lists.sourceforge.net/lists/listinfo/htdig-dev
