On Mon, May 04, 2009 at 04:30:15PM -0700, Ken Krugler wrote:
> Hi all,
>
> In HttpClient 3.1, the Nutch code base would configure timeouts using the
> following snippet of code:
>
>     MultiThreadedHttpConnectionManager connectionManager =
>           new MultiThreadedHttpConnectionManager();
>
>     HttpClient client = new HttpClient(connectionManager);
>
>     HttpConnectionManagerParams params = connectionManager.getParams();
>     params.setConnectionTimeout(timeout);
>     params.setSoTimeout(timeout);
>
>     // executeMethod(HttpMethod) seems to ignore the connection timeout
>     // on the connection manager.
>     // set it explicitly on the HttpClient.
>
>     client.getParams().setConnectionManagerTimeout(timeout);
>
> What's the functional equivalent in 4.0? I'm assuming that:
>
>     HttpParams params = new BasicHttpParams();
>     ConnManagerParams.setTimeout(params, timeout);
>
> is equivalent to the 3.1 call to params.setConnectionTimeout(timeout). 
> But what about the setSoTimeout() call?
>

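HTTP parameters (all values in milliseconds):

'http.socket.timeout' sets the socket timeout (wait for data)
'http.connection.timeout' sets the connect timeout (wait to establish a
connection)
'http.conn-manager.timeout' sets the connection request timeout (wait to
obtain a connection from the connection manager)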

Corresponding utility methods:

HttpParams params = new BasicHttpParams(); 
HttpConnectionParams.setSoTimeout(params, 2000);
HttpConnectionParams.setConnectionTimeout(params, 2000);
ConnManagerParams.setTimeout(params, 2000);
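
To illustrate what the socket and connect timeouts mean underneath, here is a
minimal sketch that uses plain java.net.Socket rather than HttpClient itself.
The class and method names (SocketTimeoutSketch, timesOut,
silentServerTimesOut) are made up for this example, and the local "silent"
server only stands in for a host that accepts a connection and then never
responds.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketAddress;
import java.net.SocketTimeoutException;

// Hypothetical helper class, not part of HttpClient.
class SocketTimeoutSketch {

    /** Returns true if connecting or waiting for the first response byte timed out. */
    static boolean timesOut(SocketAddress target, int connectTimeoutMs, int soTimeoutMs)
            throws IOException {
        try (Socket socket = new Socket()) {
            // Connect timeout: upper bound on establishing the TCP connection.
            socket.connect(target, connectTimeoutMs);
            // Socket timeout (SO_TIMEOUT): upper bound on each blocking read.
            socket.setSoTimeout(soTimeoutMs);
            socket.getInputStream().read();
            return false;
        } catch (SocketTimeoutException e) {
            return true;
        }
    }

    /** Connects to a local server that accepts connections but never sends any data. */
    static boolean silentServerTimesOut(int soTimeoutMs) {
        try (ServerSocket silent = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            SocketAddress target =
                    new InetSocketAddress(silent.getInetAddress(), silent.getLocalPort());
            return timesOut(target, 2000, soTimeoutMs);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("read timed out: " + silentServerTimesOut(500));
    }
}
```

Note that the socket timeout applies to each individual read, not to the
request as a whole, so a server that trickles data slowly can still keep a
request open for longer than the configured value.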

The essential bits about HTTP request execution and connection management are
documented here:

http://wiki.apache.org/HttpComponents/HttpClientTutorial

The text has not been proof-read yet and is still very much work in progress,
but it may be a reasonable reference even in its present form.

> One reason I'm asking is that I ran into a very long timeout while  
> trying to fetch a page. The wire log looked like:
>
> 09/05/04 16:03:39 DEBUG client.DefaultRequestDirector:408 - Attempt 1 to execute request
> 09/05/04 16:03:39 DEBUG http.wire:78 - >> "GET /noticias/elrostrodeanaliacanal9-argentina-elrostrodeanalia-canal9/ HTTP/1.1[EOL]"
> 09/05/04 16:03:39 DEBUG http.wire:78 - >> "Host: telenovelas.censuratv.net[EOL]"
> 09/05/04 16:03:39 DEBUG http.wire:78 - >> "Connection: Keep-Alive[EOL]"
> 09/05/04 16:03:39 DEBUG http.wire:78 - >> "User-Agent: bixo[EOL]"
> 09/05/04 16:03:39 DEBUG http.wire:78 - >> "[EOL]"
> 09/05/04 16:03:39 DEBUG http.headers:251 - >> GET /noticias/elrostrodeanaliacanal9-argentina-elrostrodeanalia-canal9/ HTTP/1.1
> 09/05/04 16:03:39 DEBUG http.headers:254 - >> Host: telenovelas.censuratv.net
> 09/05/04 16:03:39 DEBUG http.headers:254 - >> Connection: Keep-Alive
> 09/05/04 16:03:39 DEBUG http.headers:254 - >> User-Agent: bixo
> 09/05/04 16:13:32 DEBUG conn.DefaultClientConnection:160 - Connection closed
> 09/05/04 16:13:32 DEBUG client.DefaultRequestDirector:414 - Closing the connection.
> 09/05/04 16:13:32 DEBUG conn.DefaultClientConnection:160 - Connection closed
> 09/05/04 16:13:32 INFO client.DefaultRequestDirector:418 - I/O exception (org.apache.http.NoHttpResponseException) caught when processing request: The target server failed to respond
> 09/05/04 16:13:32 DEBUG client.DefaultRequestDirector:423 - The target server failed to respond
> 09/05/04 16:13:32 INFO client.DefaultRequestDirector:425 - Retrying request
> 09/05/04 16:13:32 DEBUG client.DefaultRequestDirector:433 - Reopening the direct connection.
> 09/05/04 16:14:24 DEBUG conn.DefaultClientConnection:147 - Connection shut down
> 09/05/04 16:14:24 DEBUG tsccm.ThreadSafeClientConnManager:223 - Released connection is not reusable.
> 09/05/04 16:14:24 DEBUG tsccm.ConnPoolByRoute:374 - Releasing connection [HttpRoute[{}->http://telenovelas.censuratv.net]][null]
> 09/05/04 16:14:24 DEBUG tsccm.ConnPoolByRoute:631 - Notifying no-one, there are no waiting threads
> 09/05/04 16:14:24 DEBUG http.HttpClientFetcher:267 - Exception while fetching url http://telenovelas.censuratv.net/noticias/elrostrodeanaliacanal9-argentina-elrostrodeanalia-canal9/
> java.net.UnknownHostException: telenovelas.censuratv.net
>
> So the first try failed after about 10 minutes with an I/O exception
> (org.apache.http.NoHttpResponseException), then the retry failed much
> faster (about 50 seconds) with a java.net.UnknownHostException.
>
> I'm guessing that maybe the real cause of the first long timeout was my 
> DNS system timing out while trying to resolve the invalid server  
> address, and then this "bad hostname" result was cached so that the  
> retry failed faster.
>

You can plug in a custom DNS name resolver using this interface:

http://hc.apache.org/httpcomponents-client/httpclient/apidocs/org/apache/http/conn/scheme/HostNameResolver.html
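
As a rough sketch of the kind of lookup such a resolver could delegate to, the
code below bounds a DNS lookup with a timer using only the standard library.
The class name TimedDnsLookup, its resolve method, and the thread name are
hypothetical, and wiring it into a HostNameResolver implementation is left
out; treat it as one possible approach rather than the recommended one.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper class, not part of HttpClient.
class TimedDnsLookup {

    // Daemon threads, so a lookup stuck in the OS resolver cannot keep the JVM alive.
    private static final ExecutorService POOL = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r, "dns-lookup");
        t.setDaemon(true);
        return t;
    });

    /** Resolves hostname, giving up with an IOException after timeoutMs milliseconds. */
    static InetAddress resolve(String hostname, long timeoutMs) throws IOException {
        Callable<InetAddress> task = () -> InetAddress.getByName(hostname);
        Future<InetAddress> lookup = POOL.submit(task);
        try {
            return lookup.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // The caller stops waiting; the blocked lookup itself may still
            // run to completion in its background thread.
            lookup.cancel(true);
            throw new IOException("DNS lookup timed out after " + timeoutMs + " ms: " + hostname);
        } catch (ExecutionException e) {
            Throwable cause = e.getCause();
            if (cause instanceof IOException) {
                throw (IOException) cause; // e.g. UnknownHostException
            }
            throw new IOException("DNS lookup failed: " + hostname, cause);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IOException("Interrupted while resolving " + hostname);
        }
    }
}
```

A call such as TimedDnsLookup.resolve(host, 10000) would then fail after
roughly 10 seconds instead of waiting for the system resolver to give up.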



> But independent of the above, I'm interested in the best way to prevent 
> all cases of long timeouts, with 4.0.
>

I believe setting socket and connect timeouts to some reasonable value, say 30
seconds, should be sufficient.

Hope this helps

Oleg


> Thanks much!
>
> -- Ken
>
> PS - I could do my own DNS resolver that maps hostnames to IP addresses, 
> and wrap this with a timer to fail after 10 seconds or so.
> -- 
> Ken Krugler
> +1 530-210-6378
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
>
