On Mon, May 04, 2009 at 04:30:15PM -0700, Ken Krugler wrote:
> Hi all,
>
> In Http 3.1, the Nutch code base would configure timeouts using the
> following snippet of code:
>
>     MultiThreadedHttpConnectionManager connectionManager =
>         new MultiThreadedHttpConnectionManager();
>
>     HttpClient client = new HttpClient(connectionManager);
>
>     HttpConnectionManagerParams params = connectionManager.getParams();
>     params.setConnectionTimeout(timeout);
>     params.setSoTimeout(timeout);
>
>     // executeMethod(HttpMethod) seems to ignore the connection timeout
>     // on the connection manager.
>     // set it explicitly on the HttpClient.
>
>     client.getParams().setConnectionManagerTimeout(timeout);
>
> What's the functional equivalent in 4.0? I'm assuming that:
>
>     HttpParams params = new BasicHttpParams();
>     ConnManagerParams.setTimeout(params, timeout);
>
> is equivalent to the 3.1 call to params.setConnectionTimeout(timeout).
> But what about the setSoTimeout() call?
>

HTTP parameters:

'http.socket.timeout' sets the socket timeout
'http.connection.timeout' sets the connect timeout
'http.conn-manager.timeout' sets the connection request timeout

Corresponding utility methods:

    HttpParams params = new BasicHttpParams();
    HttpConnectionParams.setSoTimeout(params, 2000);
    HttpConnectionParams.setConnectionTimeout(params, 2000);
    ConnManagerParams.setTimeout(params, 2000);

The essential bits about HTTP request execution and connection
management are documented here:

http://wiki.apache.org/HttpComponents/HttpClientTutorial

The text has not been proofread yet and is still very much a work in
progress, but it may be a reasonable reference even in its present form.

> One reason I'm asking is that I ran into a very long timeout while
> trying to fetch a page. The wire log looked like:
>
> 09/05/04 16:03:39 DEBUG client.DefaultRequestDirector:408 - Attempt 1 to
> execute request
> 09/05/04 16:03:39 DEBUG http.wire:78 - >> "GET
> /noticias/elrostrodeanaliacanal9-argentina-elrostrodeanalia-canal9/
> HTTP/1.1[EOL]"
> 09/05/04 16:03:39 DEBUG http.wire:78 - >> "Host:
> telenovelas.censuratv.net[EOL]"
> 09/05/04 16:03:39 DEBUG http.wire:78 - >> "Connection: Keep-Alive[EOL]"
> 09/05/04 16:03:39 DEBUG http.wire:78 - >> "User-Agent: bixo[EOL]"
> 09/05/04 16:03:39 DEBUG http.wire:78 - >> "[EOL]"
> 09/05/04 16:03:39 DEBUG http.headers:251 - >> GET
> /noticias/elrostrodeanaliacanal9-argentina-elrostrodeanalia-canal9/
> HTTP/1.1
> 09/05/04 16:03:39 DEBUG http.headers:254 - >> Host: telenovelas.censuratv.net
> 09/05/04 16:03:39 DEBUG http.headers:254 - >> Connection: Keep-Alive
> 09/05/04 16:03:39 DEBUG http.headers:254 - >> User-Agent: bixo
> 09/05/04 16:13:32 DEBUG conn.DefaultClientConnection:160 - Connection closed
> 09/05/04 16:13:32 DEBUG client.DefaultRequestDirector:414 - Closing the
> connection.
> 09/05/04 16:13:32 DEBUG conn.DefaultClientConnection:160 - Connection closed
> 09/05/04 16:13:32 INFO client.DefaultRequestDirector:418 - I/O exception
> (org.apache.http.NoHttpResponseException) caught when processing request:
> The target server failed to respond
> 09/05/04 16:13:32 DEBUG client.DefaultRequestDirector:423 - The target
> server failed to respond
> 09/05/04 16:13:32 INFO client.DefaultRequestDirector:425 - Retrying request
> 09/05/04 16:13:32 DEBUG client.DefaultRequestDirector:433 - Reopening
> the direct connection.
> 09/05/04 16:14:24 DEBUG conn.DefaultClientConnection:147 - Connection shut
> down
> 09/05/04 16:14:24 DEBUG tsccm.ThreadSafeClientConnManager:223 - Released
> connection is not reusable.
> 09/05/04 16:14:24 DEBUG tsccm.ConnPoolByRoute:374 - Releasing connection
> [HttpRoute[{}->http://telenovelas.censuratv.net]][null]
> 09/05/04 16:14:24 DEBUG tsccm.ConnPoolByRoute:631 - Notifying no-one,
> there are no waiting threads
> 09/05/04 16:14:24 DEBUG http.HttpClientFetcher:267 - Exception while
> fetching url
> http://telenovelas.censuratv.net/noticias/elrostrodeanaliacanal9-argentina-elrostrodeanalia-canal9/
> java.net.UnknownHostException: telenovelas.censuratv.net
>
> So the first try failed after about 10 minutes with an I/O exception
> (org.apache.http.NoHttpResponseException), then the retry failed much
> faster (50 seconds) with a java.net.UnknownHostException.
>
> I'm guessing that maybe the real cause of the first long timeout was my
> DNS system timing out while trying to resolve the invalid server
> address, and then this "bad hostname" result was cached so that the
> retry failed faster.
>

You can plug in a custom DNS name resolver using this interface:

http://hc.apache.org/httpcomponents-client/httpclient/apidocs/org/apache/http/conn/scheme/HostNameResolver.html

> But independent of the above, I'm interested in the best way to prevent
> all cases of long timeouts, with 4.0.
>

I believe setting socket and connect timeouts to some reasonable value,
say 30 seconds, should be sufficient.

Hope this helps

Oleg

> Thanks much!
>
> -- Ken
>
> PS - I could do my own DNS resolver that maps hostnames to IP addresses,
> and wrap this with a timer to fail after 10 seconds or so.
> --
> Ken Krugler
> +1 530-210-6378
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
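
On the PS about wrapping a DNS lookup with a timer: below is a minimal
sketch of that idea using only the JDK, not HttpClient itself. The class
name TimedDnsLookup and its Lookup interface are hypothetical names made
up for this illustration. The assumption behind it is that the socket and
connect timeouts only apply once a host name has been resolved, so a slow
resolver has to be bounded separately. The lookup runs on a background
thread and the caller gives up after the timeout.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper for illustration; not part of HttpClient.
class TimedDnsLookup {

    /** Hypothetical abstraction so the (possibly slow) lookup can be swapped out. */
    interface Lookup {
        InetAddress resolve(String host) throws UnknownHostException;
    }

    // Daemon threads, so a lookup that is still stuck does not keep the JVM alive.
    private static final ExecutorService POOL = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r, "dns-lookup");
        t.setDaemon(true);
        return t;
    });

    /** Resolves via the JDK resolver, failing after timeoutMillis. */
    static InetAddress resolve(String host, long timeoutMillis) throws UnknownHostException {
        return resolve(host, timeoutMillis, InetAddress::getByName);
    }

    /** Resolves via the given lookup, failing after timeoutMillis. */
    static InetAddress resolve(String host, long timeoutMillis, Lookup lookup)
            throws UnknownHostException {
        Callable<InetAddress> task = () -> lookup.resolve(host);
        Future<InetAddress> future = POOL.submit(task);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // The caller stops waiting here. A native lookup may not react to the
            // interrupt, so the background thread can linger until it returns.
            future.cancel(true);
            throw new UnknownHostException(
                    "DNS lookup timed out after " + timeoutMillis + " ms: " + host);
        } catch (ExecutionException e) {
            Throwable cause = e.getCause();
            if (cause instanceof UnknownHostException) {
                throw (UnknownHostException) cause;
            }
            UnknownHostException wrapped = new UnknownHostException(host);
            wrapped.initCause(cause);
            throw wrapped;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            future.cancel(true);
            throw new UnknownHostException("Interrupted while resolving " + host);
        }
    }

    public static void main(String[] args) {
        // A fake lookup that stalls, standing in for an unresponsive DNS server.
        Lookup slow = host -> {
            try {
                Thread.sleep(5_000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            throw new UnknownHostException(host);
        };
        try {
            resolve("slow.invalid", 200, slow);
        } catch (UnknownHostException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

A helper like this could be called from an implementation of the
HostNameResolver interface linked above; the exact method signature
should be checked against that Javadoc. Reporting the timeout as an
UnknownHostException is a design choice for the sketch, so that callers
handle it the same way as any other failed lookup.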
