On Mon, May 04, 2009 at 04:30:15PM -0700, Ken Krugler wrote:
 Hi all,

In HttpClient 3.1, the Nutch code base configures timeouts with the following snippet:

     MultiThreadedHttpConnectionManager connectionManager =
           new MultiThreadedHttpConnectionManager();

     HttpClient client = new HttpClient(connectionManager);

     HttpConnectionManagerParams params = connectionManager.getParams();
     params.setConnectionTimeout(timeout);
     params.setSoTimeout(timeout);

     // executeMethod(HttpMethod) seems to ignore the connection timeout
     // on the connection manager.
     // Set it explicitly on the HttpClient.

     client.getParams().setConnectionManagerTimeout(timeout);

 What's the functional equivalent in 4.0? I'm assuming that:

     HttpParams params = new BasicHttpParams();
     ConnManagerParams.setTimeout(params, timeout);

 is equivalent to the 3.1 call to params.setConnectionTimeout(timeout).
 But what about the setSoTimeout() call?


HTTP parameters:

'http.socket.timeout' sets the socket timeout
'http.connection.timeout' sets the connect timeout
'http.conn-manager.timeout' sets the connection request timeout

Corresponding utility methods:

HttpParams params = new BasicHttpParams();
HttpConnectionParams.setSoTimeout(params, 2000);
HttpConnectionParams.setConnectionTimeout(params, 2000);
ConnManagerParams.setTimeout(params, 2000);
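
To illustrate what the first two settings mean at the socket level, here is a minimal sketch using plain java.net sockets rather than the HttpClient API. The class name TimeoutDemo and the method readTimesOut are hypothetical names chosen for this illustration. The connect timeout bounds how long establishing the TCP connection may take, while the socket timeout (SO_TIMEOUT) bounds how long a single blocking read may wait for data. The connection manager timeout has no plain-socket analogue, since it covers waiting for a connection to be handed out by the connection manager.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class TimeoutDemo {

    /**
     * Connects to a local server that never sends data and reports whether
     * the blocking read gave up with a SocketTimeoutException.
     * (Hypothetical helper for illustration; not part of HttpClient.)
     */
    static boolean readTimesOut(int connectTimeoutMs, int soTimeoutMs) throws IOException {
        // Port 0 asks the OS for any free port. The server never calls accept()
        // and never writes, but the OS still completes the TCP handshake.
        try (ServerSocket silentServer = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket socket = new Socket()) {

            InetSocketAddress address = new InetSocketAddress(
                    silentServer.getInetAddress(), silentServer.getLocalPort());

            // Connect timeout: upper bound on establishing the connection.
            socket.connect(address, connectTimeoutMs);

            // Socket timeout (SO_TIMEOUT): upper bound on each blocking read.
            socket.setSoTimeout(soTimeoutMs);

            InputStream in = socket.getInputStream();
            try {
                in.read();
                return false; // data or end-of-stream arrived before the timeout
            } catch (SocketTimeoutException e) {
                return true;  // no data arrived within soTimeoutMs
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("read timed out: " + readTimesOut(2000, 200));
    }
}
```

With a 200 ms socket timeout the read should give up after roughly that long, because the connection is established but no data ever arrives.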

Good stuff, thanks.

One question - why is the timeout for ConnManagerParams.setTimeout a long, while the others are ints? Is this value expected to be really big at times?

[snip]

> But independent of the above, I'm interested in the best way to prevent
> all cases of long timeouts, with 4.0.


I believe setting socket and connect timeouts to some reasonable value, say 30
seconds, should be sufficient.

OK, thanks.

I'll be doing a bigger crawl soon (> 1M pages) so I'll have more data on problems encountered.

-- Ken
--
Ken Krugler
+1 530-210-6378
