Hi,
I would like to write a high-performance web crawler using HttpClient 4.1.2.
To push the machine to its highest throughput, each crawling thread
creates a DefaultHttpClient with a pool configured as follows (based on one
of the examples):
static
{
    cm = new ThreadSafeClientConnManager();
    cm.setMaxTotal( 50000 );
    cm.setDefaultMaxPerRoute( Integer.MAX_VALUE );
    HttpClient client = new DefaultHttpClient();
    params = client.getParams();
    HttpClientParams.setRedirecting( params, false );
    HttpClientParams.setAuthenticating( params, true );
    HttpConnectionParams.setSoTimeout( params, 30000 );
    HttpConnectionParams.setConnectionTimeout( params, 30000 );
    IdleConnectionEvictor connEvictor = new IdleConnectionEvictor( cm );
    connEvictor.start();
}
When running the application with lots of crawling threads, netstat shows
only about 2k TCP connections in the ESTABLISHED state. Is this expected,
considering maxTotal = 50000? Are there other bottlenecks (OS level, etc.)
preventing the application from reaching more than 2k TCP connections?
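For reference, this is how I understand the two pool limits, written as a
toy model in plain Java. It is only a sketch of my reading of setMaxTotal and
setDefaultMaxPerRoute, not HttpClient's actual implementation: the class
PoolLimitModel and its methods are hypothetical names, and leaseCeiling
assumes each crawling thread holds at most one leased connection at a time.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Semaphore;

// Hypothetical toy model of a pool with a total cap and a per-route cap.
// It only counts leases; it opens no sockets and is not part of HttpClient.
class PoolLimitModel {
    private final Semaphore total;
    private final int maxPerRoute;
    private final Map<String, Semaphore> perRoute =
            new ConcurrentHashMap<String, Semaphore>();

    PoolLimitModel(int maxTotal, int maxPerRoute) {
        this.total = new Semaphore(maxTotal);
        this.maxPerRoute = maxPerRoute;
    }

    // Try to lease one connection for the route without blocking.
    // Returns false if the per-route cap or the total cap is already reached.
    boolean tryLease(String route) {
        Semaphore routeSlots =
                perRoute.computeIfAbsent(route, r -> new Semaphore(maxPerRoute));
        if (!routeSlots.tryAcquire()) {
            return false;               // per-route cap reached
        }
        if (!total.tryAcquire()) {
            routeSlots.release();       // undo the route slot, total cap reached
            return false;
        }
        return true;
    }

    // Return a previously leased connection for the route.
    void release(String route) {
        perRoute.get(route).release();
        total.release();
    }

    // Assumption: one leased connection per thread at most, so the number of
    // simultaneously leased connections cannot exceed either value.
    static int leaseCeiling(int maxTotal, int threads) {
        return Math.min(maxTotal, threads);
    }
}
```

Under that assumption, a pool with maxTotal = 50000 driven by, say, 2000
threads (a made-up number) could never have more than 2000 connections
leased at once, regardless of the pool settings. Idle kept-alive connections
sitting in the pool are not covered by this model.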
Thanks.
--
View this message in context:
http://old.nabble.com/Understanding-how-ThreadSafeClientConnManager-parameters-affect-number-of-tcp-connections-tp33190497p33190497.html
Sent from the HttpClient-User mailing list archive at Nabble.com.