Re: http keep alive

2009-10-14 Thread Andrzej Bialecki

Marko Bauhardt wrote:

Hi,
Is there a way to use HTTP keep-alive with Nutch?
Does protocol-http or protocol-httpclient support keep-alive?

I can't find any use of HTTP keep-alive in the code or in the 
configuration files.


protocol-httpclient can support keep-alive. However, I think that it 
won't help you much. Consider that the Fetcher needs to wait some 
time between requests to the same site, and in the meantime it will 
issue requests to other sites. This means that if you use keep-alive 
connections, the number of open connections will climb quickly, 
depending on the number of unique sites on your fetchlist, until you 
run out of available sockets. On the other hand, if the number of 
unique sites is small, then most of the time the Fetcher will be 
waiting anyway, so the benefit from keep-alive (for you as a client) 
will be small, though there will still be some benefit for the server 
side.
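
A rough way to see the effect described above is to simulate it. The sketch below is not Nutch code: the function name, the round-robin scheduling, and all parameter values are assumptions made for illustration. It counts how many keep-alive connections are open at once when a fetcher rotates over many hosts while honoring a per-host delay.

```python
def peak_open_connections(num_hosts, crawl_delay_s, keepalive_timeout_s,
                          requests_per_s, duration_s):
    """Peak number of simultaneously open connections in a toy fetcher model.

    Assumptions (illustrative only, not Nutch behavior):
    - time advances in 1-second steps;
    - the fetcher visits hosts round-robin and skips any host that is
      still inside its per-host delay;
    - a connection stays open until it has been idle for
      keepalive_timeout_s seconds.
    """
    last_request = {}  # host index -> time of the most recent request
    next_host = 0
    peak = 0
    for t in range(duration_s):
        issued = 0
        checked = 0
        while issued < requests_per_s and checked < num_hosts:
            host = next_host
            next_host = (next_host + 1) % num_hosts
            checked += 1
            ready = (host not in last_request
                     or t - last_request[host] >= crawl_delay_s)
            if ready:
                last_request[host] = t
                issued += 1
        # Count connections that have not yet hit the idle timeout.
        open_now = sum(1 for seen in last_request.values()
                       if t - seen < keepalive_timeout_s)
        peak = max(peak, open_now)
    return peak
```

Under these assumptions, 1000 hosts at 100 requests per second with a 15-second idle timeout end up holding a connection to every host (peak 1000), while a 1-second timeout holds only about 100. With 5 hosts the peak is 5, and the fetcher spends most of each interval waiting for the per-host delay.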




--
Best regards,
Andrzej Bialecki 
 ___. ___ ___ ___ _ _   __
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com



RE: http keep alive

2009-10-14 Thread Fuad Efendi
I'd like to add:

Keep-Alive is not polite. It ties up a dedicated listener on the server 
side. Establishing a TCP connection requires a handshake, which takes 
time; that's why Keep-Alive exists for web servers: to improve the 
performance of subsequent requests. However, it allocates a dedicated 
listener to a specific remote client (IP address / port)...

What will happen with the classic setting of 150 processes in HTTPD 1.3 if 
150 robots try to use the Keep-Alive feature?
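
To put rough numbers on that question, here is a back-of-the-envelope sketch. It assumes a process-per-connection server in which each keep-alive connection occupies one worker until the idle timeout expires; the function names and the timing values are hypothetical, chosen only for illustration.

```python
def free_workers(worker_processes, idle_keepalive_clients):
    """Workers left for other visitors while idle keep-alive clients hold slots."""
    return max(0, worker_processes - idle_keepalive_clients)


def max_new_clients_per_second(worker_processes, service_time_s,
                               keepalive_timeout_s):
    """Sustained rate of new clients the server can accept.

    Assumes the worst case for a polite crawler: each client sends one
    request and then leaves the connection idle until the server's
    keep-alive timeout closes it.
    """
    hold_time_s = service_time_s + keepalive_timeout_s
    return worker_processes / hold_time_s
```

With 150 worker processes and 150 robots each holding an idle connection, `free_workers(150, 150)` is 0, so other visitors have to wait. Assuming a 0.1-second service time, the same 150 workers could accept about 1500 new clients per second without keep-alive, but only about 10 per second if each connection then sits idle for an assumed 15-second timeout.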

==
http://www.linkedin.com/in/liferay


 