Doug Cutting wrote:
> ... protocol-http is capable of faster crawling than protocol-httpclient.
> So I don't think we should discard protocol-http just yet.
> What do others think?

I think:

The HttpClient-based plugin [protocol-httpclient] uses its own threads; [protocol-http] does not create threads. We should manage this. [protocol-httpclient] is just a temporary solution for cookies, proxies, HTTPS, etc.; [protocol-httpclient] still caches DNS-to-IP mappings forever. Thread-related issues are very important...

Additionally, we should have a setting such as "Wait 5 seconds between requests to SLOW servers". That is, Nutch could dynamically classify servers as fast or slow and crawl them faster or slower accordingly...

Fuad

_______________________________________________
Nutch-developers mailing list
Nutch-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nutch-developers
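
To make the slow-server idea concrete, here is a rough sketch of how a per-host delay could be chosen from observed response times. Everything in it is an assumption for illustration: the class name AdaptiveDelay, the threshold, the 0.5 smoothing weight, and treating unknown hosts as fast are hypothetical and do not correspond to existing Nutch code or configuration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical helper (not part of Nutch): picks a per-host delay from observed response times. */
class AdaptiveDelay {
    private final long fastDelayMs;      // delay used for hosts classified as fast
    private final long slowDelayMs;      // delay used for hosts classified as slow, e.g. 5000
    private final long slowThresholdMs;  // average response time above which a host counts as slow

    // Smoothed response time per host, in milliseconds. Thread-safe for concurrent fetchers.
    private final Map<String, Double> avgResponseMs = new ConcurrentHashMap<>();

    AdaptiveDelay(long fastDelayMs, long slowDelayMs, long slowThresholdMs) {
        this.fastDelayMs = fastDelayMs;
        this.slowDelayMs = slowDelayMs;
        this.slowThresholdMs = slowThresholdMs;
    }

    /** Fold a new measurement into the host's moving average (equal weight on old average and new sample). */
    void record(String host, long responseMs) {
        avgResponseMs.merge(host, (double) responseMs,
                (oldAvg, sample) -> 0.5 * oldAvg + 0.5 * sample);
    }

    /** Delay to wait before the next request to this host. Unknown hosts are assumed fast. */
    long delayFor(String host) {
        Double avg = avgResponseMs.get(host);
        if (avg == null) {
            return fastDelayMs;
        }
        return avg > slowThresholdMs ? slowDelayMs : fastDelayMs;
    }

    public static void main(String[] args) {
        AdaptiveDelay delays = new AdaptiveDelay(1000, 5000, 2000);
        delays.record("slow.example.com", 4000);
        delays.record("fast.example.com", 300);
        System.out.println("slow.example.com -> " + delays.delayFor("slow.example.com") + " ms");
        System.out.println("fast.example.com -> " + delays.delayFor("fast.example.com") + " ms");
    }
}
```

A fetcher thread would call record() after each response and delayFor() before the next request to the same host. Because the average is smoothed, a host that recovers drifts back to the fast delay after a few quick responses instead of staying penalized.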