Doug Cutting wrote:
>... protocol-http is capable of faster crawling than protocol-httpclient.
> So I don't think we should discard protocol-http just yet. 

>What do others think?

I think:

The HttpClient-based [protocol-httpclient] uses its own threads.
[protocol-http] does not create any threads.

We should manage this. [protocol-httpclient] is just a temporary solution
for cookies, proxies, HTTPS, etc. It also still caches DNS-to-IP mappings
forever, and thread-related issues are very important.
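
On the DNS point, a minimal sketch, assuming the caching in question comes
from the JVM's own InetAddress cache (the plugin may well do its own caching
on top of that). The class name DnsCacheTtl and the 300-second value are
only illustrative; "networkaddress.cache.ttl" is the standard Java security
property for the lifetime of successful lookups.

```java
import java.security.Security;

// Hypothetical helper, not part of Nutch.
class DnsCacheTtl {

    // Caps how long the JVM keeps successful DNS lookups, in seconds.
    // As far as I know this has to run before the first lookup,
    // because the JVM reads the cache policy only once.
    static void limitDnsCache(int seconds) {
        Security.setProperty("networkaddress.cache.ttl", Integer.toString(seconds));
    }

    public static void main(String[] args) {
        limitDnsCache(300); // illustrative value: 5 minutes
        System.out.println(Security.getProperty("networkaddress.cache.ttl"));
    }
}
```

A long crawl would then pick up changed IP addresses after at most that many
seconds instead of keeping the first answer for the life of the process.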

Additionally, we should have a setting such as:
"Wait 5 seconds between requests to SLOW servers"

This means that Nutch could dynamically classify servers as fast or slow
and crawl them faster or slower accordingly.
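
To make the idea concrete, here is a rough sketch, not existing Nutch code.
The class AdaptiveDelay, the 2-second "slow" threshold, the 1-second delay
for fast servers and the smoothing factor are all my assumptions; only the
5-second wait comes from the setting above. It keeps a moving average of
response times per host and picks the delay from that.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: choose the per-host wait from observed response times.
class AdaptiveDelay {
    static final long SLOW_THRESHOLD_MS = 2000; // assumed cut-off for "slow"
    static final long SLOW_DELAY_MS = 5000;     // "wait 5 seconds" for slow servers
    static final long FAST_DELAY_MS = 1000;     // assumed delay for fast servers
    static final double ALPHA = 0.5;            // weight of the newest sample

    // Exponential moving average of response time per host, in milliseconds.
    private final Map<String, Double> avgResponseMs = new ConcurrentHashMap<>();

    // Call after each fetch with the measured response time.
    void record(String host, long responseMs) {
        avgResponseMs.merge(host, (double) responseMs,
                (old, cur) -> (1 - ALPHA) * old + ALPHA * cur);
    }

    // Delay to apply before the next request to this host.
    long delayFor(String host) {
        Double avg = avgResponseMs.get(host);
        if (avg == null) {
            return FAST_DELAY_MS; // nothing measured yet: treat as fast
        }
        return avg >= SLOW_THRESHOLD_MS ? SLOW_DELAY_MS : FAST_DELAY_MS;
    }
}
```

A fetcher thread would call record() after every response and sleep for
delayFor(host) before its next request to the same host. Because the average
moves with each sample, a server that recovers is crawled faster again
without any manual change.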

Fuad
