[
https://issues.apache.org/jira/browse/NUTCH-557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12529522
]
Andrzej Bialecki commented on NUTCH-557:
-----------------------------------------
I agree with Dogacan: I don't see why this plugin shouldn't be turned into a
patch for protocol-httpclient that simply adds the options you added to your
plugin. Other than these options, the two plugins are identical.
Regarding the benefits of using HTTP/1.1: the main difference, from the Nutch
point of view, would be support for keep-alives, i.e. the ability to send
multiple requests over the same TCP connection. In practice, however, this
is rarely useful in our case, because it requires making many consecutive
requests to the same host, whereas Nutch shuffles the hosts in order to
provide higher throughput while still honoring the politeness settings.
With a large fetchlist containing many hosts, consecutive requests almost
never go to the same host. So in order to benefit from keep-alives we would
either have to keep massive numbers of connections open (infeasible), or
drop connections between requests ... which is what HTTP/1.0 does :)
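
To put rough numbers on the keep-alive argument, here is a small sketch in
Python. It is not Nutch code: the function name, the uniform random shuffle
and the LRU pool of open connections are all assumptions made for
illustration, and it ignores politeness delays and server-side keep-alive
timeouts. It only estimates how often a request could reuse a connection
that is still open.

```python
import random
from collections import OrderedDict


def keepalive_hit_rate(num_hosts, urls_per_host, pool_size, seed=0):
    """Return the fraction of requests that find an open connection.

    Hypothetical model: the fetchlist is a uniform shuffle of all URLs,
    and at most `pool_size` connections are kept open, evicting the
    least recently used host first.
    """
    rng = random.Random(seed)
    # One entry per URL; only the host matters for connection reuse.
    fetchlist = [host for host in range(num_hosts) for _ in range(urls_per_host)]
    rng.shuffle(fetchlist)

    pool = OrderedDict()  # hosts with an open connection, oldest first
    hits = 0
    for host in fetchlist:
        if host in pool:
            hits += 1
            pool.move_to_end(host)  # mark as most recently used
        else:
            pool[host] = True  # open a new connection
            if len(pool) > pool_size:
                pool.popitem(last=False)  # close the least recently used one
    return hits / len(fetchlist)


if __name__ == "__main__":
    for size in (1, 10, 100, 1000):
        rate = keepalive_hit_rate(num_hosts=1000, urls_per_host=10, pool_size=size)
        print(f"pool_size={size:5d}  reuse rate={rate:.3f}")
```

Under these assumptions the reuse rate stays close to zero until the pool is
roughly as large as the number of hosts in the fetchlist, which is the
"massive numbers of open connections" case.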
> protocol-http11 for HTTP 1.1, HTTPS, NTLM, Basic and Digest Authentication
> --------------------------------------------------------------------------
>
> Key: NUTCH-557
> URL: https://issues.apache.org/jira/browse/NUTCH-557
> Project: Nutch
> Issue Type: Improvement
> Components: fetcher
> Affects Versions: 1.0.0
> Reporter: Susam Pal
> Priority: Minor
> Attachments: protocol-http11v0.1.patch
>
>
> 'protocol-http11' is a protocol plugin which supports retrieving documents
> via the HTTP 1.0, HTTP 1.1 and HTTPS protocols, optionally with Basic, Digest
> and NTLM authentication schemes for web servers as well as proxy servers.
> The user guide and other information can be found here:
> [http://wiki.apache.org/nutch/protocol-http11]