Dear Wiki user, You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change notification.
The following page has been changed by susam:
http://wiki.apache.org/nutch/protocol-http11

The comment on the change is:
why the name is protocol-http11

------------------------------------------------------------------------------
Susam Pal, Infosys Technologies Limited

== Necessity ==
Two plugins were already present: 'protocol-http' and 'protocol-httpclient'. However, 'protocol-http' did not support HTTP 1.1, HTTPS, or the NTLM, Basic and Digest authentication schemes. 'protocol-httpclient' supported HTTPS and had code for NTLM authentication, but NTLM authentication did not work because of a bug. 'protocol-http11' was written to solve these problems and to provide additional features, such as authentication support for proxy servers and better inline documentation for the properties used in 'nutch-site.xml' to enable 'protocol-http11' and its authentication features. It is an improvement on the previous two plugins. The author tested the plugin at Infosys Technologies Limited by crawling the corporate intranet, which requires NTLM authentication, and found it to work well.
The name 'protocol-http11' was chosen because 'HTTP 1.1' is a valid protocol name.

== Download ==
Currently, this plugin is available as a patch in JIRA. Download the patch from [https://issues.apache.org/jira/browse/NUTCH-557 JIRA NUTCH-557] and apply it to trunk.
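
Once the patch is applied, the plugin still has to be enabled in 'nutch-site.xml'. The fragment below is only a minimal sketch of how that might look, assuming the plugin is selected through Nutch's standard 'plugin.includes' property like other protocol plugins. The other plugin names in the value are illustrative, not taken from the patch, and the authentication properties are left out because their names are documented in the patch itself.

```xml
<?xml version="1.0"?>
<!-- nutch-site.xml: illustrative sketch, not copied from the NUTCH-557 patch -->
<configuration>
  <property>
    <name>plugin.includes</name>
    <!-- Assumption: 'protocol-http11' replaces 'protocol-http' in the plugin list.
         The remaining entries are example plugins; keep whatever your setup already uses. -->
    <value>protocol-http11|urlfilter-regex|parse-(text|html)|index-basic|query-(basic|site|url)</value>
    <description>Regular expression naming the plugin directories to include.</description>
  </property>
</configuration>
```

Only one HTTP protocol plugin would normally appear in this expression, so 'protocol-http' and 'protocol-httpclient' are omitted here.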