Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change 
notification.

The following page has been changed by susam:
http://wiki.apache.org/nutch/HttpAuthenticationSchemes

The comment on the change is:
changed introduction

------------------------------------------------------------------------------
  == Introduction ==
- 'protocol-httpclient' is a protocol plugin which supports retrieving 
documents via the HTTP 1.0, HTTP 1.1 and HTTPS protocols, optionally with 
Basic, Digest and NTLM authentication schemes for web servers as well as proxy 
servers. This feature cannot do POST-based authentication that depends on 
cookies. More information on this can be found at: HttpPostAuthentication
+ This is a feature in Nutch, developed by Susam Pal, that allows the crawler 
to authenticate itself to websites requiring NTLM, Basic or Digest 
authentication. This feature cannot do POST-based authentication that depends 
on cookies. More information on this can be found at: HttpPostAuthentication
  
  == Necessity ==
  Two protocol plugins were already present, viz. 'protocol-http' and 
'protocol-httpclient'. However, 'protocol-http' did not support HTTP 1.1, 
HTTPS, or the NTLM, Basic and Digest authentication schemes. 
'protocol-httpclient' supported HTTPS and had code for NTLM authentication, 
but the NTLM authentication did not work due to a bug. Some portions of 
'protocol-httpclient' were rewritten to fix these problems, to add 
authentication support for proxy servers, and to provide better inline 
documentation for the properties used to configure authentication.
