[ https://issues.apache.org/jira/browse/NUTCH-559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12532026 ]

Doğacan Güney commented on NUTCH-559:
-------------------------------------

I haven't tested it yet, but after a quick review the latest patch looks good
to me. However, it would be nice to have some unit tests for the new
functionality.

> Extending the authentication to work for more than one host was in my mind
> but I found too many possible cases. So I was planning to have a different
> configuration file where all the authentication rules can be mentioned to
> override the corresponding 'conf/nutch-site.xml' properties. The different
> possible cases are: [...]

OK, a separate configuration file sounds good. I don't like that we are
putting a file in conf/ for a plugin, but we already do that anyway. We should
probably prefix the file name with the plugin's name to make its purpose
clear, e.g. httpclient-auth.txt.
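
To make the idea concrete, here is a rough sketch of how per-host rules from
such a file could be parsed and looked up. Everything in it is an assumption
for illustration only: the one-rule-per-line layout "host scheme username
password", the class name AuthRulesSketch, and the parse/lookup helpers are
hypothetical and are not taken from the patch. The fallback argument stands in
for whatever defaults come from 'conf/nutch-site.xml'.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: not the file format or the classes used by the patch.
class AuthRulesSketch {

  /** One assumed rule: the scheme and credentials to use for a single host. */
  static final class Rule {
    final String scheme;
    final String username;
    final String password;

    Rule(String scheme, String username, String password) {
      this.scheme = scheme;
      this.username = username;
      this.password = password;
    }
  }

  /**
   * Parses lines of the assumed form "host scheme username password".
   * Blank lines and lines starting with '#' are skipped.
   */
  static Map<String, Rule> parse(List<String> lines) {
    Map<String, Rule> rules = new LinkedHashMap<>();
    for (String raw : lines) {
      String line = raw.trim();
      if (line.isEmpty() || line.startsWith("#")) {
        continue;
      }
      String[] fields = line.split("\\s+");
      if (fields.length != 4) {
        throw new IllegalArgumentException("Malformed rule: " + raw);
      }
      // Host names are case-insensitive, so normalise the key.
      rules.put(fields[0].toLowerCase(Locale.ROOT),
          new Rule(fields[1], fields[2], fields[3]));
    }
    return rules;
  }

  /** Returns the rule for the host, or the fallback if none is defined. */
  static Rule lookup(Map<String, Rule> rules, String host, Rule fallback) {
    Rule rule = rules.get(host.toLowerCase(Locale.ROOT));
    return rule != null ? rule : fallback;
  }
}
```

With a layout like this, a host that has no entry simply falls through to the
global settings, which matches the "override" behaviour described above.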

> I removed cookie related code earlier because I didn't find it to work (even
> before merging my work). However, I have brought them back in the revised
> patch. We can discuss more on this if required.

I think it should work. It doesn't remember cookies across different crawl
cycles, but it should remember them during a single fetch.

> I have restored most of the original response reading code except for
> 'calculateTryToRead'. This method is not checking for 'Content-Length'
> limit. The content-length limit check present in this patch is similar to
> that of 'protocol-http' which is simpler and correct.

OK.
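
For readers following along, the kind of limit check being discussed can be
sketched roughly as below. This is only an illustration under stated
assumptions: the class LimitedReadSketch, the readLimited helper and the
maxContent parameter are hypothetical names, and the code is not copied from
the patch or from 'protocol-http'. A negative maxContent is assumed to mean
"no limit".

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical sketch of a content-limit check while reading a response body.
class LimitedReadSketch {

  /**
   * Reads the stream until it ends or until maxContent bytes have been
   * collected, whichever comes first. A negative maxContent disables the limit.
   */
  static byte[] readLimited(InputStream in, int maxContent) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] buffer = new byte[4096];
    try {
      while (maxContent < 0 || out.size() < maxContent) {
        int want = buffer.length;
        if (maxContent >= 0) {
          // Never ask for more bytes than the remaining allowance.
          want = Math.min(want, maxContent - out.size());
        }
        int n = in.read(buffer, 0, want);
        if (n == -1) {
          break; // end of stream reached before the limit
        }
        out.write(buffer, 0, n);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return out.toByteArray();
  }

  public static void main(String[] args) {
    byte[] body = "0123456789".getBytes();
    // Limit of 4 bytes: the body is truncated. Prints 4.
    System.out.println(readLimited(new ByteArrayInputStream(body), 4).length);
    // No limit: the whole body is read. Prints 10.
    System.out.println(readLimited(new ByteArrayInputStream(body), -1).length);
  }
}
```

The point of bounding each read by the remaining allowance is that the result
can never exceed the limit, regardless of how the underlying stream chunks its
data.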



> NTLM, Basic and Digest Authentication schemes for web/proxy server
> ------------------------------------------------------------------
>
>                 Key: NUTCH-559
>                 URL: https://issues.apache.org/jira/browse/NUTCH-559
>             Project: Nutch
>          Issue Type: Improvement
>          Components: fetcher
>    Affects Versions: 1.0.0
>            Reporter: Susam Pal
>         Attachments: NUTCH-559v0.1.patch, NUTCH-559v0.2.patch
>
>
> Added basic, digest and NTLM authentication schemes to protocol-httpclient. 
> The authentication schemes can be configured for proxy server as well as web 
> servers of a domain. HTTP authentication can take place over HTTP/1.0, 
> HTTP/1.1 and HTTPS.
> The authentication guide can be found here: 
> [http://wiki.apache.org/nutch/HttpAuthenticationSchemes].

