[ https://issues.apache.org/jira/browse/NUTCH-559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12532026 ]
Doğacan Güney commented on NUTCH-559:
-------------------------------------
I haven't tested it yet, but after a quick review the latest patch looks good
to me. However, it would be nice to have some unit tests for the new
functionality.
> Extending the authentication to work for more than one host was in my mind
> but I found too many possible cases. So I was
> planning to have a different configuration file where all the authentication
> rules can be mentioned to override the corresponding
> 'conf/nutch-site.xml' properties. The different possible cases are: [...]
OK, a different configuration file sounds good. (I don't like that we are
putting a file in conf/ for a plugin, but we already do that anyway. We should
probably prefix the file name with the plugin's name to make that clear, e.g.
httpclient-auth.txt.)
> I removed cookie related code earlier because I didn't find it to work (even
> before merging my work). However, I have brought
> them back in the revised patch. We can discuss more on this if required.
I think it should work. It doesn't remember cookies across different crawl
cycles but it should remember them during a single fetch.
> I have restored most of the original response reading code except for
> 'calculateTryToRead'. This method is not checking for
> 'Content-Length' limit. The content-length limit check present in this patch
> is similar to that of 'protocol-http' which is simpler
> and correct.
OK.
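For reference, the kind of content-length limit check discussed here could be
sketched roughly as below. This is only an illustration, not the code from the
patch or from 'protocol-http': the class name, the method name and the
"negative limit means no truncation" convention are assumptions made for the
example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper for illustration; not taken from the patch.
public class ContentLimitSketch {

    /**
     * Reads the response body from the stream, stopping once maxContent
     * bytes have been read. A negative maxContent is treated as
     * "no limit" (an assumption of this sketch).
     */
    public static byte[] readLimited(InputStream in, int maxContent)
            throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        int total = 0;
        while (true) {
            int want = buffer.length;
            if (maxContent >= 0) {
                int remaining = maxContent - total;
                if (remaining <= 0) {
                    break; // limit reached, truncate here
                }
                want = Math.min(want, remaining);
            }
            int n = in.read(buffer, 0, want);
            if (n == -1) {
                break; // end of stream
            }
            out.write(buffer, 0, n);
            total += n;
        }
        return out.toByteArray();
    }
}
```

The point of the sketch is that the limit is applied while reading, so a body
longer than the limit is truncated instead of being read in full.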
> NTLM, Basic and Digest Authentication schemes for web/proxy server
> ------------------------------------------------------------------
>
> Key: NUTCH-559
> URL: https://issues.apache.org/jira/browse/NUTCH-559
> Project: Nutch
> Issue Type: Improvement
> Components: fetcher
> Affects Versions: 1.0.0
> Reporter: Susam Pal
> Attachments: NUTCH-559v0.1.patch, NUTCH-559v0.2.patch
>
>
> Added basic, digest and NTLM authentication schemes to protocol-httpclient.
> The authentication schemes can be configured for proxy server as well as web
> servers of a domain. HTTP authentication can take place over HTTP/1.0,
> HTTP/1.1 and HTTPS.
> The authentication guide can be found here:
> [http://wiki.apache.org/nutch/HttpAuthenticationSchemes].