[ https://issues.apache.org/jira/browse/NUTCH-827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13466764#comment-13466764 ]

Max Dzyuba commented on NUTCH-827:
----------------------------------

Now I get the following error:

2012-10-01 14:40:54,996 ERROR httpclient.Http - Cookie-based authentication failed; cookies will not be present for this request but an attempt to retrieve them will be made for the next one.
java.lang.IllegalArgumentException: Entity enclosing requests cannot be redirected without user intervention
        at org.apache.commons.httpclient.methods.EntityEnclosingMethod.setFollowRedirects(EntityEnclosingMethod.java:225)
        at org.apache.nutch.protocol.httpclient.HttpCookieAuthentication.<init>(HttpCookieAuthentication.java:73)
        at org.apache.nutch.protocol.httpclient.Http.resolveCookieCredentials(Http.java:402)
        at org.apache.nutch.protocol.httpclient.Http.resolveCredentials(Http.java:387)
        at org.apache.nutch.protocol.httpclient.Http.getResponse(Http.java:152)
        at org.apache.nutch.protocol.http.api.RobotRulesParser.getRobotRulesSet(RobotRulesParser.java:440)
        at org.apache.nutch.protocol.http.api.RobotRulesParser.getRobotRulesSet(RobotRulesParser.java:425)
        at org.apache.nutch.protocol.http.api.HttpBase.getRobotRules(HttpBase.java:403)
        at org.apache.nutch.fetcher.Fetcher$FetcherThread.run(Fetcher.java:668)
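
For context on the trace: in Commons HttpClient 3.x, calling setFollowRedirects(true) on any entity-enclosing method (POST, PUT) throws exactly this IllegalArgumentException, so the library will not follow a redirect after a login POST on its own. Going by the trace, the constructor at HttpCookieAuthentication.java:73 makes that call. One possible workaround, sketched below with JDK classes only, is to leave redirect following off for the POST and follow the Location header by hand with a GET. The class and method names are hypothetical and not part of the attached patch, and only 301/302/303 are treated as "follow with GET" here, which is an assumption about how the login page behaves.

```java
import java.net.URI;

// Hypothetical helper for illustration; not part of the NUTCH-827 patch.
final class LoginRedirectSketch {

    /** Status codes that are commonly followed with a GET after a form POST. */
    static boolean isRedirect(int status) {
        return status == 301 || status == 302 || status == 303;
    }

    /**
     * Returns the URI to GET after the login POST, or null when the
     * response is not a redirect or carries no usable Location header.
     */
    static URI nextHop(URI postedTo, int status, String locationHeader) {
        if (!isRedirect(status) || locationHeader == null
                || locationHeader.trim().isEmpty()) {
            return null;
        }
        // Location may be relative, so resolve it against the POST URL.
        return postedTo.resolve(locationHeader.trim());
    }
}
```

With something like this in place, the POST would be executed without touching setFollowRedirects, and the caller would issue a GET to the returned URI (if any) so that cookies set along the redirect chain are picked up.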

Sorry to bug you about this...


Thanks for your time!
Max
                
> HTTP POST Authentication
> ------------------------
>
>                 Key: NUTCH-827
>                 URL: https://issues.apache.org/jira/browse/NUTCH-827
>             Project: Nutch
>          Issue Type: New Feature
>          Components: fetcher
>    Affects Versions: 1.1, nutchgora
>            Reporter: Jasper van Veghel
>            Priority: Minor
>              Labels: authentication
>             Fix For: 1.6
>
>         Attachments: nutch-http-cookies.patch
>
>
> I've created a patch against the trunk which adds very rudimentary 
> POST-based authentication support. It takes a link from nutch-site.xml 
> with a site to POST to and its respective parameters (username, password, 
> etc.). It then checks upon every request whether any cookies have been 
> initialized, and if none have, it fetches them from the given link.
> This isn't perfect but Works For Me (TM), as I generally only need to 
> retrieve results from a single domain and so have no cookie overlap (i.e. 
> if the domain cookies expire, all cookies disappear from the HttpClient and 
> I can simply re-fetch them). A natural improvement would be to be able to 
> specify one particular cookie to check the expiration date against. If 
> anyone besides me is interested in this, I'd be glad to put some more 
> effort into making this more universally applicable.
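
The check described in the issue (before every request, see whether any cookies are present, and log in again if none are) can be sketched roughly as follows. This is a minimal illustration using JDK classes only, not the code in nutch-http-cookies.patch: the class name, the Supplier standing in for the POST to the configured link, and the login counter are all assumptions made for the example.

```java
import java.net.HttpCookie;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical sketch of the "fetch cookies if none are present" check.
final class CookieGateSketch {
    private final List<HttpCookie> cookies = new ArrayList<>();
    // Stands in for the POST to the login link configured in nutch-site.xml.
    private final Supplier<List<HttpCookie>> login;
    private int logins = 0;

    CookieGateSketch(Supplier<List<HttpCookie>> login) {
        this.login = login;
    }

    /** Call before every request; logs in again only when no live cookie is left. */
    List<HttpCookie> ensureCookies() {
        // Drop expired cookies first, so an expired session looks like "no cookies".
        cookies.removeIf(HttpCookie::hasExpired);
        if (cookies.isEmpty()) {
            cookies.addAll(login.get());
            logins++;
        }
        return new ArrayList<>(cookies);
    }

    /** Number of times the login supplier has been invoked. */
    int loginCount() {
        return logins;
    }
}
```

The improvement mentioned in the issue, checking one particular cookie's expiration, would amount to replacing the isEmpty() test with a lookup of that cookie by name.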

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
