[
https://issues.apache.org/jira/browse/NUTCH-827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13466741#comment-13466741
]
Jasper van Veghel edited comment on NUTCH-827 at 10/1/12 10:39 PM:
-------------------------------------------------------------------
Looks like a pretty sloppy mistake in the patch .. ;-)
<pre>
+ if (code == 200 && Http.LOG.isTraceEnabled()) {
+   Http.LOG.trace("url: " + url +
+       "; status code: " + code +
+       "; cookies received: " +
+       Http.getClient().getState().getCookies().length);
+ } else {
+   Http.LOG.error("Unable to retrieve login page; code = " + code);
+ }
</pre>
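The else branch also fires on a 200 whenever trace logging is disabled, so a
successful login gets reported as an error. Change that to something like ..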
<pre>
+ if (code == 200 && Http.LOG.isTraceEnabled()) {
+   Http.LOG.trace("url: " + url +
+       "; status code: " + code +
+       "; cookies received: " +
+       Http.getClient().getState().getCookies().length);
+ } else if (code != 200) {
+   Http.LOG.error("Unable to retrieve login page; code = " + code);
+ }
</pre>
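To make the difference concrete, here is a small standalone sketch of the two branch structures. It doesn't touch Nutch or HttpClient at all; the class name LoginLogBranches, the method names and the returned strings are made up for illustration, and traceEnabled stands in for Http.LOG.isTraceEnabled().

```java
public class LoginLogBranches {

    // Branching as in the patch: anything that is not "200 with trace enabled"
    // falls through to the error branch, including a 200 with trace disabled.
    public static String original(int code, boolean traceEnabled) {
        if (code == 200 && traceEnabled) {
            return "trace";
        } else {
            return "error";
        }
    }

    // Suggested branching: the error is only reported for non-200 codes.
    public static String suggested(int code, boolean traceEnabled) {
        if (code == 200 && traceEnabled) {
            return "trace";
        } else if (code != 200) {
            return "error";
        }
        return "none";
    }

    public static void main(String[] args) {
        int[] codes = {200, 403};
        boolean[] traceSettings = {true, false};
        for (int code : codes) {
            for (boolean trace : traceSettings) {
                System.out.println("code=" + code + " trace=" + trace
                        + " original=" + original(code, trace)
                        + " suggested=" + suggested(code, trace));
            }
        }
    }
}
```

The only combination where the two disagree is a 200 with trace logging off, which is presumably the normal production setting.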
And also change this ..
<pre>
+ LOG.error("Cookie-based authentication failed; cookies will not be present for this request but an attempt to retrieve them will be made for the next one.");
</pre>
To something like this ..
<pre>
+ LOG.error("Cookie-based authentication failed; cookies will not be present for this request but an attempt to retrieve them will be made for the next one.", e);
</pre>
That way the stack trace shows where the Exception is coming from. All it does
after that LOG.error() is release the connection, so it shouldn't be throwing
an Exception.
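For what passing the exception buys you: most logging APIs have an overload that takes a Throwable and keeps it (and its stack trace) with the message. Below is a minimal sketch using java.util.logging rather than the logger the patch uses, so the call shape differs from LOG.error(msg, e). The class CookieAuthLogDemo, the CapturingHandler and the simulated IOException are all hypothetical.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

public class CookieAuthLogDemo {

    // Hypothetical handler that keeps records in memory so they can be inspected.
    static class CapturingHandler extends Handler {
        final List<LogRecord> records = new ArrayList<>();
        @Override public void publish(LogRecord record) { records.add(record); }
        @Override public void flush() { }
        @Override public void close() { }
    }

    // Logs a simulated failure once without and once with the exception attached,
    // and returns the captured records in that order.
    public static List<LogRecord> logBothWays() {
        Logger log = Logger.getLogger("CookieAuthLogDemo");
        log.setUseParentHandlers(false); // keep the demo quiet on the console
        CapturingHandler handler = new CapturingHandler();
        log.addHandler(handler);
        try {
            throw new IOException("simulated login failure");
        } catch (IOException e) {
            log.log(Level.SEVERE, "Cookie-based authentication failed");    // cause is lost
            log.log(Level.SEVERE, "Cookie-based authentication failed", e); // cause is kept
        } finally {
            log.removeHandler(handler);
        }
        return handler.records;
    }

    public static void main(String[] args) {
        for (LogRecord r : logBothWays()) {
            System.out.println(r.getMessage() + " -> thrown=" + r.getThrown());
        }
    }
}
```

Only the second record carries the Throwable, which is what a formatter needs in order to print the stack trace.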
> HTTP POST Authentication
> ------------------------
>
> Key: NUTCH-827
> URL: https://issues.apache.org/jira/browse/NUTCH-827
> Project: Nutch
> Issue Type: New Feature
> Components: fetcher
> Affects Versions: 1.1, nutchgora
> Reporter: Jasper van Veghel
> Priority: Minor
> Labels: authentication
> Fix For: 1.6
>
> Attachments: nutch-http-cookies.patch
>
>
> I've created a patch against the trunk which adds very rudimentary
> POST-based authentication support. It takes a link from
> nutch-site.xml with a site to POST to and its respective parameters
> (username, password, etc.). It then checks upon every request whether any
> cookies have been initialized, and if none have, it fetches them from the
> given link.
> This isn't perfect but Works For Me (TM) as I generally only need to retrieve
> results from a single domain and so have no cookie overlap (i.e. if the
> domain cookies expire, all cookies disappear from the HttpClient and I can
> simply re-fetch them). A natural improvement would be to be able to specify
> one particular cookie to check the expiration-date against. If anyone is
> interested in this besides me, I'd be glad to put some more effort into making
> this more universally applicable.