Hi Clarence,
The properties entered in nutch-site.xml do not seem to be used
by HttpClient. Please apply the patch below to
nutch/src/plugin/protocol-httpclient/src/java/org/apache/nutch/protocol/httpclient/Http.java
and that should help.
- Ravi Chintakunta
@@ -31,6 +31,7 @@
import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.MultiThreadedHttpConnectionManager;
import org.apache.commons.httpclient.NTCredentials;
+import org.apache.commons.httpclient.UsernamePasswordCredentials;
import org.apache.commons.httpclient.auth.AuthScope;
import org.apache.commons.httpclient.params.HttpConnectionManagerParams;
import org.apache.commons.httpclient.protocol.Protocol;
@@ -65,6 +66,8 @@
String ntlmPassword = "";
String ntlmDomain = "";
String ntlmHost = "";
+ String basicUsername = "";
+ String basicPassword = "";
public Http() {
super(LOG);
@@ -77,6 +80,8 @@
this.ntlmPassword = conf.get("http.auth.ntlm.password", "");
this.ntlmDomain = conf.get("http.auth.ntlm.domain", "");
this.ntlmHost = conf.get("http.auth.ntlm.host", "");
+ this.basicUsername = conf.get("http.auth.basic.username", "");
+ this.basicPassword = conf.get("http.auth.basic.password", "");
//Level logLevel = Level.WARNING;
//if (conf.getBoolean("http.verbose", false)) {
// logLevel = Level.FINE;
@@ -131,6 +136,7 @@
if (useProxy) {
hostConf.setProxy(proxyHost, proxyPort);
}
+ /*
if (ntlmUsername.length() > 0) {
Credentials ntCreds = new NTCredentials(ntlmUsername,
ntlmPassword, ntlmHost, ntlmDomain);
client.getState().setCredentials(new AuthScope(ntlmHost,
AuthScope.ANY_PORT), ntCreds);
@@ -139,6 +145,11 @@
LOG.info("Added NTLM credentials for " + ntlmUsername);
}
}
+ */
+
+ client.getParams().setAuthenticationPreemptive(true);
+ if (LOG.isInfoEnabled()) { LOG.info("**** setting basic auth credentials ****"); }
+ client.getState().setCredentials(new AuthScope("linuxlink.timesys.com", AuthScope.ANY_PORT, AuthScope.ANY_REALM), new UsernamePasswordCredentials(basicUsername, basicPassword));
if (LOG.isInfoEnabled()) { LOG.info("Configured Client"); }
}
}
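
For reference, with setAuthenticationPreemptive(true) the client sends the Basic credentials on the first request instead of waiting for a 401 challenge. The sketch below only illustrates what that Authorization header looks like. It is not code from HttpClient or Nutch: the class and method names are hypothetical, and it assumes Java 8 or later for java.util.Base64.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical illustration class, not part of Nutch or Commons HttpClient.
public class BasicAuthSketch {

    // Builds the value of the Authorization header for HTTP Basic auth:
    // the literal "Basic " followed by base64("username:password").
    static String basicAuthHeader(String username, String password) {
        String pair = username + ":" + password;
        byte[] bytes = pair.getBytes(StandardCharsets.UTF_8);
        return "Basic " + Base64.getEncoder().encodeToString(bytes);
    }

    public static void main(String[] args) {
        // Example values taken from the quoted message below.
        System.out.println("Authorization: " + basicAuthHeader("spider", "pissword"));
    }
}
```

With the patch applied, the credentials are read from the http.auth.basic.username and http.auth.basic.password properties, so those are the names to set in nutch-site.xml. The AuthScope host in the patch is hardcoded to linuxlink.timesys.com, so change it to your own server.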
On 8/6/07, Clarence Donath <[EMAIL PROTECTED]> wrote:
> Is HTTP Basic authentication working at all?
>
> I've been working with v0.9 for two days now, and I have yet to get this
> working.
>
> I have one test directory with an .htaccess file requiring a
> username:password just for the fetcher.
>
> I can access this directory with a browser using that username:password.
>
> In nutch-site.xml I have replaced 'protocol-http' with
> 'protocol-httpclient' in the 'plugin.includes' property.
>
> and the following...
>
> <property>
> <name>http.auth.basic.IT.user</name>
> <value>spider</value>
> <description>HTTP Basic Authentication</description>
> </property>
>
> <property>
> <name>http.auth.basic.IT.pass</name>
> <value>pissword</value>
> <description>HTTP Basic Authentication</description>
> </property>
>
> 'IT' is the realm (AuthName "IT").
>
> I've tried defining these properties as 'http.auth.basic.IT.user',
> 'http.auth.basic..user', and 'http.auth.basic.user', as I've found
> in several others' examples in the Nutch Wiki.
>
> I see this in hadoop.log...
>
> 2007-08-06 16:12:45,856 INFO httpclient.HttpMethodDirector - No
> credentials available for BASIC 'IT'@spock.abaqus.com:80
>
> I see the fetcher hitting the server, but it never tries the 'spider'
> user to authenticate...
>
> 172.17.25.27 - - [06/Aug/2007:16:12:45 -0400] "GET /development
> HTTP/1.0" 401 1287 "-" "ABAQUS/Nutch-0.9 (moin; http://spock;
> [EMAIL PROTECTED])"
>
>
> Please tell me whether I should expect the basic authentication
> mechanism to work at all. I've already spent so much time trying to
> figure this out.
>
> Regards,
> Clarence Donath
>
>
> Spelling is a lossed art.
>