Hi Clarence,

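The properties entered in nutch-site.xml do not seem to be used
in HttpClient. Please apply the patch below to
nutch/src/plugin/protocol-httpclient/src/java/org/apache/nutch/protocol/httpclient/Http.java
and that should help. Note that the patch reads the properties
http.auth.basic.username and http.auth.basic.password, and that the
host in the AuthScope is hard-coded, so adjust both for your setup.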

- Ravi Chintakunta



@@ -31,6 +31,7 @@
 import org.apache.commons.httpclient.HttpClient;
 import org.apache.commons.httpclient.MultiThreadedHttpConnectionManager;
 import org.apache.commons.httpclient.NTCredentials;
+import org.apache.commons.httpclient.UsernamePasswordCredentials;
 import org.apache.commons.httpclient.auth.AuthScope;
 import org.apache.commons.httpclient.params.HttpConnectionManagerParams;
 import org.apache.commons.httpclient.protocol.Protocol;
@@ -65,6 +66,8 @@
   String ntlmPassword = "";
   String ntlmDomain = "";
   String ntlmHost = "";
+  String basicUsername = "";
+  String basicPassword = "";

   public Http() {
     super(LOG);
@@ -77,6 +80,8 @@
     this.ntlmPassword = conf.get("http.auth.ntlm.password", "");
     this.ntlmDomain = conf.get("http.auth.ntlm.domain", "");
     this.ntlmHost = conf.get("http.auth.ntlm.host", "");
+    this.basicUsername = conf.get("http.auth.basic.username", "");
+    this.basicPassword = conf.get("http.auth.basic.password", "");
     //Level logLevel = Level.WARNING;
     //if (conf.getBoolean("http.verbose", false)) {
     //  logLevel = Level.FINE;
@@ -131,6 +136,7 @@
     if (useProxy) {
       hostConf.setProxy(proxyHost, proxyPort);
     }
+    /*
     if (ntlmUsername.length() > 0) {
       Credentials ntCreds = new NTCredentials(ntlmUsername, ntlmPassword, ntlmHost, ntlmDomain);
       client.getState().setCredentials(new AuthScope(ntlmHost, AuthScope.ANY_PORT), ntCreds);
@@ -139,6 +145,11 @@
         LOG.info("Added NTLM credentials for " + ntlmUsername);
       }
     }
+    */
+
+    client.getParams().setAuthenticationPreemptive(true);
+    if (LOG.isInfoEnabled()) { LOG.info("**** setting basic auth credentials ****"); }
+    client.getState().setCredentials(new AuthScope("linuxlink.timesys.com", AuthScope.ANY_PORT, AuthScope.ANY_REALM), new UsernamePasswordCredentials(basicUsername, basicPassword));
     if (LOG.isInfoEnabled()) { LOG.info("Configured Client"); }
   }
 }
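
For reference, here is a minimal standalone sketch of what preemptive
Basic authentication amounts to on the wire: the client sends an
Authorization header with the first request instead of waiting for a
401. It uses only the JDK, not commons-httpclient; the class and method
names (BasicAuthHeader, basicAuthHeader) are made up for illustration
and are not part of Nutch or the patch above.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class BasicAuthHeader {

    // Hypothetical helper: builds the value of the HTTP "Authorization"
    // header for Basic auth, i.e. "Basic " + base64(username ":" password).
    static String basicAuthHeader(String username, String password) {
        if (username == null || password == null) {
            // An unset property would otherwise silently become the text "null".
            throw new IllegalArgumentException("username and password must be set");
        }
        String pair = username + ":" + password;
        return "Basic " + Base64.getEncoder()
                .encodeToString(pair.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        // Well-known example credentials from the HTTP Basic auth RFC.
        System.out.println(basicAuthHeader("Aladdin", "open sesame"));
        // prints: Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==
    }
}
```

If the server still answers 401 after the patch is applied, comparing the
header the fetcher actually sends with the output of a helper like this
is one way to check that the configured username and password were
picked up.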




On 8/6/07, Clarence Donath <[EMAIL PROTECTED]> wrote:
> Is HTTP Basic authentication working at all?
>
> I've been working with v0.9 for two days now, and I have yet to get this
> working.
>
> I have one test directory with an .htaccess file requiring a
> username:password just for the fetcher.
>
> I can access this directory with a browser using that username:password.
>
> In nutch-site.xml I have replaced 'protocol-http' with
> 'protocol-httpclient' in the 'plugin.includes' property.
>
> and the following...
>
> <property>
>   <name>http.auth.basic.IT.user</name>
>   <value>spider</value>
>   <description>HTTP Basic Authentication</description>
> </property>
>
> <property>
>   <name>http.auth.basic.IT.pass</name>
>   <value>pissword</value>
>   <description>HTTP Basic Authentication</description>
> </property>
>
> 'IT' is the realm (AuthName "IT").
>
> I've tried defining these properties as 'http.auth.basic.IT.user',
> 'http.auth.basic..user', and 'http.auth.basic.user', as I've discovered
> in several others' examples in the Nutch Wiki.
>
> I see this in hadoop.log...
>
> 2007-08-06 16:12:45,856 INFO  httpclient.HttpMethodDirector - No
> credentials available for BASIC 'IT'@spock.abaqus.com:80
>
> I see the fetcher hitting the server, but it never tries the 'spider'
> user to authenticate...
>
> 172.17.25.27 - - [06/Aug/2007:16:12:45 -0400] "GET /development
> HTTP/1.0" 401 1287 "-" "ABAQUS/Nutch-0.9 (moin; http://spock;
> [EMAIL PROTECTED])"
>
>
> Please tell me whether I should expect the basic authentication
> mechanism to work at all.  I've already spent so much time trying to
> figure this out.
>
> Regards,
> Clarence Donath
>
>
> Spelling is a lossed art.
>
