Hi Ravi, Clarence,

If the patch is working, could you please add it to JIRA? Thanks.

- Renaud

Ravi Chintakunta wrote:
Hi Clarence,

The properties entered in nutch-site.xml do not seem to be used in HttpClient. Please apply the patch below to
nutch/src/plugin/protocol-httpclient/src/java/org/apache/nutch/protocol/httpclient/Http.java
and that should help.

- Ravi Chintakunta

@@ -31,6 +31,7 @@
 import org.apache.commons.httpclient.HttpClient;
 import org.apache.commons.httpclient.MultiThreadedHttpConnectionManager;
 import org.apache.commons.httpclient.NTCredentials;
+import org.apache.commons.httpclient.UsernamePasswordCredentials;
 import org.apache.commons.httpclient.auth.AuthScope;
 import org.apache.commons.httpclient.params.HttpConnectionManagerParams;
 import org.apache.commons.httpclient.protocol.Protocol;
@@ -65,6 +66,8 @@
   String ntlmPassword = "";
   String ntlmDomain = "";
   String ntlmHost = "";
+  String basicUsername = "";
+  String basicPassword = "";

   public Http() {
     super(LOG);
@@ -77,6 +80,8 @@
     this.ntlmPassword = conf.get("http.auth.ntlm.password", "");
     this.ntlmDomain = conf.get("http.auth.ntlm.domain", "");
     this.ntlmHost = conf.get("http.auth.ntlm.host", "");
+    basicUsername = conf.get("http.auth.basic.username");
+    basicPassword = conf.get("http.auth.basic.password");
     //Level logLevel = Level.WARNING;
     //if (conf.getBoolean("http.verbose", false)) {
     //  logLevel = Level.FINE;
@@ -131,6 +136,7 @@
     if (useProxy) {
       hostConf.setProxy(proxyHost, proxyPort);
     }
+    /*
     if (ntlmUsername.length() > 0) {
       Credentials ntCreds = new NTCredentials(ntlmUsername, ntlmPassword, ntlmHost, ntlmDomain);
       client.getState().setCredentials(new AuthScope(ntlmHost, AuthScope.ANY_PORT), ntCreds);
@@ -139,6 +145,11 @@
         LOG.info("Added NTLM credentials for " + ntlmUsername);
       }
     }
+    */
+
+    client.getParams().setAuthenticationPreemptive(true);
+    if (LOG.isInfoEnabled()) { LOG.info("**** setting basic auth credentials ****"); }
+    client.getState().setCredentials(new AuthScope("linuxlink.timesys.com", AuthScope.ANY_PORT, AuthScope.ANY_REALM), new UsernamePasswordCredentials(basicUsername, basicPassword));
     if (LOG.isInfoEnabled()) { LOG.info("Configured Client"); }
   }
 }

On 8/6/07, Clarence Donath <[EMAIL PROTECTED]> wrote:

Is HTTP Basic authentication working at all? I've been working with v0.9 for two days now, and I have yet to get this working.

I have one test directory with an .htaccess file requiring a username:password just for the fetcher. I can access this directory with a browser using that username:password.

In nutch-site.xml I have replaced 'protocol-http' with 'protocol-httpclient' in the 'plugin.includes' property, and added the following:

<property>
  <name>http.auth.basic.IT.user</name>
  <value>spider</value>
  <description>HTTP Basic Authentication</description>
</property>
<property>
  <name>http.auth.basic.IT.pass</name>
  <value>pissword</value>
  <description>HTTP Basic Authentication</description>
</property>

'IT' is the realm (AuthName "IT"). I've tried defining these properties as 'http.auth.basic.IT.user', 'http.auth.basic..user', and 'http.auth.basic.user', as I've found in several others' examples in the Nutch Wiki.

I see this in hadoop.log:

2007-08-06 16:12:45,856 INFO httpclient.HttpMethodDirector - No credentials available for BASIC 'IT'@spock.abaqus.com:80

I see the fetcher hitting the server, but it never tries the 'spider' user to authenticate:

172.17.25.27 - - [06/Aug/2007:16:12:45 -0400] "GET /development HTTP/1.0" 401 1287 "-" "ABAQUS/Nutch-0.9 (moin; http://spock; [EMAIL PROTECTED])"

Please tell me whether I should expect the basic authentication mechanism to work at all. I've already spent so much time trying to figure this out.

Regards,
Clarence Donath

Spelling is a lossed art.
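
When checking the server's access log, it may help to know what a request with Basic credentials actually carries: an Authorization header whose value is "Basic " followed by the base64 encoding of "user:password". The following is a minimal sketch using only the Java standard library. The class and method names (BasicAuthHeader, headerValue) are hypothetical and are not part of Nutch or commons-httpclient, and encoding the credentials as UTF-8 is an assumption.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper for illustration only; not part of Nutch or commons-httpclient.
final class BasicAuthHeader {

  // Builds the value of the Authorization header for HTTP Basic:
  // "Basic " followed by base64("user:password").
  // Assumption: the credentials are encoded as UTF-8.
  static String headerValue(String user, String password) {
    String pair = user + ":" + password;
    byte[] bytes = pair.getBytes(StandardCharsets.UTF_8);
    return "Basic " + Base64.getEncoder().encodeToString(bytes);
  }

  public static void main(String[] args) {
    // Example pair from RFC 7617.
    // Prints: Authorization: Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==
    System.out.println("Authorization: " + headerValue("Aladdin", "open sesame"));
  }
}
```

A 401 response that is never followed by a request carrying such a header, together with the "No credentials available" message in hadoop.log, suggests the client found no credentials registered for that host and realm.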
