On Fri, Mar 12, 2010 at 2:09 PM, Graziano Aliberti <graziano.alibe...@eng.it> wrote:
> On 11/03/2010 16:20, Susam Pal wrote:
>>
>> On Thu, Mar 11, 2010 at 8:24 PM, Graziano Aliberti
>> <graziano.alibe...@eng.it> wrote:
>>
>>> Hi everyone,
>>>
>>> I'm trying to use nutch ver. 1.0 on a system under squid proxy control.
>>> When I try to fetch my website list, into the log file I see that the
>>> authentication was failed...
>>>
>>> I've configured my nutch-site.xml file with all that properties needed
>>> for proxy auth, but my error is "httpclient.HttpMethodDirector - No
>>> credentials available for BASIC 'Squid proxy-caching web
>>> server'@proxy.my.host:my.port"
>>>
>>
>> Did you replace 'protocol-http' with 'protocol-httpclient' in the
>> value for 'plugins.include' property in 'conf/nutch-site.xml'?
>>
>> Regards,
>> Susam Pal
>>
>
> Hi Susam,
>
> yes of course!! :) Maybe I can post you the configuration file:
>
> <?xml version="1.0"?>
> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>
> <!-- Put site-specific property overrides in this file. -->
>
> <configuration>
>
> <property>
>   <name>http.agent.name</name>
>   <value>my.agent.name</value>
>   <description>
>   </description>
> </property>
>
> <property>
>   <name>plugin.includes</name>
>   <value>protocol-httpclient|urlfilter-regex|parse-(text|html|js)|index-(basic|anchor)|query-(basic|site|url)|response-(json|xml)|summary-basic|scoring-opic|urlnormalizer-(pass|regex|basic)</value>
>   <description>
>   </description>
> </property>
>
> <property>
>   <name>http.auth.file</name>
>   <value>my_file.xml</value>
>   <description>Authentication configuration file for
>   'protocol-httpclient' plugin.
>   </description>
> </property>
>
> <property>
>   <name>http.proxy.host</name>
>   <value>ip.my.proxy</value>
>   <description>The proxy hostname.
>   If empty, no proxy is used.</description>
> </property>
>
> <property>
>   <name>http.proxy.port</name>
>   <value>my.port</value>
>   <description>The proxy port.</description>
> </property>
>
> <property>
>   <name>http.proxy.username</name>
>   <value>my.user</value>
>   <description>
>   </description>
> </property>
>
> <property>
>   <name>http.proxy.password</name>
>   <value>my.pwd</value>
>   <description>
>   </description>
> </property>
>
> <property>
>   <name>http.proxy.realm</name>
>   <value>my_realm</value>
>   <description>
>   </description>
> </property>
>
> <property>
>   <name>http.agent.host</name>
>   <value>my.local.pc</value>
>   <description>The agent host.</description>
> </property>
>
> <property>
>   <name>http.useHttp11</name>
>   <value>true</value>
>   <description>
>   </description>
> </property>
>
> </configuration>
>
> Only another question: where i must put the user authentication parameters
> (user,pwd)? In nutch-site.xml file or in my_file.xml that I use for
> authentication?
>
> Thank you for your attention,
>
> --
> -----------
>
> Graziano Aliberti
>
> Engineering Ingegneria Informatica S.p.A
>
> Via S. Martino della Battaglia, 56 - 00185 ROMA
>
> *Tel.:* 06.49.201.387
>
> *E-Mail:* graziano.alibe...@eng.it
>

The configuration looks okay to me. Yes, the proxy authentication details are set in 'conf/nutch-site.xml'. The file named in the 'http.auth.file' property is used only for configuring the credentials for authenticating to a web server.

Unfortunately, there aren't any log statements in the part of the code that reads the proxy authentication details, so I can't suggest turning on debug logs to get some clues about the issue. However, in case you want to troubleshoot it yourself by building Nutch from source, I can tell you the code that deals with this. The file is:

src/java/org/apache/nutch/protocol/httpclient/Http.java :
http://svn.apache.org/viewvc/lucene/nutch/trunk/src/plugin/protocol-httpclient/src/java/org/apache/nutch/protocol/httpclient/Http.java?view=markup

The line number is: 200.

If I get time this weekend, I will try to insert some log statements into this code and send you a modified JAR file, which might help you troubleshoot what is going on. But I can't promise this, since it depends on my weekend plans.

Two questions before I end this mail. Did you set the value of the 'http.proxy.realm' property to: Squid proxy-caching web server ?

Also, do you see any 'auth.AuthChallengeProcessor' lines in the log file? I'm not sure whether this line should appear for proxy authentication, but it does appear for web server authentication.

Regards,
Susam Pal
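
For reference, here is a minimal sketch of the realm property the first question refers to. It assumes the realm is exactly the string the proxy reports in the "No credentials available for BASIC" error, and that the other http.proxy.* properties stay as in the quoted configuration. It is only an illustration of the question, not a confirmed fix.

```xml
<!-- Sketch for conf/nutch-site.xml. The value below is an assumption:
     it is copied from the realm quoted in the error message and replaces
     the placeholder 'my_realm' shown in the posted configuration. -->
<property>
  <name>http.proxy.realm</name>
  <value>Squid proxy-caching web server</value>
  <description>Authentication realm announced by the proxy.</description>
</property>
```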