We are still getting HTTP 407 (Proxy Authentication Required) errors when 
authenticating with the proxy.
I asked the sysadmin to monitor the proxy logs, and they show this entry when 
Nutch tries to crawl:

Aug 26 18:36:30 blrwcg01 content_gateway[10059]: NOTE: [4112998] winauth 
EVENT_NTLM_LOGON_DENIED ip:10.212.51.13, reason:(NTLM) NA 
NT_STATUS_WRONG_PASSWORD, log:Got user=[502047] domain=[] workstation=[] 
len1=24 len2=0#012Login for user []\[502047]@[] failed due to [Wrong Password]

It looks as though the password is not reaching the proxy server correctly. I 
have set all required proxy parameters correctly in nutch-site.xml.

Any clues?

Suresh.

-----Original Message-----
From: Lewis John Mcgibbney [mailto:[email protected]] 
Sent: Wednesday, June 05, 2013 11:28 AM
To: [email protected]
Subject: Nutch not crawling fully

Hi,

It is clear that, for the configuration you are running, NTLM is not 
authenticating properly.

I would run the Http class with TRACE logging activated; this will show the 
credentials you are after.
You should also note the documentation in nutch-default.xml, which explicitly 
states: "NOTE: For NTLM authentication, do not prefix the username with the 
domain, i.e. 'susam' is correct whereas 'DOMAIN\susam' is incorrect." Looking 
at your log, this does not seem to be the case.
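
For reference, here is a minimal sketch of how the proxy-related properties 
might look in nutch-site.xml. The property names are the ones documented in 
nutch-default.xml for protocol-httpclient; every value shown (host, port, 
user, password, domain) is a placeholder assumption and not taken from this 
thread.

```xml
<!-- Sketch only: all values below are placeholders, not real settings. -->
<property>
  <name>http.proxy.host</name>
  <value>proxy.example.com</value>
</property>
<property>
  <name>http.proxy.port</name>
  <value>8080</value>
</property>
<property>
  <name>http.proxy.username</name>
  <!-- Bare username only; no DOMAIN\ prefix when using NTLM. -->
  <value>someuser</value>
</property>
<property>
  <name>http.proxy.password</name>
  <value>changeme</value>
</property>
<property>
  <name>http.proxy.realm</name>
  <!-- For NTLM, the domain name goes here instead of in the username. -->
  <value>EXAMPLEDOMAIN</value>
</property>
```

These properties are only read by protocol-httpclient, so that plugin needs 
to be listed in plugin.includes for them to take effect.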

http://svn.apache.org/repos/asf/nutch/trunk/src/plugin/protocol-httpclient/src/java/org/apache/nutch/protocol/httpclient/Http.java

--
*Lewis*
