The error message is self-explanatory. The connection attempt timed out because the host did not accept the connection within 10 seconds (10000 ms). You may consider increasing the timeout. Check the 'http.timeout' property in 'conf/nutch-default.xml' and override it in 'conf/nutch-site.xml'.
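As a rough sketch, the override in 'conf/nutch-site.xml' might look like the following. The value 30000 (30 seconds, since the property is in milliseconds) is only an illustrative assumption, not a recommendation, and the description text is my own wording. If your 'conf/nutch-site.xml' already has a <configuration> element, add only the <property> block inside it.

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Overrides the 10000 ms default from conf/nutch-default.xml. -->
  <!-- 30000 is an assumed example value; choose one that suits your network. -->
  <property>
    <name>http.timeout</name>
    <value>30000</value>
    <description>Network timeout in milliseconds (example override).</description>
  </property>
</configuration>
```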
I could not understand what you mean by "not also getting http.proxy.realm". In which file are you not finding it? It would help if you could also mention which version of Nutch you are using and what additional patches you have applied.

NTLM is an authentication scheme, like Basic authentication and Digest authentication. If you have applied the patch in JIRA NUTCH-559 <https://issues.apache.org/jira/browse/NUTCH-559>, you should find an 'http.proxy.realm' property in 'conf/nutch-default.xml' around line number 190. You have to override this in 'conf/nutch-site.xml'. For Basic or Digest authentication, specify the realm as its value. For NTLM authentication, specify the domain as its value (because there is no concept of realms in NTLM).

Regards,
Susam Pal

On Jan 1, 2008 5:27 PM, NIDHI MALIK <[EMAIL PROTECTED]> wrote:
>
> Hello,
>
> Thanks Susam for the help. After making all the changes suggested
> by you at the time of crawling following error is displayed:
>
>
> fetch of http://www.w3schools.com/ failed with:
> org.apache.commons.httpclient.ConnectTimeoutException: The host did not
> accept the connection within timeout of 10000 ms.
>
> I am not also getting http.proxy.realm, and what is NTLM?
>
> <property>
> <name>http.proxy.realm</name>
> <value></value>
> <description>Authentication realm for proxy. Do not define a value if
> realm is not required or authentication should take place for any realm.
> NTLM does not use the notion of realms. Specify the domain name of NTLM
> authentication as the value for this property. To use this,
> 'protocol-httpclient' must be present in the value of
> 'plugin.includes' property.
> </description>
> </property>
>
> what value we have to set here?
>
> Thanks
>
>
