Connect with jconsole to the Java VM running Nutch and look at the stack traces of the threads. That should show where the hung fetcher threads are blocked.
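If attaching jconsole is not convenient, the same kind of information can be collected from inside a JVM. The following is only a minimal sketch: the class name ThreadDump and its dump() method are made up for illustration and are not part of Nutch. It uses the standard Thread.getAllStackTraces() call and only sees the threads of the JVM it runs in, so it would have to be called from within the fetcher process (for example from a debugging hook you add yourself).

```java
import java.util.Map;

// Hypothetical helper, not part of Nutch: collects the name, state and
// stack trace of every live thread in the current JVM.
class ThreadDump {

    // Builds a plain-text dump, one block per thread.
    static String dump() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            sb.append('"').append(t.getName()).append("\" state=")
              .append(t.getState()).append('\n');
            for (StackTraceElement frame : e.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```

In the output, look for the fetcher threads and check which frames they are sitting in while the queue size stays constant. Running jstack from the JDK against the process id of the Nutch JVM prints a comparable dump without any code changes.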

Claudio Martella wrote:
> Hello,
>
> I'm using nutch 1.1 (with crawl command) to crawl an intranet document
> archive via webdav. At the end of each fetch phase the fetcher hangs
> like this:
>
> -activeThreads=5, spinWaiting=0, fetchQueues.totalSize=250
>
> From my analysis of network traffic, nothing is passing by. The logs show:
>
> 2010-06-30 13:38:35,335 INFO fetcher.Fetcher - fetching
> https://192.168.10.10/data/public/50.90_In_Bearbeitung/Stefano%20P/normen2010/normen2010.indd
> 2010-06-30 13:38:35,381 INFO auth.AuthChallengeProcessor - basic
> authentication scheme selected
> 2010-06-30 13:38:35,819 INFO fetcher.Fetcher - -activeThreads=5,
> spinWaiting=0, fetchQueues.totalSize=249
> 2010-06-30 13:38:36,824 INFO fetcher.Fetcher - -activeThreads=5,
> spinWaiting=0, fetchQueues.totalSize=250
>
> which I guess means it finishes downloading the specified file and then
> hangs until:
>
> 2010-06-30 13:43:35,963 INFO fetcher.Fetcher - -activeThreads=5,
> spinWaiting=0, fetchQueues.totalSize=250
> 2010-06-30 13:43:35,963 WARN fetcher.Fetcher - Aborting with 5 hung
> threads.
>
> So basically 5 minutes pass without it doing anything.
>
> This is my configuration in nutch-site.xml related to the fetcher:
>
> <property>
>   <name>fetcher.server.delay</name>
>   <value>0.0</value>
> </property>
>
> <property>
>   <name>fetcher.server.min.delay</name>
>   <value>0.0</value>
> </property>
>
> <property>
>   <name>fetcher.threads.fetch</name>
>   <value>5</value>
> </property>
>
> <property>
>   <name>fetcher.threads.per.host</name>
>   <value>5</value>
> </property>
>
> <property>
>   <name>fetcher.threads.per.host.by.ip</name>
>   <value>false</value>
> </property>
>
> Any idea why this is happening?
>
> Thanks
>
> Claudio

