This happens to me quite a bit as well. Reducing the fetcher thread count seems to help, but the hang still shows up at random (I'm using trunk).
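In case it helps, lowering the thread count would look roughly like this in nutch-site.xml. This is only a sketch: it assumes the same fetcher.threads.fetch property that appears in the configuration quoted below, and the value 2 is purely illustrative, not a tested recommendation.

```xml
<!-- Illustrative value only: fewer fetcher threads than the 5 used in the quoted configuration -->
<property>
  <name>fetcher.threads.fetch</name>
  <value>2</value>
</property>
```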
On Wed, Jun 30, 2010 at 3:17 PM, reinhard schwab <[email protected]> wrote:

> connect with jconsole to the java vm of nutch and look at the stack
> traces of the threads.
> you will get more info there.
>
> Claudio Martella wrote:
> > Hello,
> >
> > I'm using nutch 1.1 (with crawl command) to crawl an intranet document
> > archive via webdav. At the end of each fetch phase the fetcher hangs
> > like this:
> >
> > -activeThreads=5, spinWaiting=0, fetchQueues.totalSize=250
> >
> > from my analysis of network traffic, nothing is passing by. The logs show:
> >
> > 2010-06-30 13:38:35,335 INFO fetcher.Fetcher - fetching https://192.168.10.10/data/public/50.90_In_Bearbeitung/Stefano%20P/normen2010/normen2010.indd
> > 2010-06-30 13:38:35,381 INFO auth.AuthChallengeProcessor - basic authentication scheme selected
> > 2010-06-30 13:38:35,819 INFO fetcher.Fetcher - -activeThreads=5, spinWaiting=0, fetchQueues.totalSize=249
> > 2010-06-30 13:38:36,824 INFO fetcher.Fetcher - -activeThreads=5, spinWaiting=0, fetchQueues.totalSize=250
> >
> > which I guess means it finishes downloading the specified file and then
> > hangs until:
> >
> > 2010-06-30 13:43:35,963 INFO fetcher.Fetcher - -activeThreads=5, spinWaiting=0, fetchQueues.totalSize=250
> > 2010-06-30 13:43:35,963 WARN fetcher.Fetcher - Aborting with 5 hung threads.
> >
> > so basically 5 minutes without doing anything.
> >
> > this is my configuration in nutch-site.xml related to fetcher:
> >
> > <property>
> >   <name>fetcher.server.delay</name>
> >   <value>0.0</value>
> > </property>
> >
> > <property>
> >   <name>fetcher.server.min.delay</name>
> >   <value>0.0</value>
> > </property>
> >
> > <property>
> >   <name>fetcher.threads.fetch</name>
> >   <value>5</value>
> > </property>
> >
> > <property>
> >   <name>fetcher.threads.per.host</name>
> >   <value>5</value>
> > </property>
> >
> > <property>
> >   <name>fetcher.threads.per.host.by.ip</name>
> >   <value>false</value>
> > </property>
> >
> > Any idea why this is happening?
> >
> > Thanks
> >
> > Claudio

