Connect to the Java VM running Nutch with jconsole and look at the
stack traces of the fetcher threads.
They should show where each of the hung threads is blocked.
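
If attaching jconsole is not convenient, the same kind of information can
be produced in code. The following is only a minimal sketch: the class
name ThreadDump is made up for illustration, it is not part of Nutch, and
it dumps the threads of the JVM it runs in, so it would have to be called
from inside the process you want to inspect. It uses the standard
java.lang.management API.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper, not part of Nutch: prints a thread dump of the
// JVM it is running in, similar to the "Threads" tab in jconsole.
class ThreadDump {

    // Builds a textual dump: thread name, state, then one line per stack frame.
    static String dumpAllThreads() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        StringBuilder out = new StringBuilder();
        // false, false: skip lock/monitor details, keep only the stack traces.
        for (ThreadInfo info : threads.dumpAllThreads(false, false)) {
            out.append('"').append(info.getThreadName()).append("\" ")
               .append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dumpAllThreads());
    }
}
```

For a process that is already running, the JDK's jstack tool (jstack <pid>)
prints a comparable dump from the outside. In either case, look for the
fetcher threads and check which call sits at the top of each stack.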

Claudio Martella wrote:
> Hello,
>
> I'm using Nutch 1.1 (with the crawl command) to crawl an intranet document
> archive via WebDAV. At the end of each fetch phase the fetcher hangs
> like this:
>
> -activeThreads=5, spinWaiting=0, fetchQueues.totalSize=250
>
> From my analysis of the network traffic, nothing is being transferred. The logs show:
>
> 2010-06-30 13:38:35,335 INFO  fetcher.Fetcher - fetching
> https://192.168.10.10/data/public/50.90_In_Bearbeitung/Stefano%20P/normen2010/normen2010.indd
> 2010-06-30 13:38:35,381 INFO  auth.AuthChallengeProcessor - basic
> authentication scheme selected
> 2010-06-30 13:38:35,819 INFO  fetcher.Fetcher - -activeThreads=5,
> spinWaiting=0, fetchQueues.totalSize=249
> 2010-06-30 13:38:36,824 INFO  fetcher.Fetcher - -activeThreads=5,
> spinWaiting=0, fetchQueues.totalSize=250
>
> which I guess means it finishes downloading the specified file and then
> hangs until:
>
> 2010-06-30 13:43:35,963 INFO  fetcher.Fetcher - -activeThreads=5,
> spinWaiting=0, fetchQueues.totalSize=250
> 2010-06-30 13:43:35,963 WARN  fetcher.Fetcher - Aborting with 5 hung
> threads.
>
> So basically it spends 5 minutes without doing anything.
>
> This is the fetcher-related configuration in my nutch-site.xml:
>
> <property>
>   <name>fetcher.server.delay</name>
>   <value>0.0</value>
> </property>
>
> <property>
>   <name>fetcher.server.min.delay</name>
>   <value>0.0</value>
> </property>
>
> <property>
>   <name>fetcher.threads.fetch</name>
>   <value>5</value>
> </property>
>
> <property>
>   <name>fetcher.threads.per.host</name>
>   <value>5</value>
> </property>
>
> <property>
>   <name>fetcher.threads.per.host.by.ip</name>
>   <value>false</value>
> </property>
>
> Any idea why this is happening?
>
>
> Thanks
>
>
> Claudio
>
>   
