Probably no one answered because this has been answered a couple of 
times before.  If you search this mailing list for fetcher slowness or 
fetcher hung threads you will find answers.  You can also take a look 
at NUTCH-344.  This problem has come up before and there are patches 
which fix it.  It has to do with the crawl delay being set to a large 
value by the pages being fetched.  The configuration below, placed in 
your nutch-site.xml file, should fix this, depending on the version of 
Nutch you are using.

<property>
 <name>fetcher.max.crawl.delay</name>
 <value>30</value>
 <description>
 If the Crawl-Delay in robots.txt is set to greater than this value (in
 seconds) then the fetcher will skip this page, generating an error report.
 If set to -1 the fetcher will never skip such pages and will wait the
 amount of time retrieved from robots.txt Crawl-Delay, however long that
 might be.
 </description>
</property>
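
To illustrate what the description above says, here is a minimal sketch of
the decision in Python.  It is not Nutch's actual code: the function name,
the use of None to mean "skip this page", and treating every negative
value like -1 are assumptions made for illustration only.

```python
from typing import Optional


def effective_delay(robots_crawl_delay: float,
                    max_crawl_delay: float) -> Optional[float]:
    """Return the number of seconds to wait before fetching,
    or None if the page should be skipped.

    Hypothetical helper that mirrors the property description;
    it is not taken from the Nutch source.
    """
    # A negative limit (the description only mentions -1) means
    # "never skip": wait however long robots.txt asks for.
    if max_crawl_delay < 0:
        return robots_crawl_delay
    # Crawl-Delay greater than the limit: skip the page.
    if robots_crawl_delay > max_crawl_delay:
        return None
    # Otherwise honor the Crawl-Delay from robots.txt.
    return robots_crawl_delay
```

With the value of 30 shown above, a site asking for a Crawl-Delay of 120
seconds would be skipped instead of tying up a fetcher thread, while a
site asking for 10 seconds would still be fetched after a 10 second wait.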

Dennis

Aïcha wrote:
> Hi,
>
> I don't know why, but I have had no answer on the 3 forums where I posted my 
> problem.
> As the Fetcher freeze occurs every time I try to fetch my file 
> system, I can't imagine that I am the only one who has this problem. As I 
> said in my last e-mail, I found many mails about this problem, but no solution 
> seems to have been found.
> It is a big problem, so I don't understand why nobody seems interested in 
> it.
>
> Can anyone tell me whether they have encountered the problem and what to do? 
> Thanks in advance.
> Aïcha
>
>
> ----- Original Message ----
> From: Aïcha <[EMAIL PROTECTED]>
> To: [email protected]
> Sent: Monday, 30 October 2006, 18:16:26
> Subject: Urgent: Fetcher aborts with hung threads
>
>
> Hi,
>
> I am trying to crawl my file system, but the crawl never finishes; it 
> aborts
> with the message "Aborting with 3 hung threads". 
>
> The number of hung threads is not the same when I retry.
>
> I see that the problem has been posted many times, most recently by Bruno Thiel 
> on 2006/10/11,
> but I don't think it is linked to xls files, as the problem occurs after 
> different file formats.
>
> I modified the configuration to increase the number of threads, but it didn't solve 
> the problem.
>
> Could somebody please help me?
> I can't crawl my file system.
>
> Best Regards,
> Aïcha

_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
