Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change 
notification.

The following page has been changed by MatthiasGuenter:
http://wiki.apache.org/nutch/FAQ

------------------------------------------------------------------------------
  ==== While fetching I get UnknownHostException for known hosts ====
  
  Make sure your DNS server is working and that it can handle the request load.
+ 
+ 
+ ==== It seems as if not all links are followed in the pages in my URL lists ====
+ 
+ 1.) Make sure that your expressions in conf/crawl-urlfilter.txt are correct; 
the links may be dropped there.
+ 2.) Make sure that the following parameters in conf/nutch-site.xml are set 
appropriately:
+ * http.content.limit: if it is too low, some content may never be fetched at all
+ * db.max.outlinks.per.page: if it is too low, some links may be dropped.
+ 3.) Make sure that parse-js and all other necessary plugins are active in 
conf/nutch-site.xml.
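
As a rough illustration of points 2 and 3, the properties might look like this 
inside the root element of conf/nutch-site.xml. The values and the plugin list 
are assumptions chosen for illustration, not settings taken from the FAQ; check 
nutch-default.xml in your release for the actual defaults and for the full 
plugin.includes pattern.

```xml
<!-- Illustrative values only; adjust to your crawl. -->
<property>
  <name>http.content.limit</name>
  <!-- Assumed example: allow up to 256 KB per document before truncation. -->
  <value>262144</value>
</property>

<property>
  <name>db.max.outlinks.per.page</name>
  <!-- Assumed example: keep up to 500 outlinks per page. -->
  <value>500</value>
</property>

<property>
  <name>plugin.includes</name>
  <!-- Assumed example pattern; the part that matters here is that parse-js
       is included alongside the text and HTML parsers. -->
  <value>protocol-http|urlfilter-regex|parse-(text|html|js)|index-basic|query-(basic|site|url)</value>
</property>
```

Raising the two limits trades longer fetches and a larger link database for 
more complete link coverage, so it may be worth increasing them gradually.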
  
  === Updating ===
  
