Dear Wiki user, You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change notification.
The following page has been changed by MatthiasGuenter:
http://wiki.apache.org/nutch/FAQ

------------------------------------------------------------------------------
  ==== It seems as if not all links are followed in the pages in my URL lists ====
  1.) Make sure that your expressions in conf/crawl-urlfilter.txt are correct; the links may be dropped there.
+ 2.) Make sure that the following parameters in conf/nutch-site.xml are set appropriately:
+  * http.content.limit: otherwise some content may never be fetched at all
   * db.max.outlinks.per.page: otherwise the links might be dropped.
+ 3.) Make sure that parse-js and all other necessary plugins are active in conf/nutch-site.xml.
  === Updating ===

_______________________________________________
Nutch-cvs mailing list
Nutch-cvs@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nutch-cvs
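
As an illustration of items 2.) and 3.) of the FAQ entry above, a conf/nutch-site.xml might look roughly like the sketch below. It is not taken from the wiki page: the root element name, the concrete values, and the plugin list are assumptions chosen for illustration, and they may differ between Nutch versions. Check them against the nutch-default.xml shipped with your release before using them.

```xml
<?xml version="1.0"?>
<!-- Sketch of conf/nutch-site.xml. All values are illustrative assumptions,
     not recommendations from the FAQ. Some older releases use a different
     root element; mirror whatever your nutch-default.xml uses. -->
<configuration>

  <property>
    <name>http.content.limit</name>
    <!-- Assumption: a negative value disables truncation of fetched content,
         so links near the end of large pages are not cut off. -->
    <value>-1</value>
  </property>

  <property>
    <name>db.max.outlinks.per.page</name>
    <!-- Assumption: a negative value means all outlinks of a page are
         processed instead of only the first N. -->
    <value>-1</value>
  </property>

  <property>
    <name>plugin.includes</name>
    <!-- Hypothetical plugin list: start from the value in your own
         nutch-default.xml and add "js" to the parse-(...) group so that
         parse-js is active. -->
    <value>protocol-http|urlfilter-regex|parse-(text|html|js)|index-basic|query-(basic|site|url)</value>
  </property>

</configuration>
```

Removing both limits entirely can make fetches and the link database noticeably larger, so a large positive value may be a safer choice than -1 on a broad crawl.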