The following page has been changed by Gal Nitzan:
http://wiki.apache.org/nutch/FAQ

------------------------------------------------------------------------------
  There are user, developer, commits and agents lists, all available at http://lucene.apache.org/nutch/mailing_lists.html#Agents .

+ ==== Is there a mail archive? ====
+
+ Yes: http://www.mail-archive.com/nutch-user%40lucene.apache.org/maillist.html .
+
  ==== My system does not find the segments folder. Why? OR How do I tell the ''Nutch Servlet'' where the index files are located? ====

  There are at least two choices to do that:

- 1) First you need to copy the .WAR file to the servlet container webapps folder.
+   First you need to copy the .WAR file to the servlet container webapps folder.
     % cp nutch-0.7.war $CATALINA_HOME/webapps/ROOT.war

   * After building your first index, start Tomcat from the index folder.
@@ -32, +36 @@

    Edit the nutch-default.xml which is located at: $CATALINA_HOME/bin/webapps/ROOT/WEB-INF/classes/
    look for the entry: searcher.dir and replace it with your index location /index/db
+
+ ==== I have two XML files, nutch-default.xml and nutch-site.xml, why? ====
+
+ nutch-default.xml is the out-of-the-box configuration for Nutch. Most of the configuration can (and should, unless you know what you're doing) stay as it is.
+ nutch-site.xml is where you make the changes that override the default settings.
+ The same goes for the servlet container application.

  === Injecting ===

@@ -70, +80 @@

   * Set the NUTCH_CONF_DIR environment variable to point to the directory you created
   * Run $NUTCH_HOME/bin/nutch so that it gets the NUTCH_CONF_DIR environment variable. Check the command output for the lines where the configs are loaded, to confirm that they are really loaded from your custom dir.
   * Happy using.
+
+ ==== While fetching I get UnknownHostException for known hosts ====
+
+ Make sure your DNS server is working and that it can handle the load of requests.

  === Updating ===
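
As an illustration of the nutch-default.xml / nutch-site.xml entry in the change above, a minimal nutch-site.xml that overrides searcher.dir might look like the sketch below. It assumes the Nutch 0.7-era <nutch-conf> root element (later releases are believed to use <configuration> instead), and /index/db is only the placeholder index location quoted in the FAQ text; check both against your own installation.

```xml
<?xml version="1.0"?>
<!-- nutch-site.xml: local overrides only; nutch-default.xml stays untouched. -->
<!-- Assumption: root element name as used by Nutch 0.7-era config files. -->
<nutch-conf>
  <property>
    <!-- Property name taken from the FAQ entry on locating the index. -->
    <name>searcher.dir</name>
    <!-- Placeholder path from the FAQ text; replace with your index location. -->
    <value>/index/db</value>
  </property>
</nutch-conf>
```

Putting the override in nutch-site.xml rather than editing nutch-default.xml keeps the shipped defaults intact, which is the split the new FAQ entry describes.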
