Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change 
notification.

The following page has been changed by Gal Nitzan:
http://wiki.apache.org/nutch/FAQ

------------------------------------------------------------------------------
  
  There are user, developer, commits and agents lists, all available at http://lucene.apache.org/nutch/mailing_lists.html#Agents .
  
+ ==== Is there a mail archive? ====
+ 
+ Yes: http://www.mail-archive.com/nutch-user%40lucene.apache.org/maillist.html .
+ 
  ==== My system does not find the segments folder. Why? OR How do I tell the ''Nutch Servlet'' where the index files are located? ====
  
  There are at least two ways to do that:
  
-   1) First you need to copy the .WAR file to the servlet container webapps folder.
+   First you need to copy the .WAR file to the servlet container webapps folder.
       % cp nutch-0.7.war $CATALINA_HOME/webapps/ROOT.war
  
    * After building your first index, start Tomcat from the index folder.
@@ -32, +36 @@

      Edit the nutch-default.xml which is located at:
         $CATALINA_HOME/bin/webapps/ROOT/WEB-INF/classes/
         look for the entry: searcher.dir and replace its value with your index location (for example /index/db)
+ 
+ ==== I have two XML files, nutch-default.xml and nutch-site.xml, why? ====
+ 
+ nutch-default.xml is the out-of-the-box configuration for Nutch. Most of the configuration can (and should, unless you know what you're doing) stay as it is.
+ nutch-site.xml is where you make the changes that override the default settings.
+ The same goes for the servlet container application.
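
As a rough illustration of the override mechanism, the sketch below writes a minimal nutch-site.xml that changes only searcher.dir and leaves everything else to nutch-default.xml. The directory name nutch-conf-example is hypothetical, /index/db is the example location used earlier in this FAQ, and the <nutch-conf> root element is an assumption; copy the root element from your own nutch-default.xml if it differs.

```shell
# Hypothetical directory for the example; point this at your real conf
# directory only once you are happy with the result.
conf_dir="./nutch-conf-example"
index_dir="/index/db"   # example index location from this FAQ

mkdir -p "$conf_dir"

# Write a nutch-site.xml that overrides a single property.
# Note: this overwrites any nutch-site.xml already in $conf_dir.
# The <nutch-conf> root element is an assumption; it must match the
# root element of your nutch-default.xml.
cat > "$conf_dir/nutch-site.xml" <<EOF
<?xml version="1.0"?>
<nutch-conf>
  <property>
    <name>searcher.dir</name>
    <value>$index_dir</value>
  </property>
</nutch-conf>
EOF
```

Only the properties listed in nutch-site.xml are overridden; anything not listed keeps its default value.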
  
  === Injecting ===
  
@@ -70, +80 @@

    * Set the NUTCH_CONF_DIR environment variable to point to the directory you created
    * Run $NUTCH_HOME/bin/nutch so that it picks up the NUTCH_CONF_DIR environment variable. Check the command output for the lines where the configs are loaded, to confirm that they are really loaded from your custom dir.
    * Happy using.
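
The last two steps might look like the following in a POSIX shell. This is only a sketch: the directory $HOME/nutch-conf is a hypothetical default, used when NUTCH_CONF_DIR is not already set.

```shell
# Hypothetical location of the custom configuration directory.
NUTCH_CONF_DIR="${NUTCH_CONF_DIR:-$HOME/nutch-conf}"
export NUTCH_CONF_DIR

# List the XML files so that a wrong path is noticed before running Nutch.
ls "$NUTCH_CONF_DIR"/*.xml 2>/dev/null \
  || echo "no XML config files found in $NUTCH_CONF_DIR" >&2

# Then start Nutch from this same shell so that it inherits the variable:
#   "$NUTCH_HOME"/bin/nutch
```

Exporting the variable matters: without export, the nutch script started from this shell would not see it.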
+ 
+ ==== While fetching I get UnknownHostException for known hosts ====
+ 
+ Make sure your DNS server is working and that it can handle the load of requests.
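
One way to check the first point is to resolve a sample of the failing hostnames outside of Nutch. The sketch below assumes a system that provides the getent command (common on Linux); check_dns and hosts.txt are hypothetical names. It only tests whether names resolve, not whether the DNS server copes with the fetcher's request rate.

```shell
# Read hostnames from stdin, one per line, and print those that do not
# resolve. Returns non-zero if at least one lookup failed.
# Assumes the `getent` command is available.
check_dns() {
  failed=0
  while IFS= read -r host; do
    # Skip empty lines.
    [ -n "$host" ] || continue
    if ! getent hosts "$host" >/dev/null 2>&1; then
      echo "unresolved: $host"
      failed=1
    fi
  done
  return "$failed"
}

# Example (hosts.txt is a hypothetical file of hostnames):
#   check_dns < hosts.txt
```

If names resolve here but still fail during a fetch, the DNS server being overloaded by the fetcher is the more likely cause.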
  
  === Updating ===
  
