Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change 
notification.

The following page has been changed by JakeVanderdray:
http://wiki.apache.org/nutch/NutchTutorial

------------------------------------------------------------------------------
  Now we're ready to crawl. There are two approaches to crawling:
  
   1. Using the '''crawl''' command to perform all the crawl steps with a 
single command.  This is sometimes referred to as '''Intranet Crawling'''.  
Although a simple way to get started, it has limitations.
-  2. Using the lower level inject, generate, fetch and updatedb commands.  
Sometimes refferred to as '''Whole-Web Crawling''' this allows you more control 
of each step of the process and is required to be able to update existing data.
+  2. Using the lower level inject, generate, fetch and updatedb commands.  
Sometimes referred to as '''Whole-Web Crawling''' this allows you more control 
of each step of the process and is required to be able to update existing data.
  
  == The Crawl Command ==
  
- The crawl comamnd is more appropriate when you intend to crawl up to around 
one million pages on a handful of web servers.
+ The crawl command is more appropriate when you intend to crawl up to around 
one million pages on a handful of web servers.
  
  === Crawl Command: Configuration ===
  
