[Nutch Wiki] Trivial Update of "RunningNutchAndSolr" by LewisJohnMcgibbney

Apache Wiki Fri, 02 Sep 2011 12:22:13 -0700

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change 
notification.


The "RunningNutchAndSolr" page has been changed by LewisJohnMcgibbney:
http://wiki.apache.org/nutch/RunningNutchAndSolr?action=diff&rev1=69&rev2=70

  }}} 
  
  This will include any URL in the domain nutch.apache.org.
+ 
+ Now we are ready to initiate a crawl. Use the following parameters:
+ 
+  * '''-dir''' ''dir'' names the directory to put the crawl in.
+  * '''-threads''' ''threads'' determines the number of threads that will fetch in parallel.
+  * '''-depth''' ''depth'' indicates the link depth from the root page that should be crawled.
+  * '''-topN''' ''N'' determines the maximum number of pages that will be retrieved at each level up to the depth.
   * Run the following command:
  {{{
  bin/nutch crawl urls -dir crawl -depth 3 -topN 5
