Dear Wiki user, You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change notification.
The "NutchTutorial" page has been changed by riverma:
https://wiki.apache.org/nutch/NutchTutorial?action=diff&rev1=69&rev2=70

  Nutch developers have written one for you :), and it is available at [[bin/crawl]].
  {{{
- Usage: bin/crawl <seedDir> <crawlID> <solrURL> <numberOfRounds>
+ Usage: bin/crawl <seedDir> <crawlDir> <solrURL> <numberOfRounds>
- Example: bin/crawl urls/seed.txt TestCrawl http://localhost:8983/solr/ 2
+ Example: bin/crawl urls/ TestCrawl/ http://localhost:8983/solr/ 2
- Or you can use:
- Example: bin/nutch crawl urls -solr http://localhost:8983/solr/ -depth 3 -topN 5
  }}}
  The crawl script has lot of parameters set, and you can modify the parameters to your needs. It would be ideal to understand the parameters before setting up big crawls.
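As a rough illustration of the revised argument order (seed directory, crawl directory, Solr URL, number of rounds), the sketch below assembles the command line and prints it instead of running it. The helper name build_crawl_cmd is hypothetical and not part of Nutch; the argument values are the ones from the example in the diff above.

```shell
#!/bin/sh
# Sketch only: build_crawl_cmd is a hypothetical helper, not part of Nutch.
# It prints a bin/crawl command line in the order given by the updated usage
# text: <seedDir> <crawlDir> <solrURL> <numberOfRounds>.
build_crawl_cmd() {
  if [ "$#" -ne 4 ]; then
    echo "Usage: build_crawl_cmd <seedDir> <crawlDir> <solrURL> <numberOfRounds>" >&2
    return 1
  fi
  # Reject an empty or non-numeric round count before building the command.
  case "$4" in
    ''|*[!0-9]*)
      echo "numberOfRounds must be a non-negative integer" >&2
      return 1
      ;;
  esac
  printf 'bin/crawl %s %s %s %s\n' "$1" "$2" "$3" "$4"
}

# Values taken from the example in the diff.
build_crawl_cmd urls/ TestCrawl/ http://localhost:8983/solr/ 2
```

Printing the command first makes it easy to check the argument order before starting a long crawl; the printed line can then be run from the Nutch runtime directory.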

