Dear Wiki user, You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change notification.
The "RunningNutchAndSolr" page has been changed by LewisJohnMcgibbney:
http://wiki.apache.org/nutch/RunningNutchAndSolr?action=diff&rev1=69&rev2=70

  }}}
  This will include any URL in the domain nutch.apache.org.
+ 
+ Now we are ready to initiate a crawl. Use the following parameters:
+ 
+  * '''-dir''' ''dir'' names the directory to put the crawl in.
+  * '''-threads''' ''threads'' determines the number of threads that will fetch in parallel.
+  * '''-depth''' ''depth'' indicates the link depth from the root page that should be crawled.
+  * '''-topN''' ''N'' determines the maximum number of pages that will be retrieved at each level up to the depth.
   * Run the following command:
  {{{
  bin/nutch crawl urls -dir crawl -depth 3 -topN 5

