Dear Wiki user, You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change notification.
The following page has been changed by JeffRitchie:
http://wiki.apache.org/nutch/nutch-0%2e8-dev/bin/nutch_crawl

New page:

= nutch-0.8-dev/bin/nutch crawl =

== "crawl" is an alias for "org.apache.nutch.crawl.Crawl" ==

=== Perform complete crawling and indexing given a set of root URLs. ===

'''Usage:''' nutch-0.8-dev/bin/nutch org.apache.nutch.crawl.Crawl <urlDir> [-dir d] [-threads n] [-depth i] [-topN]

'''<urlDir>:''' A directory containing text files with URL lists. It must already exist.

'''[-dir d]:''' The directory in which Nutch saves the index. If you don't specify one, Nutch creates its own directory inside the directory from which you started the crawl. Example of a -dir parameter: -dir /usr/local/index/

'''[-threads n]:''' ''<need description>''

'''[-depth i]:''' How deep Nutch should crawl. If you don't give a value, Nutch uses 3 as its default. For example, with -depth 1 Nutch indexes only the first level; it follows links only with -depth 2 or more.

'''[-topN]:''' ''<need description>''

DevelopmentCommandLineOptions
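As an illustration of the usage above, here is a minimal sketch of a crawl invocation. It uses only the options described on this page (the `crawl` alias, `<urlDir>`, `-dir` and `-depth`). The variable names `NUTCH_HOME`, `URL_DIR` and `CRAWL_DIR`, the file name `seed.txt`, the one-URL-per-line layout and the seed URL are assumptions made for the example, not part of Nutch itself.

```shell
# Hypothetical locations; adjust them to your own installation.
NUTCH_HOME="${NUTCH_HOME:-nutch-0.8-dev}"
URL_DIR="${URL_DIR:-urls}"
CRAWL_DIR="${CRAWL_DIR:-/usr/local/index/}"

# <urlDir> must already exist and contain text files with URL lists.
# The file name and the seed URL below are placeholders (assumptions).
mkdir -p "$URL_DIR"
printf '%s\n' 'http://example.com/' > "$URL_DIR/seed.txt"

# Build the command: crawl the seed list, store the result under CRAWL_DIR,
# and follow links one level beyond the start pages (-depth 2).
CRAWL_CMD="$NUTCH_HOME/bin/nutch crawl $URL_DIR -dir $CRAWL_DIR -depth 2"
echo "$CRAWL_CMD"

# Run the crawl only if the nutch script is actually present.
# Note: the unquoted expansion assumes the paths contain no spaces.
if [ -x "$NUTCH_HOME/bin/nutch" ]; then
  $CRAWL_CMD
fi
```

With the defaults above, the script prints the command it would run. Using -depth 1 instead would limit the crawl to the pages listed in the seed file.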
