Dear Wiki user, You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change notification.
The following page has been changed by JeffRitchie:
http://wiki.apache.org/nutch/nutch-0%2e8-dev/bin/nutch_crawl

------------------------------------------------------------------------------
  == Perform complete crawling and indexing given a set of root urls. ==

+ '''Configuration Files Used:'''
+  hadoop-default.xml[[BR]]
+  hadoop-site.xml[[BR]]
+  crawl-tool.xml[[BR]]
+ '''Usage:''' nutch-0.8-dev/bin/nutch org.apache.nutch.crawl.Crawl <urlDir> [-dir d] [-threads n] [-depth i] [-topN]

- '''<urlDir>:''' contains text files with URL lists. This must be an existing directory.
+ '''<urlDir>:''' contains text files with URL lists. This must be an existing directory. Default Value: ''None''

+ '''[-dir <d>]:''' The directory where Nutch will save the crawl files. Default Value: ''./crawl-[date]'' where [date] is the current date.
- '''[-dir d]:''' You can choose the directory, where Nutch should save the index.
- If you don't choose a directory Nutch would create a own directory in the directory where you started the crawl.
- Example of a -dir parameter: -dir /usr/local/index/

- '''[-threads n]:''' ''<need description>''
+ '''[-threads <n>]:''' Number of Fetcher Threads to use. Overrides the configuration key ''fetcher.threads.fetch''. Default Value: ''10''

+ '''[-depth <i>]:''' Number of iterations Nutch should crawl. Default Value: ''5''
- '''[-depth i]:''' You can tell Nutch how deep it should crawl. If you don't tell Nutch a value, it takes 3 as his standard parameter.
- For example if you say -depth 1, Nutch would only index the first level. Only if you say -depth 2 (or more) Nutch would make a link follow.

- '''[-topN]:''' ''<need description>''
+ '''[-topN <num>]:''' Limit crawls to the top <num> links per iteration. Default Value: ''Integer.MAX_VALUE''

  DevelopmentCommandLineOptions

_______________________________________________
Nutch-cvs mailing list
Nutch-cvs@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nutch-cvs
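
As a rough illustration of the option descriptions above, the following sketch assembles a crawl command line with the documented defaults for -threads and -depth spelled out. The `urls` directory, the `crawl-out` output path, and the -topN value of 1000 are made-up placeholders, not values from the wiki page, and the script only prints the command instead of running Nutch.

```shell
#!/bin/sh
# Dry-run sketch: builds the crawl command from the options listed on the
# wiki page and prints it. Nothing is run against a real Nutch install.

URL_DIR="urls"          # hypothetical: existing directory holding URL list files
CRAWL_DIR="crawl-out"   # hypothetical: directory where crawl files would be saved
THREADS=10              # documented default for -threads
DEPTH=5                 # documented default for -depth
TOP_N=1000              # hypothetical cap; the documented default is Integer.MAX_VALUE

CMD="nutch-0.8-dev/bin/nutch org.apache.nutch.crawl.Crawl $URL_DIR -dir $CRAWL_DIR -threads $THREADS -depth $DEPTH -topN $TOP_N"

# Print the assembled command so it can be reviewed before being run by hand.
echo "$CMD"
```

Leaving out -dir, -threads, -depth, or -topN should fall back to the defaults listed in the page text, so the shortest form would pass only the URL directory.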