Dear Wiki user, You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change notification.
The following page has been changed by JeffRitchie:
http://wiki.apache.org/nutch/nutch-0%2e8-dev/bin/nutch_crawl

------------------------------------------------------------------------------
  == Perform complete crawling and indexing given a set of root urls. ==

- '''Configuration Files Used:'''
+ === Usage ===
+ nutch-0.8-dev/bin/nutch org.apache.nutch.crawl.Crawl <urlDir> [-dir d] [-threads n] [-depth i] [-topN]
+
+ '''<urlDir>:''' contains text files with URL lists. This must be an existing directory. Default Value: ''None''[[BR]]
+ '''[-dir <d>]:''' The directory where Nutch will save the crawl files. Default Value: ''./crawl-[date]'' where [date] is the current date.[[BR]]
+ '''[-threads <n>]:''' Number of Fetcher Threads to use. Overrides the configuration key ''fetcher.threads.fetch''. Default Value: ''10''[[BR]]
+ '''[-depth <i>]:''' Number of iterations Nutch should crawl. Default Value: ''5''[[BR]]
+ '''[-topN <num>]:''' Limit crawls to the top <num> links per iteration. Default Value: ''Integer.MAX_VALUE''[[BR]]
+
+ === Configuration Files ===
  hadoop-default.xml[[BR]]
  hadoop-site.xml[[BR]]
  crawl-tool.xml[[BR]]

- '''Usage:''' nutch-0.8-dev/bin/nutch org.apache.nutch.crawl.Crawl <urlDir> [-dir d] [-threads n] [-depth i] [-topN]
+ === Other Files ===
+ crawl-urlfilter.txt
+ === Caveats and Notes ===
+ None.
- '''<urlDir>:''' contains text files with URL lists. This must be an existing directory. Default Value: ''None''
-
- '''[-dir <d>]:''' The directory where Nutch will save the crawl files. Default Value: ''./crawl-[date]'' where [date] is the current date.
-
- '''[-threads <n>]:''' Number of Fetcher Threads to use. Overrides the configuration key ''fetcher.threads.fetch''. Default Value: ''10''
-
- '''[-depth <i>]:''' Number of iterations Nutch should crawl. Default Value: ''5''
-
- '''[-topN <num>]:''' Limit crawls to the top <num> links per iteration. Default Value: ''Integer.MAX_VALUE''

  DevelopmentCommandLineOptions
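
As an illustration of the usage described in the changed page, the following is a minimal sketch of how the documented options could be combined into a command line. The helper name `build_crawl_command` and the example values are hypothetical; the launcher path, class name, and flags are taken from the usage line. It also assumes that `-topN` is followed by a number, as the option description states, although the usage line itself shows `[-topN]` without a value. Options that are left unset are omitted so that the documented defaults would apply.

```python
from typing import List, Optional

# Launcher path and class name are copied from the usage line on the wiki page.
NUTCH_LAUNCHER = "nutch-0.8-dev/bin/nutch"
CRAWL_CLASS = "org.apache.nutch.crawl.Crawl"


def build_crawl_command(
    url_dir: str,
    crawl_dir: Optional[str] = None,
    threads: Optional[int] = None,
    depth: Optional[int] = None,
    top_n: Optional[int] = None,
) -> List[str]:
    """Hypothetical helper: assemble the argument list for one crawl run.

    Options left as None are omitted, so the defaults listed on the wiki
    page (./crawl-[date], 10 threads, depth 5, Integer.MAX_VALUE) apply.
    """
    if not url_dir:
        # <urlDir> is the only required argument and has no default.
        raise ValueError("urlDir is required and has no default")

    cmd = [NUTCH_LAUNCHER, CRAWL_CLASS, url_dir]
    # Assumption: every option is followed by its value, as in the option list.
    options = (
        ("-dir", crawl_dir),
        ("-threads", threads),
        ("-depth", depth),
        ("-topN", top_n),
    )
    for flag, value in options:
        if value is not None:
            cmd += [flag, str(value)]
    return cmd


if __name__ == "__main__":
    # Illustrative values only: 3 iterations, at most 1000 links per iteration.
    print(" ".join(build_crawl_command("urls", crawl_dir="crawl", depth=3, top_n=1000)))
```

The function only builds the argument list and does not start Nutch; the list could be passed to `subprocess.run` on a machine where the launcher script exists at that relative path.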