Dear Wiki user, You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change notification.
The "bin/nutch_generate" page has been changed by LewisJohnMcgibbney:
http://wiki.apache.org/nutch/bin/nutch_generate?action=diff&rev1=10&rev2=11

  This class generates a subset of a crawl db to fetch. This version allows us to generate fetchlists for several segments in one go. Unlike in the initial version (FetchListTool), IP resolution is done ONLY on the entries which have been selected for fetching. The URLs are partitioned by IP, domain or host within a segment. We can choose separately how to count the URLs, i.e. by domain or host, to limit the entries.

  {{{
- Usage: bin/nutch org.apache.nutch.crawl.Generator <crawldb> <segments_dir> [-force] [-topN N] [-numFetchers numFetchers] [-adddays numDays] [-noFilter] [-noNorm][-maxNumSegments num]
+ Usage: bin/nutch generate <crawldb> <segments_dir> [-force] [-topN N] [-numFetchers numFetchers] [-adddays numDays] [-noFilter] [-noNorm][-maxNumSegments num]
  }}}

  '''<crawldb>''': Path to the location of our crawldb directory.
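
As a rough illustration of the short `bin/nutch generate` form from the usage line, the sketch below assembles one possible invocation and prints it rather than running it. The paths `crawl/crawldb` and `crawl/segments` and the `-topN` value are hypothetical placeholders, not values from the wiki page; only options listed in the usage line are used.

```shell
#!/bin/sh
# Dry-run sketch: build a generate command from options in the usage line.
# CRAWLDB, SEGMENTS_DIR and TOP_N are hypothetical example values.
CRAWLDB="crawl/crawldb"
SEGMENTS_DIR="crawl/segments"
TOP_N=1000

# Short form from the changed usage line (replaces the explicit
# org.apache.nutch.crawl.Generator class name).
CMD="bin/nutch generate $CRAWLDB $SEGMENTS_DIR -topN $TOP_N"

# Print instead of executing, so the sketch runs without a Nutch install.
echo "$CMD"
```

To actually run the command, replace the final `echo "$CMD"` with `$CMD` from the root of a Nutch installation, after pointing the two paths at a real crawldb and segments directory.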

