Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change 
notification.

The "bin/nutch_generate" page has been changed by LewisJohnMcgibbney:
http://wiki.apache.org/nutch/bin/nutch_generate?action=diff&rev1=8&rev2=9

  
  This class generates a subset of a crawl db to fetch. This version allows us 
to generate fetchlists for several segments in one go. Unlike in the initial 
version (FetchListTool), the IP resolution is done ONLY on the entries which 
have been selected for fetching. The URLs are partitioned by IP, domain or host 
within a segment. We can choose separately how to count the URLs, i.e. by domain 
or host, to limit the entries.
  
+ {{{
  Usage: bin/nutch org.apache.nutch.crawl.Generator <crawldb> <segments_dir> 
[-force] [-topN N] [-numFetchers numFetchers] [-adddays numDays] [-noFilter] 
[-noNorm] [-maxNumSegments num]
+ }}}
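
  As an illustration of the usage line above, here is a minimal sketch of one possible invocation. The paths crawl/crawldb and crawl/segments and the values 1000 and 2 are hypothetical placeholders, not defaults taken from this page; only the class name and the -topN and -numFetchers options come from the usage line. The sketch prints the command and runs it only if bin/nutch is present in the current directory.

```shell
# Hypothetical locations (assumptions); adjust to your own crawl layout.
CRAWLDB="crawl/crawldb"
SEGMENTS_DIR="crawl/segments"

# Compose the command from the usage line: <crawldb> <segments_dir> plus
# two of the optional flags. The values 1000 and 2 are example values only.
GENERATE_CMD="bin/nutch org.apache.nutch.crawl.Generator $CRAWLDB $SEGMENTS_DIR -topN 1000 -numFetchers 2"

# Show what would be run.
echo "$GENERATE_CMD"

# Run the command only when a Nutch launcher script is actually available.
if [ -x bin/nutch ]; then
  $GENERATE_CMD
fi
```

  Printing the command first makes it easy to check the argument order against the usage line before anything is written to the segments directory.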
  
  '''<crawldb>''': Path to the location of our crawldb directory.
  
