Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change 
notification.

The "bin/nutch_generate" page has been changed by LewisJohnMcgibbney:
http://wiki.apache.org/nutch/bin/nutch_generate?action=diff&rev1=12&rev2=13

  
  '''[-topN N]''': Where N is the number of top URLs to be selected. Normally, 
the "generate" command prepares a fetchlist out of all unfetched pages and the 
ones whose fetch interval has already expired. But if you use -topN, then 
instead of all such urls you only get the N urls with the highest score - 
potentially the most interesting ones, which should be prioritized in fetching.
  
- '''[-numFetchers numFetchers]''': The number of fetch partitions. Default: 
Configuration key -> mapred.map.tasks -> 1
+ '''[-numFetchers numFetchers]''': The number of fetch partitions. Default: 
Configuration key -> mapred.map.tasks -> 1 (in local mode), possibly multiple 
in deploy/distributed mode.
  
  '''[-adddays numDays]''': Adds numDays to the current time to facilitate 
crawling urls already fetched sooner than db.default.fetch.interval. Default: 0
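
As a rough illustration of how -topN and -adddays interact, here is a small 
Python sketch of the selection logic described above. It is not Nutch's actual 
implementation: the Entry class, the select_fetchlist function and the 
next_fetch_time field are hypothetical names made up for this example, and it 
assumes an unfetched page is simply one that is already due.

```python
from dataclasses import dataclass
from typing import List, Optional

DAY_SECONDS = 24 * 60 * 60


@dataclass
class Entry:
    """Hypothetical stand-in for one crawl db record."""
    url: str
    score: float
    next_fetch_time: float  # epoch seconds at which the page becomes due


def select_fetchlist(entries: List[Entry], now: float,
                     top_n: Optional[int] = None,
                     add_days: int = 0) -> List[Entry]:
    """Return the due entries, best score first, cut to top_n if given."""
    # -adddays: behave as if the current time were add_days later, so pages
    # whose fetch interval has not expired yet can still be selected.
    effective_now = now + add_days * DAY_SECONDS
    due = [e for e in entries if e.next_fetch_time <= effective_now]
    # -topN: keep only the N highest-scoring due entries.
    due.sort(key=lambda e: e.score, reverse=True)
    return due if top_n is None else due[:top_n]


entries = [
    Entry("http://example.org/a", 0.5, 0),
    Entry("http://example.org/b", 2.0, 0),
    Entry("http://example.org/c", 1.0, 0),
    Entry("http://example.org/d", 9.0, 5 * DAY_SECONDS),  # due in 5 days
]
print([e.url for e in select_fetchlist(entries, now=0, top_n=2)])
print([e.url for e in select_fetchlist(entries, now=0, top_n=2, add_days=5)])
```

With add_days=0 the fourth entry is not yet due and is skipped regardless of 
its score; with add_days=5 it becomes eligible and ranks first.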
  
