Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Lucene-hadoop Wiki" for 
change notification.

The following page has been changed by LohitVijayarenu:
http://wiki.apache.org/lucene-hadoop/HowManyMapsAndReduces

------------------------------------------------------------------------------
  
  == Number of Reduces ==
  
- The right number of reduces seems 0.95 or 1.75 * (nodes * 
mapred.tasktracker.tasks.maximum). At 0.95 all of the reduces can launch 
immediately and start transfering map outputs as the maps finish. At 1.75 the 
faster nodes will finish their first round of reduces and launch a second round 
of reduces doing a much better job of load balancing.
+ The right number of reduces seems to be 0.95 or 1.75 * (nodes * 
mapred.tasktracker.tasks.maximum). At 0.95, all of the reduces can launch 
immediately and start transferring map outputs as the maps finish. At 1.75, the 
faster nodes will finish their first round of reduces and launch a second 
round, doing a much better job of load balancing.
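
As a rough illustration (not part of the wiki page itself), the rule of 
thumb above can be sketched as below. The cluster size and 
mapred.tasktracker.tasks.maximum value are hypothetical example numbers, 
not Hadoop defaults:

```java
// Sketch of the reduce-count rule of thumb. The node count and per-node
// task maximum below are hypothetical example values.
public class ReduceCount {

    // factor is typically 0.95 (all reduces launch in one wave)
    // or 1.75 (faster nodes run a second wave for better load balancing).
    public static int suggestedReduces(int nodes, int tasksMaximum, double factor) {
        return (int) (factor * nodes * tasksMaximum);
    }

    public static void main(String[] args) {
        int nodes = 20;       // hypothetical cluster size
        int tasksMaximum = 2; // hypothetical mapred.tasktracker.tasks.maximum
        System.out.println(suggestedReduces(nodes, tasksMaximum, 0.95)); // 38
        System.out.println(suggestedReduces(nodes, tasksMaximum, 1.75)); // 70
    }
}
```

With 1.75, the job runs more reduces than the cluster has slots, so the 
second wave naturally lands on whichever nodes free up first.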
  
  Currently the number of reduces is limited to roughly 1000 by the buffer size 
for the output files (io.buffer.size * 2 * numReduces << heapSize). This will 
be fixed at some point, but until it is, it provides a pretty firm upper bound.
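
As a back-of-the-envelope check of that bound (again not part of the wiki 
page), the buffer size and heap size below are hypothetical example values; 
the factor of 10 stands in for the "much less than" margin:

```java
// Sketch of the buffer-memory bound io.buffer.size * 2 * numReduces << heapSize,
// using hypothetical example values.
public class ReduceBufferBound {
    public static void main(String[] args) {
        long ioBufferSize = 4 * 1024;       // hypothetical io.buffer.size, 4 KB
        long heapSize = 200L * 1024 * 1024; // hypothetical task heap, 200 MB
        // Each reduce's output file needs roughly io.buffer.size * 2 bytes of
        // buffer, and the total must stay well below the heap; a 10x margin
        // stands in for "<<":
        long maxReduces = heapSize / (ioBufferSize * 2 * 10);
        System.out.println(maxReduces); // 2560
    }
}
```

With numbers in this range, the bound works out to a few thousand reduces 
at most, which is the same order of magnitude as the "roughly 1000" figure 
above.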
  
