On Jun 6, 2012, at 03:42 , Harsh J wrote:

>> I think mapred.tasktracker.map.tasks.maximum sets the number of map
>> tasks and not slots.
> 
> This is incorrect. The property does configure slots. Please also see
> http://wiki.apache.org/hadoop/HowManyMapsAndReduces and
> http://wiki.apache.org/hadoop/FAQ#I_see_a_maximum_of_2_maps.2BAC8-reduces_spawned_concurrently_on_each_TaskTracker.2C_how_do_I_increase_that.3F
> for more.


But Harsh, wouldn't you agree that the first reference you provided above is 
talking about the number of tasks spawned for a given job at job-runtime and 
not the number of slots hard-configured into the cluster at cluster-spinup time?

Incidentally, the second reference above is partially broken.  It attempts to 
offer links to dig into further detail about 
mapred.tasktracker.map.tasks.maximum and 
mapred.tasktracker.reduce.tasks.maximum, but the links are broken.  For 
example, one of the two broken links is:

http://hadoop.apache.org/common/docs/current/hadoop-default.html#mapred.tasktracker.map.tasks.maximum

It's still broken even if you remove the anchor from the end of the URL, which 
is to say the hadoop-default.html webpage doesn't even exist.

In fact, it is difficult to find any official documentation on those properties 
(Google searches for the terms do not provide links to any proper documentation 
within apache, but rather just lots of back and forth forum discussions about 
the properties).  One thing I did find was a claim that those properties are 
deprecated in 2.0.0:

http://hadoop.apache.org/common/docs/current/hadoop-project-dist/hadoop-common/DeprecatedProperties.html

That page indicates that they were replaced with equivalents in which the first 
component is now 'mapreduce', not 'mapred'.  Even with the new names, however, 
Google still doesn't link to any formal documentation describing those 
properties.  In fact, I have yet to find a webpage anywhere which officially 
states the purpose/effect of mapred(uce).tasktracker.map.tasks.maximum.

That said, I agree that the consensus of discussion and description seems to 
imply that these properties have a cluster-level (not job-level) effect on the 
number of map/reduce slots on the cluster, not the number of tasks spawned for 
a given job.  That obviously complicates the intuition that slots 
correspond to cores, as I suggested in an earlier post, and I apologize for that.
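
To make the slot-versus-task distinction concrete, here is a rough sketch in 
Python.  It is not Hadoop code: the function names and the example numbers are 
made up for illustration, and it assumes the simplified model discussed in this 
thread, namely that each TaskTracker contributes 
mapred.tasktracker.map.tasks.maximum map slots (commonly reported as defaulting 
to 2) and that a job with more map tasks than slots runs them in successive 
waves.

```python
import math


def cluster_map_slots(num_tasktrackers, map_tasks_maximum=2):
    """Total map slots in the cluster (hypothetical helper).

    Assumes every TaskTracker uses the same value of
    mapred.tasktracker.map.tasks.maximum; the default of 2 here is an
    assumption for illustration.
    """
    return num_tasktrackers * map_tasks_maximum


def map_waves(num_map_tasks, total_slots):
    """Number of waves needed to run a job's map tasks (hypothetical helper).

    The task count belongs to the job; the slot count belongs to the
    cluster. Only total_slots tasks can run at the same time.
    """
    if total_slots <= 0:
        raise ValueError("cluster must have at least one map slot")
    return math.ceil(num_map_tasks / total_slots)


if __name__ == "__main__":
    # Example numbers only: 10 TaskTrackers with 4 map slots each,
    # and a job whose input splits into 100 map tasks.
    slots = cluster_map_slots(num_tasktrackers=10, map_tasks_maximum=4)
    waves = map_waves(num_map_tasks=100, total_slots=slots)
    print(f"cluster map slots: {slots}, waves for 100 map tasks: {waves}")
```

In this model, changing the property changes how many tasks can run at once on 
the cluster, while the number of map tasks for a given job stays whatever the 
job's input dictates.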
 

________________________________________________________________________________
Keith Wiley     kwi...@keithwiley.com     keithwiley.com    music.keithwiley.com

"Yet mark his perfect self-contentment, and hence learn his lesson, that to be
self-contented is to be vile and ignorant, and that to aspire is better than to
be blindly and impotently happy."
                                           --  Edwin A. Abbott, Flatland
________________________________________________________________________________
