Recover the deprecated mapred.tasktracker.tasks.maximum
-------------------------------------------------------

                 Key: HADOOP-3420
                 URL: https://issues.apache.org/jira/browse/HADOOP-3420
             Project: Hadoop Core
          Issue Type: Improvement
          Components: conf
    Affects Versions: 0.16.4, 0.16.3, 0.16.2, 0.16.1, 0.16.0
            Reporter: Iván de Prado


https://issues.apache.org/jira/browse/HADOOP-1274 replaced the configuration 
attribute mapred.tasktracker.tasks.maximum with 
mapred.tasktracker.map.tasks.maximum and 
mapred.tasktracker.reduce.tasks.maximum because it sometimes makes sense to have 
more mappers than reducers assigned to each node.

But deprecating mapred.tasktracker.tasks.maximum can be a problem in some 
situations. For example, when more than one job is running, the map and reduce 
tasks together can consume too many resources. To avoid such cases, an upper 
limit on the total number of tasks is needed. So I propose keeping the 
configuration parameter mapred.tasktracker.tasks.maximum as a limit on the 
total number of tasks per node. It is compatible with 
mapred.tasktracker.map.tasks.maximum and 
mapred.tasktracker.reduce.tasks.maximum.

As an example:

I have a 4-node cluster with 8 cores and 4 GB of memory per node. I want to 
limit the number of tasks per node to 8, since 8 tasks per node would use 
almost 100% of the CPU and the full 4 GB of memory. I have set:
  mapred.tasktracker.map.tasks.maximum -> 8
  mapred.tasktracker.reduce.tasks.maximum -> 8 

1) When running only one job at a time, it works smoothly: an average of 8 
tasks per node, no swapping, almost 4 GB of memory in use and 100% CPU usage.

2) When running more than one job at the same time, it works really badly: an 
average of 16 tasks per node, 8 GB of memory in use (4 GB swapped), and a lot 
of system CPU usage.
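
The numbers in the two cases follow from simple arithmetic. Here is a minimal 
sketch of it, assuming each task uses roughly 0.5 GB (inferred from case 1, 
where 8 tasks fill about 4 GB). The function and constant names are 
illustrative only and are not part of Hadoop.

```python
# Illustrative arithmetic only. The per-task memory figure is an assumption
# inferred from case 1 (8 tasks filling about 4 GB); it was not measured.
NODE_MEMORY_GB = 4.0
MAP_SLOTS = 8      # mapred.tasktracker.map.tasks.maximum
REDUCE_SLOTS = 8   # mapred.tasktracker.reduce.tasks.maximum
MEMORY_PER_TASK_GB = NODE_MEMORY_GB / 8


def worst_case_load(map_slots, reduce_slots):
    """Return (tasks, memory_gb, swapped_gb) when every slot is occupied."""
    tasks = map_slots + reduce_slots
    memory_gb = tasks * MEMORY_PER_TASK_GB
    # Anything beyond physical memory has to be swapped out.
    swapped_gb = max(0.0, memory_gb - NODE_MEMORY_GB)
    return tasks, memory_gb, swapped_gb


print(worst_case_load(MAP_SLOTS, REDUCE_SLOTS))  # (16, 8.0, 4.0)
```

With only the two per-type limits, nothing prevents both sets of slots from 
filling up at once, which is what happens in case 2.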

So I think it makes sense to restore the old attribute 
mapred.tasktracker.tasks.maximum, making it compatible with the new ones.

Task trackers would not be allowed to:
 - run more than mapred.tasktracker.tasks.maximum tasks per node,
 - run more than mapred.tasktracker.map.tasks.maximum mappers per node, 
 - run more than mapred.tasktracker.reduce.tasks.maximum reducers per node. 
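
The three rules above could be combined into a single admission check. The 
following is only a sketch of the idea, not the TaskTracker implementation: 
SlotLimits and can_start are hypothetical names, and the real change would 
have to be made in Hadoop's Java code.

```python
from dataclasses import dataclass


@dataclass
class SlotLimits:
    """Hypothetical holder for the three per-node limits."""
    total_max: int    # proposed mapred.tasktracker.tasks.maximum
    map_max: int      # mapred.tasktracker.map.tasks.maximum
    reduce_max: int   # mapred.tasktracker.reduce.tasks.maximum


def can_start(kind, running_maps, running_reduces, limits):
    """Return True if a new task of the given kind fits under all limits."""
    # Rule 1: the total number of tasks on the node is capped.
    if running_maps + running_reduces >= limits.total_max:
        return False
    # Rules 2 and 3: the per-type caps still apply.
    if kind == "map":
        return running_maps < limits.map_max
    if kind == "reduce":
        return running_reduces < limits.reduce_max
    raise ValueError("kind must be 'map' or 'reduce'")


limits = SlotLimits(total_max=8, map_max=8, reduce_max=8)
print(can_start("reduce", 5, 3, limits))  # False: 8 tasks already running
print(can_start("map", 4, 3, limits))     # True: 7 running, under all limits
```

With total_max set to 8, the node in the example would never exceed 8 tasks, 
no matter how many jobs are running or how maps and reduces are mixed.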
