[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13795228#comment-13795228
 ] 

Jason Lowe commented on MAPREDUCE-5583:
---------------------------------------

Not in a general way.  Different jobs can need different limits, and queue/user 
limits are too coarse a tool to handle that appropriately.  Either we create a 
ton of queues for the various scenarios, which is a huge pain from the 
usability and admin point of view, or we artificially constrict jobs that 
don't need those limits but happen to run in a queue that was shrunk for other 
jobs.  For example, take the case where we need to increase the memory for map 
tasks.  If we took the use-the-queue-as-the-limit route, we would now have 
fewer tasks running simultaneously than we did before, which is undesirable, 
and the queue would need to be changed each time the job grows or shrinks.  If 
we could limit it per-job in the AM, the job would run with the appropriate 
parallelism, assuming the original queue had the capacity.
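
To put rough numbers on the map-memory example above, here is a minimal 
sketch of the arithmetic.  Every figure in it is hypothetical (queue sizes, 
task sizes, and the limit of 50), max_concurrent_tasks is an illustrative 
helper and not a Hadoop API, and it ignores the AM container and scheduler 
rounding:

```python
def max_concurrent_tasks(queue_memory_mb, task_memory_mb, per_job_limit=None):
    """Tasks that fit in the queue's memory, optionally capped per job."""
    fits = queue_memory_mb // task_memory_mb
    return fits if per_job_limit is None else min(fits, per_job_limit)


GB = 1024  # MB per GB

# Queue-as-the-limit: a queue shrunk to 50 GB so that only 50 maps of
# 1 GB each can run at once (hypothetical sizes).
queue_as_limit_before = max_concurrent_tasks(50 * GB, 1 * GB)
# Doubling the map memory halves the parallelism unless an admin
# resizes the queue.
queue_as_limit_after = max_concurrent_tasks(50 * GB, 2 * GB)

# Per-job limit: the job stays in its original 200 GB queue and is
# capped at 50 running maps by a (hypothetical) per-job setting.
per_job_before = max_concurrent_tasks(200 * GB, 1 * GB, per_job_limit=50)
per_job_after = max_concurrent_tasks(200 * GB, 2 * GB, per_job_limit=50)

print(queue_as_limit_before, queue_as_limit_after)  # 50 25
print(per_job_before, per_job_after)                # 50 50
```

With the queue as the limit, the memory bump drops the job from 50 to 25 
concurrent maps.  With a per-job cap it stays at 50, because the original 
queue still has room for them.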

Having per-job limits allows the user to tune their jobs in a much more 
intuitive way and without requiring admins to assist in that tuning.

> Ability to limit running map and reduce tasks
> ---------------------------------------------
>
>                 Key: MAPREDUCE-5583
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5583
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: mr-am, mrv2
>    Affects Versions: 0.23.9, 2.1.1-beta
>            Reporter: Jason Lowe
>
> It would be nice if users could specify a limit to the number of map or 
> reduce tasks that are running simultaneously.  Occasionally users are 
> performing operations in tasks that can lead to DDoS scenarios if too many 
> tasks run simultaneously (e.g.: accessing a database, web service, etc.).  
> Having the ability to throttle the number of tasks simultaneously running 
> would provide users a way to mitigate issues with too many tasks on a large 
> cluster attempting to access a service at any one time.
> This is similar to the functionality requested by MAPREDUCE-224 and 
> implemented by HADOOP-3412 but was dropped in mrv2.



--
This message was sent by Atlassian JIRA
(v6.1#6144)