[
https://issues.apache.org/jira/browse/MAPREDUCE-5583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jason Lowe updated MAPREDUCE-5583:
----------------------------------
Attachment: MAPREDUCE-5583v1.patch
Had an offline discussion about this with Arun, and he suggested using the ANY
ask (i.e., host="*") to act as a limit on the request. YARN only schedules
containers for an application as long as the ANY ask is non-zero, so sending a
request for 100 hosts and 10 racks but an ANY ask of 1 will only return 1
container. If the AM carefully modulates the ANY ask then it can self-limit
without needing to give up telling the RM about all of its locality desires.
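
To illustrate the arithmetic, here is a minimal sketch of how an AM might compute the ANY ask it sends on each heartbeat. It is not the code in the attached patch: the class name AnyAskLimiter, the method limitedAnyAsk, and its parameters are hypothetical, and the sketch assumes the AM already tracks how many tasks are pending and how many are running.

```java
public final class AnyAskLimiter {
    private AnyAskLimiter() {}

    /**
     * Hypothetical helper: computes the ANY (host="*") ask for one task type.
     *
     * @param pending tasks still waiting for a container
     * @param running tasks that currently hold a container
     * @param limit   maximum tasks allowed to run concurrently;
     *                0 or less is treated as "no limit" (assumption)
     * @return number of containers to request at the ANY level
     */
    public static int limitedAnyAsk(int pending, int running, int limit) {
        if (limit <= 0) {
            // No limit configured: ask for everything that is pending.
            return pending;
        }
        // Never ask for more than the remaining headroom under the limit.
        int headroom = Math.max(0, limit - running);
        return Math.min(pending, headroom);
    }
}
```

With 100 pending tasks, 3 running, and a limit of 10, the ANY ask would be 7, while the host and rack asks can still list every desired location.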
Attaching a patch that implements this approach. It needs unit tests, but I've
manually tested it and maps and reduces are being limited accordingly. The
mapreduce.job.running.maps.limit and mapreduce.job.running.reduces.limit
properties control it: 0 (the default) means no limit; any other value
specifies the number of maps or reduces, respectively, that will be allowed to
run concurrently.
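
As a rough illustration of those semantics, the sketch below reads the two properties from a plain java.util.Properties object rather than from a Hadoop Configuration, so that it stays self-contained. The class RunningLimitConfig and its methods are hypothetical, and treating negative values as "no limit" is an assumption, not something the patch is known to do.

```java
import java.util.Properties;

public final class RunningLimitConfig {
    // Property names as given in this comment.
    public static final String MAPS_LIMIT = "mapreduce.job.running.maps.limit";
    public static final String REDUCES_LIMIT = "mapreduce.job.running.reduces.limit";

    private RunningLimitConfig() {}

    /** Reads a limit; a missing property falls back to 0, meaning no limit. */
    public static int readLimit(Properties props, String key) {
        int value = Integer.parseInt(props.getProperty(key, "0").trim());
        // Assumption: negative values are clamped to 0 (no limit).
        return Math.max(0, value);
    }

    /** A limit only applies when it is strictly positive. */
    public static boolean isLimited(int limit) {
        return limit > 0;
    }
}
```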
Feedback appreciated.
> Ability to limit running map and reduce tasks
> ---------------------------------------------
>
> Key: MAPREDUCE-5583
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5583
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: mr-am, mrv2
> Affects Versions: 0.23.9, 2.1.1-beta
> Reporter: Jason Lowe
> Attachments: MAPREDUCE-5583v1.patch
>
>
> It would be nice if users could specify a limit to the number of map or
> reduce tasks that are running simultaneously. Occasionally users are
> performing operations in tasks that can lead to DDoS scenarios if too many
> tasks run simultaneously (e.g., accessing a database, web service, etc.).
> Having the ability to throttle the number of tasks simultaneously running
> would provide users a way to mitigate issues with too many tasks on a large
> cluster attempting to access a service at any one time.
> This is similar to the functionality requested by MAPREDUCE-224 and
> implemented by HADOOP-3412, which was dropped in mrv2.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)