[
https://issues.apache.org/jira/browse/HADOOP-5170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714371#action_12714371
]
Devaraj Das commented on HADOOP-5170:
-------------------------------------
I have a minor nit - the code this patch adds to
JobInProgress.findNewMapTask/findNewReduceTask is nearly the same in both
methods and could probably be factored out into a separate method with the
appropriate args.
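A minimal sketch of the kind of shared helper this suggests is below; the
method name, its parameters, and the -1-means-no-limit convention are
assumptions for illustration only, not code from the attached patches.

{code:java}
// Hypothetical helper that findNewMapTask and findNewReduceTask could both
// call instead of duplicating the limit check. Names are illustrative only.
public class TaskLimitCheckSketch {

  /**
   * Returns true when scheduling one more task of this job on the given
   * tracker would exceed either the per-node or the cluster-wide limit.
   * A limit of -1 means "no limit".
   */
  static boolean exceedsTaskLimits(int runningOnNode, int maxPerNode,
                                   int runningInCluster, int maxPerCluster) {
    if (maxPerNode != -1 && runningOnNode >= maxPerNode) {
      return true;
    }
    if (maxPerCluster != -1 && runningInCluster >= maxPerCluster) {
      return true;
    }
    return false;
  }

  // Tiny usage check: a node already at a per-node cap of 2 is rejected,
  // while an uncapped (-1) job is accepted.
  public static void main(String[] args) {
    System.out.println(exceedsTaskLimits(2, 2, 10, -1));  // true
    System.out.println(exceedsTaskLimits(1, -1, 10, -1)); // false
  }
}
{code}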
Other than that, in the testcase there are big waits (and the testcase takes
~3 minutes to run). Do they need to be that long? Also, in general, we should
move to the model of spoofing heartbeats (and faking other objects) in such
testcases, but I won't hold this patch up for that (unless there is
enthusiasm to modify the test in that direction).
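To make the heartbeat-spoofing suggestion concrete, here is a rough sketch of
that test style; FakeTrackerStatus and wouldAssignReduce() are made-up
stand-ins rather than real Hadoop classes, and the point is only that the
test drives the scheduling decision directly instead of sleeping while real
heartbeats arrive.

{code:java}
// Sketch only: fabricate the status a tracker would report in a heartbeat
// and assert on the scheduling decision immediately, with no waits.
public class SpoofedHeartbeatTestSketch {

  /** Made-up stand-in for the status a TaskTracker would report. */
  static class FakeTrackerStatus {
    final String host;
    final int runningReduces;
    FakeTrackerStatus(String host, int runningReduces) {
      this.host = host;
      this.runningReduces = runningReduces;
    }
  }

  /** Made-up stand-in for the per-node limit check the scheduler applies. */
  static boolean wouldAssignReduce(FakeTrackerStatus status, int maxReducesPerNode) {
    return maxReducesPerNode == -1 || status.runningReduces < maxReducesPerNode;
  }

  public static void main(String[] args) {
    // No mini-cluster, no sleeps: two fabricated trackers, one already busy.
    FakeTrackerStatus busyNode = new FakeTrackerStatus("node1", 1);
    FakeTrackerStatus idleNode = new FakeTrackerStatus("node2", 0);
    int limit = 1; // one reducer per node for this job

    if (wouldAssignReduce(busyNode, limit)) {
      throw new AssertionError("node1 is already at the per-node limit");
    }
    if (!wouldAssignReduce(idleNode, limit)) {
      throw new AssertionError("node2 has headroom and should get the reduce");
    }
    System.out.println("spoofed-heartbeat style checks passed");
  }
}
{code}

The same shape extends to faking the other objects the scheduler consults, so
the test asserts on decisions rather than on wall-clock timing.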
> Set max map/reduce tasks on a per-job basis, either per-node or cluster-wide
> ----------------------------------------------------------------------------
>
> Key: HADOOP-5170
> URL: https://issues.apache.org/jira/browse/HADOOP-5170
> Project: Hadoop Core
> Issue Type: New Feature
> Components: mapred
> Reporter: Jonathan Gray
> Assignee: Matei Zaharia
> Attachments: HADOOP-5170-tasklimits-v3-0.18.3.patch,
> tasklimits-v2.patch, tasklimits-v3-0.19.patch, tasklimits-v3.patch,
> tasklimits.patch
>
>
> There are a number of use cases for being able to do this. The focus of this
> jira should be on finding the simplest implementation that satisfies the most
> use cases.
> This could be implemented as either a per-node maximum or a cluster-wide
> maximum. For most uses the former seems preferable; however, either would
> fulfill the requirements of this jira.
> Some of the reasons for allowing this feature (mine and from others on list):
> - I have some very large CPU-bound jobs. I am forced to keep the max
> maps-per-node limit at 2 or 3 (on a 4-core node) so that I do not starve the
> DataNode and RegionServer. I have other jobs that are network-latency bound
> and would like to be able to run high numbers of them concurrently on each
> node. Though I can thread some jobs, some use cases are difficult to thread
> (scanning from HBase), and threading adds significant complexity to the job
> compared to letting Hadoop handle the concurrency.
> - Poor assignment of tasks to nodes creates situations where one node ends up
> with multiple reducers while other nodes receive none. A limit of 1 reducer
> per node for that job would prevent that from happening. (only works with the
> per-node limit)
> - Poor man's MR job virtualization. Since we can limit a job's resources, this
> gives much more control in allocating and dividing up the resources of a large
> cluster. (makes most sense with a cluster-wide limit)
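As a rough illustration of the job-side view of such limits, the sketch below
uses made-up property names to show how a per-node and a cluster-wide cap
might be set on a JobConf; the actual names are whatever the attached patches
define.

{code:java}
import org.apache.hadoop.mapred.JobConf;

public class TaskLimitJobSetupSketch {
  public static void main(String[] args) {
    JobConf conf = new JobConf(TaskLimitJobSetupSketch.class);
    conf.setJobName("cpu-heavy-scan");

    // Hypothetical per-node cap: at most 2 maps of this job on any one tracker.
    conf.setInt("mapred.max.maps.per.node", 2);

    // Hypothetical cluster-wide cap: at most 20 maps of this job in flight.
    conf.setInt("mapred.max.running.maps", 20);
  }
}
{code}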
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.