[
https://issues.apache.org/jira/browse/HADOOP-5884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Vinod K V updated HADOOP-5884:
------------------------------
Attachment: HADOOP-5884-20090529.1.txt
The proposal is to track capacities and user-limits by the number of slots
occupied by a job's tasks instead of the number of running tasks.
Attaching a patch implementing this; it must be applied on top of the latest
patch for HADOOP-5932. The patch does the following:
- Modifies all capacity and user-limit calculations to be based on the
number of slots occupied by a job's running tasks.
- Retains the number of running tasks for display on the UI.
- Modifies the corresponding tests to add test-cases verifying the number of
slots accounted for high-memory jobs.
- Adds test-cases verifying the newly added "occupied slots" field in the
scheduling information.
- Adds missing @Override annotations, and removes stale imports and stale
occurrences of gc (guaranteed capacity).
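To illustrate the accounting change described above: a minimal sketch of slot-based capacity accounting. The class and method names here are hypothetical, not the scheduler's actual code; the patch itself should be consulted for the real implementation.

```java
// Hypothetical sketch: a high-memory task occupies multiple slots,
// proportional (with respect to the default per-slot memory) to its
// memory requirement. Names are illustrative only.
public class SlotAccounting {

    // Slots a single task occupies: ceil(taskMemory / slotMemory).
    static int slotsPerTask(long taskMemoryMB, long slotMemoryMB) {
        return (int) ((taskMemoryMB + slotMemoryMB - 1) / slotMemoryMB);
    }

    // Capacity used by a job is counted in occupied slots,
    // not in running tasks.
    static int occupiedSlots(int runningTasks, long taskMemoryMB,
                             long slotMemoryMB) {
        return runningTasks * slotsPerTask(taskMemoryMB, slotMemoryMB);
    }

    public static void main(String[] args) {
        // A normal job: 1 GB tasks on 1 GB slots -> 1 slot per task.
        System.out.println(occupiedSlots(10, 1024, 1024)); // prints 10
        // A high-memory job: 3 GB tasks -> 3 slots per task.
        System.out.println(occupiedSlots(10, 3072, 1024)); // prints 30
    }
}
```

Under this accounting, ten running tasks of a 3 GB job consume the same queue capacity as thirty tasks of a default 1 GB job, which is what keeps the queue fair.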
> Capacity scheduler should account high memory jobs as using more capacity of
> the queue
> --------------------------------------------------------------------------------------
>
> Key: HADOOP-5884
> URL: https://issues.apache.org/jira/browse/HADOOP-5884
> Project: Hadoop Core
> Issue Type: Bug
> Components: contrib/capacity-sched
> Reporter: Hemanth Yamijala
> Assignee: Vinod K V
> Attachments: HADOOP-5884-20090529.1.txt
>
>
> Currently, when a high-memory job is scheduled by the capacity scheduler,
> each scheduled task counts only once against the capacity of the queue, though
> it may be preventing other jobs from using spare slots on that node
> because of its higher memory requirements. In order to be fair, the capacity
> scheduler should proportionally (with respect to default memory) account
> high-memory jobs as using a larger capacity of the queue.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.