[ https://issues.apache.org/jira/browse/HADOOP-5884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vinod K V updated HADOOP-5884:
------------------------------

    Attachment: HADOOP-5884-20090529.1.txt

The proposal is to track capacities and user-limits by the number of slots occupied by the tasks of a job, instead of by the number of running tasks. Attaching a patch implementing this. This patch has to be applied on top of the latest patch for HADOOP-5932.

This patch does the following:
 - Modifies all calculations of capacities and user-limits to be based on the number of slots occupied by the running tasks of a job.
 - Retains the number of running tasks for display on the UI.
 - Adds test-cases verifying the number of slots accounted for high-memory jobs, by modifying the corresponding tests.
 - Adds test-cases verifying the newly added "occupied slots" field in the scheduling information.
 - Adds missing @Override tags, and removes stale imports and stale occurrences of gc (guaranteed capacity).

> Capacity scheduler should account high memory jobs as using more capacity of
> the queue
> --------------------------------------------------------------------------------------
>
>                 Key: HADOOP-5884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5884
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>            Reporter: Hemanth Yamijala
>            Assignee: Vinod K V
>         Attachments: HADOOP-5884-20090529.1.txt
>
>
> Currently, when a high-memory job is scheduled by the capacity scheduler,
> each scheduled task counts only once against the capacity of the queue,
> even though it may actually prevent other jobs from using spare slots on
> that node because of its higher memory requirements. To be fair, the
> capacity scheduler should proportionally (with respect to default memory)
> account high-memory jobs as using a larger capacity of the queue.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
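For illustration only, a minimal sketch of the proportional accounting idea described above: a task occupies as many slots as its memory requirement spans, relative to the per-slot memory. This is not the actual patch code; the class and method names (`SlotAccounting`, `slotsOccupied`) and the 1024 MB slot size are hypothetical.

```java
// Hypothetical sketch of proportional slot accounting for high-memory
// tasks; not the code from HADOOP-5884-20090529.1.txt.
public class SlotAccounting {

    /**
     * Number of slots a task occupies: its memory requirement divided by
     * the per-slot memory, rounded up. A task needing no more than one
     * slot's worth of memory counts as exactly 1 slot.
     */
    static int slotsOccupied(long taskMemoryMB, long slotMemoryMB) {
        // Ceiling division without floating point.
        return (int) ((taskMemoryMB + slotMemoryMB - 1) / slotMemoryMB);
    }

    public static void main(String[] args) {
        // With 1024 MB slots, a 3072 MB task occupies 3 slots, not 1,
        // so it is charged 3x against the queue's capacity.
        System.out.println(slotsOccupied(3072, 1024)); // prints 3
        System.out.println(slotsOccupied(1024, 1024)); // prints 1
        System.out.println(slotsOccupied(1500, 1024)); // prints 2
    }
}
```

Under this scheme, capacity and user-limit checks sum `slotsOccupied` over a job's running tasks rather than simply counting the tasks, which is what makes a high-memory job consume a proportionally larger share of the queue.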