[ https://issues.apache.org/jira/browse/HADOOP-5884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12715131#action_12715131 ]
Hemanth Yamijala commented on HADOOP-5884:
------------------------------------------

Some comments:

- TaskSchedulingInfo.toString() - displaying the actual value had problems with exactness and mismatches between the cluster info and the state we kept; that's why we shifted to percentages. It may be a good idea to retain that model. The same argument can be made for running tasks and numSlotsOccupiedByThisUser.
- "Occupied slots" seems too techie. Call it 'Used capacity'? Likewise, instead of '% of total slots occupied by all users', call it '% of used capacity'?
- TaskSchedulingMgr.isUserOverLimit() - we add 1 if we're using more than the queue capacity. It could be more than 1, depending on the task we are assigning (if it's part of a high-RAM job).
- MapSchedulingMgr constructor: typo: 'schedulr' should be 'scheduler'. Similarly for Reduce...
- Minor nit: use String.format() instead of the complicated StringBuffer.append() style of code, which makes it really hard to see what's happening.
- updateQSIObjects: the log statement prints numMapSlotsForThisJob instead of numMapsRunningForThisJob.

> Capacity scheduler should account high memory jobs as using more capacity of the queue
> --------------------------------------------------------------------------------------
>
>                 Key: HADOOP-5884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5884
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>            Reporter: Hemanth Yamijala
>            Assignee: Vinod K V
>         Attachments: HADOOP-5884-20090529.1.txt
>
> Currently, when a high memory job is scheduled by the capacity scheduler, each task scheduled counts only once in the capacity of the queue, though it may actually be preventing other jobs from using spare slots on that node because of its higher memory requirements. To be fair, the capacity scheduler should proportionally (with respect to default memory) account high memory jobs as using a larger capacity of the queue.
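The proportional accounting the issue asks for can be sketched as follows. This is a hypothetical illustration, not the actual HADOOP-5884 patch: the class name, the default-slot memory constant, and the helper method are all made up for the example. The point matches the isUserOverLimit() comment above: a high-RAM task should be charged ceil(taskMemory / defaultSlotMemory) slots against queue capacity, not a flat 1.

```java
// Hypothetical sketch of proportional slot accounting for high-memory tasks.
// Assumes a default slot provides a fixed amount of memory; a task needing
// more memory blocks that many slots on the node, so it should be charged
// that many slots of queue capacity.
public class SlotAccounting {

    // Memory provided by one default slot, in MB (illustrative value).
    static final long DEFAULT_SLOT_MEMORY_MB = 1024;

    // Number of slots a task should be charged against queue capacity:
    // ceiling division of the task's memory requirement by the slot size.
    static long slotsPerTask(long taskMemoryMb) {
        return (taskMemoryMb + DEFAULT_SLOT_MEMORY_MB - 1) / DEFAULT_SLOT_MEMORY_MB;
    }

    public static void main(String[] args) {
        System.out.println(slotsPerTask(1024)); // normal task: charged 1 slot
        System.out.println(slotsPerTask(3000)); // high-RAM task: charged 3 slots
    }
}
```

Under this accounting, the "add 1" in an isUserOverLimit()-style check would become "add slotsPerTask(...)" for the task being assigned.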