[jira] Updated: (HADOOP-5884) Capacity scheduler should account high memory jobs as using more capacity of the queue

Vinod K V (JIRA) Tue, 02 Jun 2009 04:52:35 -0700

     [ https://issues.apache.org/jira/browse/HADOOP-5884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Vinod K V updated HADOOP-5884:
------------------------------

    Attachment: HADOOP-5884-20090602.1.txt

Updated patch incorporating all of the above review comments, with one 
exception:
 - Removed running-tasks information from the UI. For now we are avoiding 
absolute numbers because the scheduler's information may be inconsistent with 
the cluster status. Reporting running tasks as a percentage of total cluster 
capacity no longer makes sense either, since each task may now occupy multiple 
slots. The correct fix is to print absolute numbers once that inconsistency 
has been eliminated, so I am pushing this to a follow-up jira issue.

@Arun
bq. Can we also add the number of slots to the UI?
I didn't follow this. Do you mean displaying the number of slots per job in 
the job-scheduling information? We already display the number of slots used 
by a queue as a percentage.

If you meant the former, I considered it but deferred it to another jira. The 
job scheduling information is displayed on the first page of the jobtracker 
UI, and it looked ugly when it spanned multiple lines. I think it would be 
good to remove job scheduling information from the first page, but as that 
might trigger discussion, I've decided to leave it for now.

bq. Long term - we really should fix TestCapacityScheduler to not check strings 
and use relevant apis (even package-private ones).
Agreed. I felt the pain myself while modifying the testcases, but decided to 
postpone it to another jira as it is slightly tricky.


> Capacity scheduler should account high memory jobs as using more capacity of 
> the queue
> --------------------------------------------------------------------------------------
>
>                 Key: HADOOP-5884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5884
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>            Reporter: Hemanth Yamijala
>            Assignee: Vinod K V
>         Attachments: HADOOP-5884-20090529.1.txt, HADOOP-5884-20090602.1.txt
>
>
> Currently, when a high memory job is scheduled by the capacity scheduler, 
> each task scheduled counts only once in the capacity of the queue, though it 
> may actually be preventing other jobs from using spare slots on that node 
> because of its higher memory requirements. In order to be fair, the capacity 
> scheduler should proportionally (with respect to default memory) account high 
> memory jobs as using a larger capacity of the queue.
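
The proportional accounting described above can be illustrated with a minimal 
sketch. This is not the code from the attached patch: the class name, method 
names, and parameters are hypothetical, and rounding a task's memory 
requirement up to a whole number of slots is an assumption.

```java
// Hypothetical illustration only; not taken from the HADOOP-5884 patch.
final class SlotAccounting {

    private SlotAccounting() {
    }

    // Number of slots one task is charged for, relative to the default
    // memory per slot. Rounding up is an assumption of this sketch.
    static int slotsPerTask(long taskMemoryMb, long defaultMemoryPerSlotMb) {
        if (taskMemoryMb <= 0 || defaultMemoryPerSlotMb <= 0) {
            throw new IllegalArgumentException("memory values must be positive");
        }
        return (int) ((taskMemoryMb + defaultMemoryPerSlotMb - 1)
                / defaultMemoryPerSlotMb);
    }

    // Slots charged to the queue for all running tasks of one job.
    static int slotsOccupied(int runningTasks, long taskMemoryMb,
                             long defaultMemoryPerSlotMb) {
        return runningTasks * slotsPerTask(taskMemoryMb, defaultMemoryPerSlotMb);
    }

    // Share of the queue's capacity in use, as a percentage of its slots.
    static double percentOfQueueCapacity(int slotsOccupied,
                                         int queueCapacitySlots) {
        if (queueCapacitySlots <= 0) {
            throw new IllegalArgumentException("queue capacity must be positive");
        }
        return 100.0 * slotsOccupied / queueCapacitySlots;
    }
}
```

Under these assumptions, a job whose tasks need twice the default memory is 
charged two slots per running task, so ten such tasks count as twenty slots 
against the queue rather than ten.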

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
