[ https://issues.apache.org/jira/browse/HADOOP-4035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12635659#action_12635659 ]

dhruba borthakur commented on HADOOP-4035:
------------------------------------------

>A reasonable assumption to make while computing used capacity is to assume 
>that for all TTs in a cluster, the amount of memory per slot is configured to 
>be the same value

I am a little confused about the above statement. It is possible to have two 
different types of machines in the same cluster, the only difference being the 
amount of memory on each type. Since the CPU capacity is the same, I would 
ideally configure both types of machines with the same number of slots. 
However, the memory capacity per slot on one type of machine would then be 
larger than the memory capacity per slot on the other type. It would be nice 
if the JT/TT could compute the memory capacity per slot and then schedule 
tasks accordingly.
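
For illustration, a minimal sketch of that computation, assuming the JT knows 
each TT's memory reserved for tasks and its slot count (class and method names 
here are made up, not the real TaskTrackerStatus API):

{code:java}
// Hypothetical sketch: names are illustrative, not actual Hadoop classes.
public class MemorySlotMath {

  /** Memory available for tasks on one TT divided by its configured slots. */
  static long memoryPerSlotMB(long ttTaskMemoryMB, int ttSlots) {
    return ttTaskMemoryMB / ttSlots;
  }

  /** Slots a task effectively occupies on that TT, rounded up. */
  static int slotsOccupied(long taskMemoryMB, long memoryPerSlotMB) {
    return (int) ((taskMemoryMB + memoryPerSlotMB - 1) / memoryPerSlotMB);
  }

  public static void main(String[] args) {
    // A TT with 8 GB for tasks and 4 slots has 2 GB per slot ...
    long perSlot = memoryPerSlotMB(8L * 1024, 4);
    // ... so a 3 GB task effectively uses 2 slots on that TT.
    System.out.println(slotsOccupied(3L * 1024, perSlot)); // prints 2
  }
}
{code}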

Also, the JT scheduler could give reduce tasks greater affinity for slots with 
a larger memory capacity per slot, because reduce tasks are likely to need more 
memory than map tasks.
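
Sketched as a placement preference (again with hypothetical names, not the JT's 
actual data structures), that affinity could be as simple as ordering candidate 
trackers by memory per slot before handing out a reduce task:

{code:java}
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch of the suggested reduce affinity; TrackerInfo is an
// illustrative placeholder, not a real JobTracker type.
class TrackerInfo {
  final String name;
  final long memoryPerSlotMB;

  TrackerInfo(String name, long memoryPerSlotMB) {
    this.name = name;
    this.memoryPerSlotMB = memoryPerSlotMB;
  }
}

class ReduceAffinity {
  /** Order candidates so the largest memory-per-slot tracker comes first. */
  static void orderForReduce(List<TrackerInfo> candidates) {
    candidates.sort(
        Comparator.comparingLong((TrackerInfo t) -> t.memoryPerSlotMB).reversed());
  }
}
{code}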

> Modify the capacity scheduler (HADOOP-3445) to schedule tasks based on memory 
> requirements and task trackers free memory
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-4035
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4035
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>    Affects Versions: 0.19.0
>            Reporter: Hemanth Yamijala
>            Assignee: Vinod K V
>            Priority: Blocker
>             Fix For: 0.19.0
>
>         Attachments: 4035.1.patch, HADOOP-4035-20080918.1.txt
>
>
> HADOOP-3759 introduced configuration variables that can be used to specify 
> memory requirements for jobs, and also modified the tasktrackers to report 
> their free memory. The capacity scheduler in HADOOP-3445 should schedule 
> tasks based on these parameters. A task that is scheduled on a TT that uses 
> more than the default amount of memory per slot can be viewed as effectively 
> using more than one slot, as it would decrease the amount of free memory on 
> the TT by more than the default amount while it runs. The scheduler should 
> make the used capacity account for this additional usage while enforcing 
> limits, etc.
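
To make the accounting in the description above concrete, a minimal sketch, 
assuming the queue tracks its used capacity in slot units (QueueUsage is an 
illustrative name, not the scheduler's actual class): a task that needs twice 
the default memory per slot is charged two slots for as long as it runs.

{code:java}
// Hypothetical sketch: charge a task's effective slot count against the
// queue's capacity, instead of always charging one slot per task.
class QueueUsage {
  private final int capacitySlots;
  private int usedSlots = 0;

  QueueUsage(int capacitySlots) { this.capacitySlots = capacitySlots; }

  /** Try to assign a task that effectively occupies 'slots' slots. */
  boolean tryAssign(int slots) {
    if (usedSlots + slots > capacitySlots) {
      return false;                 // would exceed the queue's limit
    }
    usedSlots += slots;
    return true;
  }

  void taskFinished(int slots) { usedSlots -= slots; }
}
{code}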

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
