[ https://issues.apache.org/jira/browse/HADOOP-4035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12635659#action_12635659 ]
dhruba borthakur commented on HADOOP-4035:
------------------------------------------

>A reasonable assumption to make while computing used capacity is to assume
>that for all TTs in a cluster, the amount of memory per slot is configured to
>be the same value

I am a little confused about the above statement. It is possible to have two different types of machines in the same cluster, the only difference being the amount of memory on each type. Since the CPU capacity is the same, I would ideally configure both types of machines to have the same number of slots. However, the memory capacity per slot on one type of machine would then be larger than the memory capacity per slot on the other type.

It would be nice if the JT/TT could compute the memory capacity per slot and then schedule tasks accordingly. Also, the JT scheduler could give reduce tasks more affinity to slots with a larger memory capacity per slot, because reduce tasks possibly take more memory than map tasks.

> Modify the capacity scheduler (HADOOP-3445) to schedule tasks based on memory requirements and task trackers free memory
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-4035
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4035
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>    Affects Versions: 0.19.0
>            Reporter: Hemanth Yamijala
>            Assignee: Vinod K V
>            Priority: Blocker
>             Fix For: 0.19.0
>
>         Attachments: 4035.1.patch, HADOOP-4035-20080918.1.txt
>
>
> HADOOP-3759 introduced configuration variables that can be used to specify
> memory requirements for jobs, and also modified the tasktrackers to report
> their free memory. The capacity scheduler in HADOOP-3445 should schedule
> tasks based on these parameters. A task that is scheduled on a TT that uses
> more than the default amount of memory per slot can be viewed as effectively
> using more than one slot, as it would decrease the amount of free memory on
> the TT by more than the default amount while it runs. The scheduler should
> make the used capacity account for this additional usage while enforcing
> limits, etc.
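
To make the two ideas in this thread concrete, here is a rough sketch in Python (not Hadoop code, and not what any attached patch does). It assumes a hypothetical TaskTracker record with a total memory figure and a slot count, and it assumes that a task's slot usage is rounded up to a whole number of slots; the issue text only says such a task "can be viewed as effectively using more than one slot", so the rounding rule is a guess. All names below are made up for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class TaskTracker:
    """Hypothetical stand-in for a TT; not a Hadoop class."""
    name: str
    total_memory_mb: int
    slots: int

    def memory_per_slot_mb(self) -> float:
        # Per-slot memory capacity, as suggested in the comment:
        # total memory divided evenly across the configured slots.
        return self.total_memory_mb / self.slots


def effective_slots(task_memory_mb: int, default_memory_per_slot_mb: int) -> int:
    """Slots a task is charged for when computing used capacity.

    Assumption: usage is rounded up to whole slots, with a minimum of one.
    """
    return max(1, math.ceil(task_memory_mb / default_memory_per_slot_mb))


def prefer_for_reduce(trackers: list[TaskTracker]) -> TaskTracker:
    # Assumed affinity rule: pick the TT with the most memory per slot.
    return max(trackers, key=lambda tt: tt.memory_per_slot_mb())


if __name__ == "__main__":
    small = TaskTracker("tt-small", total_memory_mb=8192, slots=4)
    large = TaskTracker("tt-large", total_memory_mb=16384, slots=4)
    print(small.memory_per_slot_mb(), large.memory_per_slot_mb())
    print(effective_slots(3072, 2048))
    print(prefer_for_reduce([small, large]).name)
```

With the same slot count on both machine types, the larger-memory TT ends up with twice the per-slot capacity, which is the heterogeneous-cluster case raised in the comment. A real scheduler would also have to check the free memory each TT reports before placing a task.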