[ https://issues.apache.org/jira/browse/HADOOP-3136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12634398#action_12634398 ]
Matei Zaharia commented on HADOOP-3136:
---------------------------------------

Eric, regarding the cluster overloading, can't we limit the number of slots we use per node when the cluster is undersubscribed, the same way the current one-task-at-a-time scheduler does? As far as I understand, the current scheduler limits the number of tasks it runs on a node to min(number of slots per node, (total number of tasks) / (total number of nodes)). For example, if you have a cluster of 50 machines with 4 slots each (i.e. 200 slots in total) and you submit a job with only 100 maps, it will launch at most 2 maps on each node.

> Assign multiple tasks per TaskTracker heartbeat
> -----------------------------------------------
>
>                 Key: HADOOP-3136
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3136
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Devaraj Das
>            Assignee: Arun C Murthy
>             Fix For: 0.20.0
>
>         Attachments: HADOOP-3136_0_20080805.patch, HADOOP-3136_1_20080809.patch, HADOOP-3136_2_20080911.patch
>
>
> In today's logic of finding a new task, we assign only one task per heartbeat.
> We probably could give the tasktracker multiple tasks subject to the max
> number of free slots it has - for maps we could assign it data local tasks.
> We could probably run some logic to decide what to give it if we run out of
> data local tasks (e.g., tasks from overloaded racks, tasks that have least
> locality, etc.). In addition to maps, if it has reduce slots free, we could
> give it reduce task(s) as well. Again for reduces we could probably run some
> logic to give more tasks to nodes that are closer to nodes running most maps
> (assuming data generated is proportional to the number of maps). For example, if
> rack1 has 70% of the input splits, and we know that most maps are data/rack
> local, we try to schedule ~70% of the reducers there.
> Thoughts?
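
To make the per-node limit from the comment above concrete, here is a minimal sketch in Python. It is not the scheduler's actual code: the function name `per_node_task_cap` and its parameters are hypothetical, and it assumes the per-node share is the total number of tasks divided by the number of nodes, rounded up, which is what the 50-machine example implies.

```python
import math


def per_node_task_cap(slots_per_node: int, total_tasks: int, num_nodes: int) -> int:
    """Hypothetical helper: upper bound on tasks to run on one node.

    Assumption (not taken from Hadoop source): the cap is the smaller of
    the node's slot count and an even per-node share of the pending tasks.
    """
    if num_nodes <= 0:
        # No nodes registered, so nothing can be scheduled.
        return 0
    # Even share of the pending tasks per node, rounded up.
    fair_share = math.ceil(total_tasks / num_nodes)
    # Never exceed the node's configured slots.
    return min(slots_per_node, fair_share)


# Undersubscribed cluster from the example: 50 nodes, 4 slots each, 100 maps.
print(per_node_task_cap(slots_per_node=4, total_tasks=100, num_nodes=50))  # 2

# Oversubscribed cluster: the slot count becomes the binding limit.
print(per_node_task_cap(slots_per_node=4, total_tasks=1000, num_nodes=50))  # 4
```

The same cap could be applied when handing out several tasks in one heartbeat: assign at most the cap minus the number of tasks already running on that node.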