[ https://issues.apache.org/jira/browse/HADOOP-3136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12634751#action_12634751 ]

eric baldeschwieler commented on HADOOP-3136:
---------------------------------------------

Hi Matei,

Don't let me discourage you from getting some experimental data on  
this!  I'm sure we can find some simple heuristics that are better  
than what we have now.  Even on a loaded cluster, the right decision  
per node may require more global information.  For example, is it a  
good strategy to run all the tasks for the first job in the queue on a  
subset of the nodes in the cluster?  Maybe it is.  But you probably  
want to be thinking a bit about what the shuffle phase is going to  
look like.  You may not want very unbalanced racks / nodes.  Or in  
some cases perhaps that unbalance is optimal!

If a job only runs on 1/5 of the nodes on a cluster, you may amortize  
a lot of startup costs that you pay per node over more tasks / node.   
This might be really good!  Costs to copy jars and start VMs for  
example can be significant.
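
To make that concrete, here is a back-of-the-envelope sketch (all numbers assumed for illustration, not measured): with a 5s per-node startup cost, packing 1000 tasks onto 200 nodes instead of 1000 cuts the startup overhead per task from 5s to 1s.

    // Hypothetical numbers: 5s/node startup (jar copy + VM start), 1000 tasks.
    public class StartupAmortization {
        public static void main(String[] args) {
            double startupSecsPerNode = 5.0;   // assumed, not a measured value
            int totalTasks = 1000;
            for (int nodes : new int[] {1000, 200}) {
                double tasksPerNode = (double) totalTasks / nodes;
                double overheadPerTask = startupSecsPerNode / tasksPerNode;
                System.out.printf("%4d nodes: %.1f tasks/node, %.1fs startup overhead/task%n",
                                  nodes, tasksPerNode, overheadPerTask);
            }
        }
    }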

If you have 1000 maps to run on 1000 nodes, you might like to fill  
every map slot on a subset of the nodes with at least 2 generations of  
maps, rather than running one map / node.  This could give you much  
better throughput.  This would support your idea.
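
For instance, assuming 2 map slots per node (a common default at the time) and a target of 2 generations per slot, a quick sketch of the subset size:

    // Sketch only; the slot count and generation target are assumptions.
    public class GenerationsSketch {
        public static void main(String[] args) {
            int maps = 1000;
            int mapSlotsPerNode = 2;   // assumed default
            int generations = 2;       // fill every slot at least twice
            int nodesNeeded = (maps + mapSlotsPerNode * generations - 1)
                              / (mapSlotsPerNode * generations);  // ceiling division
            System.out.println(nodesNeeded + " nodes, each running "
                               + mapSlotsPerNode * generations + " maps");
        }
    }

That would use 250 of the 1000 nodes, each running 4 maps back-to-back rather than 1000 nodes running 1 map each.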

On the other hand, you need to make sure you keep your node and rack  
locality high, or that strategy might work really badly.  One could  
imagine your total locality going down a lot for your tail jobs, as we  
just observed.  Also, you might create shuffle hotspots, which could  
kill your HDFS performance and slow down total throughput...

E14


> Assign multiple tasks per TaskTracker heartbeat
> -----------------------------------------------
>
>                 Key: HADOOP-3136
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3136
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Devaraj Das
>            Assignee: Arun C Murthy
>             Fix For: 0.20.0
>
>         Attachments: HADOOP-3136_0_20080805.patch, 
> HADOOP-3136_1_20080809.patch, HADOOP-3136_2_20080911.patch
>
>
> In today's logic of finding a new task, we assign only one task per heartbeat.
> We probably could give the tasktracker multiple tasks subject to the max 
> number of free slots it has - for maps we could assign it data local tasks. 
> We could probably run some logic to decide what to give it if we run out of 
> data local tasks (e.g., tasks from overloaded racks, tasks that have least 
> locality, etc.). In addition to maps, if it has reduce slots free, we could 
> give it reduce task(s) as well. Again for reduces we could probably run some 
> logic to give more tasks to nodes that are closer to nodes running most maps 
> (assuming data generated is proportional to the number of maps). For example, if 
> rack1 has 70% of the input splits, and we know that most maps are data/rack 
> local, we try to schedule ~70% of the reducers there.
> Thoughts?
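
A minimal sketch of the multi-assignment idea described above. Task and the slot-filling policy here are hypothetical stand-ins for illustration, not the real mapred classes or the attached patches:

    import java.util.ArrayList;
    import java.util.List;

    public class MultiAssignSketch {
        static class Task {
            final String splitHost;  // where this map's input split lives
            Task(String splitHost) { this.splitHost = splitHost; }
            boolean isDataLocalTo(String host) { return host.equals(splitHost); }
        }

        // One heartbeat: fill free map slots with data-local maps first,
        // fall back to non-local maps, then hand out reduces.
        static List<Task> assignTasks(String host, int freeMapSlots, int freeReduceSlots,
                                      List<Task> pendingMaps, List<Task> pendingReduces) {
            List<Task> assigned = new ArrayList<>();
            for (Task t : pendingMaps) {                     // 1. data-local maps
                if (assigned.size() == freeMapSlots) break;
                if (t.isDataLocalTo(host)) assigned.add(t);
            }
            for (Task t : pendingMaps) {                     // 2. non-local fallback
                if (assigned.size() == freeMapSlots) break;
                if (!assigned.contains(t)) assigned.add(t);
            }
            int r = Math.min(freeReduceSlots, pendingReduces.size());
            assigned.addAll(pendingReduces.subList(0, r));   // 3. reduces
            return assigned;
        }

        public static void main(String[] args) {
            List<Task> maps = new ArrayList<>(List.of(
                new Task("node1"), new Task("node2"), new Task("node1")));
            List<Task> reduces = new ArrayList<>(List.of(new Task("-"), new Task("-")));
            // A tracker on node1 with 2 free map slots and 1 free reduce slot
            // gets both node1-local maps plus one reduce in a single heartbeat.
            System.out.println(assignTasks("node1", 2, 1, maps, reduces).size());  // 3
        }
    }

The rack-proportional reduce placement from the description (e.g. ~70% of reducers on the rack holding 70% of the splits) would slot into step 3 by weighting pendingReduces per rack.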

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
