[ 
https://issues.apache.org/jira/browse/TEZ-394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15865631#comment-15865631
 ] 

Bikas Saha commented on TEZ-394:
--------------------------------

Thanks for doing this! I regret not having done this right from the start. 
Mostly looks good to me.

The name of the assigned variable is now misleading because it is no longer 
topologically sorted.
{code}
+    topologicalVertexStack = reorderForCriticalPath(topologicalVertexStack,
+        vertexMap, inboundVertexMap, outboundVertexMap);
{code}
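
To make the naming concern concrete, here is a minimal sketch of a check that tells whether a vertex list is still in topological order. The class and method names (TopoOrderCheck, isTopologicalOrder) and the String-keyed edge map are hypothetical and are not taken from the patch or from Tez. It only illustrates what "topologically sorted" would require of the reordered list.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Tez or of the patch under review.
class TopoOrderCheck {

    // Returns true if every vertex appears before all of its downstream
    // vertices, i.e. the list is a valid topological order for the edges.
    static boolean isTopologicalOrder(List<String> order,
                                      Map<String, List<String>> outbound) {
        // Record the position of each vertex in the candidate order.
        Map<String, Integer> position = new HashMap<>();
        for (int i = 0; i < order.size(); i++) {
            position.put(order.get(i), i);
        }
        // Every edge must point from an earlier position to a later one.
        for (Map.Entry<String, List<String>> edge : outbound.entrySet()) {
            Integer from = position.get(edge.getKey());
            if (from == null) {
                return false; // source vertex missing from the order
            }
            for (String target : edge.getValue()) {
                Integer to = position.get(target);
                if (to == null || from >= to) {
                    return false; // target missing or placed too early
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Assumed example DAG: A -> B -> J and X -> J (J is the final join).
        Map<String, List<String>> outbound = new HashMap<>();
        outbound.put("A", Arrays.asList("B"));
        outbound.put("B", Arrays.asList("J"));
        outbound.put("X", Arrays.asList("J"));

        System.out.println(isTopologicalOrder(
            Arrays.asList("A", "B", "X", "J"), outbound));
        System.out.println(isTopologicalOrder(
            Arrays.asList("A", "J", "B", "X"), outbound));
    }
}
```

If the list returned by reorderForCriticalPath can fail a check of this kind, a name that does not mention topological order would describe it better.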


[~gopalv] Would this break any assumptions in Hive?

> Better scheduling for uneven DAGs
> ---------------------------------
>
>                 Key: TEZ-394
>                 URL: https://issues.apache.org/jira/browse/TEZ-394
>             Project: Apache Tez
>          Issue Type: Sub-task
>            Reporter: Rohini Palaniswamy
>            Assignee: Jason Lowe
>         Attachments: TEZ-394.001.patch
>
>
>   Consider a series of joins or group-bys on dataset A with a few datasets 
> that takes 10 hours, followed by a final join with a dataset X. The vertex 
> that loads dataset X will be one of the top vertices and will be initialized 
> early, even though its output is not consumed until the final join, 10 hours 
> later. 
> 1) We could use delayed-start logic for better resource allocation.
> 2) Otherwise, if such vertices are started upfront, we need to handle 
> failure/recovery cases where the nodes that executed the MapTask might have 
> gone down by the time the final join happens. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
