[
https://issues.apache.org/jira/browse/TEZ-394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16286573#comment-16286573
]
Eric Wohlstadter commented on TEZ-394:
--------------------------------------
[~jlowe]
OK, I hadn't understood that we were using children recursively. Thanks for the
explanation.
> Better scheduling for uneven DAGs
> ---------------------------------
>
> Key: TEZ-394
> URL: https://issues.apache.org/jira/browse/TEZ-394
> Project: Apache Tez
> Issue Type: Sub-task
> Reporter: Rohini Palaniswamy
> Assignee: Jason Lowe
> Attachments: TEZ-394.001.patch, TEZ-394.002.patch, TEZ-394.003.patch
>
>
> Consider a series of joins or group-bys on dataset A with a few other datasets
> that takes 10 hours, followed by a final join with a dataset X. The vertex that
> loads dataset X will be one of the top vertices and will be initialized early,
> even though its output is not consumed until the end, 10 hours later.
> 1) Either delayed-start logic could be used for better resource allocation,
> 2) or, if such vertices are started up front, the failure/recovery cases need
> to be handled where the nodes that executed the MapTask have gone down by the
> time the final join happens.