[ 
https://issues.apache.org/jira/browse/TEZ-1750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14203629#comment-14203629
 ] 

Siddharth Seth commented on TEZ-1750:
-------------------------------------

Thanks for the reviews. Committing.

bq. This will likely need a bunch of experimentation to see benefits/issues. 
That will give more data on this approach. Until important points like 
different edge types etc. are addressed this probably should not be made 
default.
We'll get to this in 0.6. The issue where we end up doing out of order 
scheduling is a bigger one in terms of performance. Pre-emption helps but 
performance would still suffer when multiple DAGs per session come into play.

> Add a DAGScheduler which schedules tasks only when sources have been scheduled
> ------------------------------------------------------------------------------
>
>                 Key: TEZ-1750
>                 URL: https://issues.apache.org/jira/browse/TEZ-1750
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Siddharth Seth
>            Assignee: Siddharth Seth
>            Priority: Critical
>         Attachments: TEZ-1750.1.txt, TEZ-1750.2.txt, TEZ-1750.3.txt
>
>
> Splitting out the patch on TEZ-1522 into a separate jira.
> There are several scenarios in which we end up scheduling downstream tasks 
> before their sources have been scheduled - and then get into a situation 
> where the sources are starved. Currently, anywhere a ShuffleVertexManager is 
> used can cause such behaviour - since it starts scheduling its tasks after a 
> certain number of sources are complete, but subsequent non-shuffle 
> VertexManagers will schedule their tasks immediately.
> Disabling slow-start is one option to avoid this (or setting slow start on 
> all vertices), but it doesn't work for the situation where dynamic reducer 
> parallelism kicks in - since it has to wait for source tasks to complete.
> The intent here is to add a DAGScheduler, which effectively negates the slow 
> start, and in case of dynamic parallelism determination, waits for upstream 
> tasks to be scheduled before scheduling downstream tasks.
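
The ordering rule described above (hold a vertex's tasks until its sources have been scheduled) can be illustrated with a small toy model. This is only a sketch and not the code in the attached patches: the class name SourceOrderedScheduler, its methods, and the choice to wait until every task of every source vertex has been scheduled are assumptions made for illustration, and none of it is Tez API.

```python
from collections import defaultdict


class SourceOrderedScheduler:
    """Toy model (hypothetical, not Tez code): a task is scheduled only after
    all tasks of all source vertices of its vertex have been scheduled."""

    def __init__(self, sources, num_tasks):
        self.sources = sources            # vertex -> list of source vertices
        self.num_tasks = num_tasks        # vertex -> number of tasks
        self.scheduled = defaultdict(set)  # vertex -> task ids already scheduled
        self.pending = defaultdict(list)   # vertex -> task ids being held back
        self.order = []                    # (vertex, task) in scheduling order

    def _sources_scheduled(self, vertex):
        # True when every source vertex has had all of its tasks scheduled.
        return all(len(self.scheduled[s]) == self.num_tasks[s]
                   for s in self.sources.get(vertex, []))

    def request(self, vertex, task):
        # Called when a vertex asks for a task to be scheduled.
        if self._sources_scheduled(vertex):
            self._schedule(vertex, task)
        else:
            self.pending[vertex].append(task)

    def _schedule(self, vertex, task):
        self.scheduled[vertex].add(task)
        self.order.append((vertex, task))
        if len(self.scheduled[vertex]) == self.num_tasks[vertex]:
            self._release_downstream(vertex)

    def _release_downstream(self, vertex):
        # Release held tasks of downstream vertices whose sources are now done.
        for v, srcs in self.sources.items():
            if vertex in srcs and self.pending[v] and self._sources_scheduled(v):
                tasks, self.pending[v] = self.pending[v], []
                for t in tasks:
                    self._schedule(v, t)


# Usage: downstream vertices ask first, but are held until sources are scheduled.
sources = {"map": [], "reduce": ["map"], "sink": ["reduce"]}
num_tasks = {"map": 2, "reduce": 1, "sink": 1}
s = SourceOrderedScheduler(sources, num_tasks)
s.request("sink", 0)    # held: reduce not yet scheduled
s.request("reduce", 0)  # held: map not yet scheduled
s.request("map", 0)
s.request("map", 1)     # last map task; releases reduce, which releases sink
print(s.order)
# [('map', 0), ('map', 1), ('reduce', 0), ('sink', 0)]
```

In this model the out-of-order requests from "sink" and "reduce" cannot take resources ahead of "map", which is the starvation case the description is concerned with.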



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
