[ 
https://issues.apache.org/jira/browse/TEZ-1750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated TEZ-1750:
--------------------------------
    Attachment: TEZ-1750.1.txt

Patch to do what's mentioned in the description.

The patch is written in a way to make this fully pluggable - without affecting 
the existing source. It could be more efficient (and cleaner) with some 
additional events being generated - but I'm trying to keep the changes limited 
since this is targeted for 0.5.3, which is a patch release over 0.5.2.

[~bikassaha], [~rajesh.balamohan] - please review.

> Add a DAGScheduler which schedules tasks only when sources have been scheduled
> ------------------------------------------------------------------------------
>
>                 Key: TEZ-1750
>                 URL: https://issues.apache.org/jira/browse/TEZ-1750
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Siddharth Seth
>            Assignee: Siddharth Seth
>            Priority: Critical
>         Attachments: TEZ-1750.1.txt
>
>
> Splitting out the patch on TEZ-1522 into a separate jira.
> There are several scenarios in which we end up scheduling downstream tasks 
> before their sources have been scheduled - and then get into a situation 
> where the sources are starved. Currently, anywhere a ShuffleVertexManager is 
> used can cause such behaviour - since it starts scheduling its tasks after a 
> certain number of sources are complete, but subsequent non-shuffle 
> VertexManagers will schedule their tasks immediately.
> Disabling slow-start is one option to avoid this (or setting slow-start on 
> all vertices), but it doesn't work for the situation where dynamic reducer 
> parallelism kicks in - since it has to wait for source tasks to complete.
> The intent here is to add a DAGScheduler, which effectively negates the slow 
> start, and in case of dynamic parallelism determination, waits for upstream 
> tasks to be scheduled before scheduling downstream tasks.
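
To make the intended ordering concrete, here is a minimal sketch of the idea in the description. It is a toy model and not the attached patch: the class SourceFirstSchedulerSketch and its methods are hypothetical and do not correspond to any Tez API. It also assumes that "sources have been scheduled" means every task of every source vertex has been handed off for launch, which the description does not spell out.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Toy model only: none of these names are Tez classes or methods.
class SourceFirstSchedulerSketch {
    private final Map<String, List<String>> sources = new LinkedHashMap<>();
    private final Map<String, Integer> totalTasks = new LinkedHashMap<>();
    private final Map<String, Integer> scheduledTasks = new LinkedHashMap<>();
    private final Map<String, Queue<String>> pending = new LinkedHashMap<>();
    private final List<String> launchOrder = new ArrayList<>();

    void addVertex(String name, int numTasks, String... sourceVertices) {
        sources.put(name, Arrays.asList(sourceVertices));
        totalTasks.put(name, numTasks);
        scheduledTasks.put(name, 0);
        pending.put(name, new ArrayDeque<>());
    }

    // Called when a vertex asks for one of its tasks to be scheduled.
    // The request is queued and only released once its sources are scheduled.
    void requestTask(String vertex, String taskId) {
        pending.get(vertex).add(taskId);
        drain();
    }

    // Assumption: a source counts as scheduled once all of its tasks are.
    private boolean sourcesFullyScheduled(String vertex) {
        for (String src : sources.get(vertex)) {
            if (scheduledTasks.get(src) < totalTasks.get(src)) {
                return false;
            }
        }
        return true;
    }

    // Release held requests until no further vertex becomes eligible.
    private void drain() {
        boolean progressed = true;
        while (progressed) {
            progressed = false;
            for (Map.Entry<String, Queue<String>> e : pending.entrySet()) {
                Queue<String> queue = e.getValue();
                while (!queue.isEmpty() && sourcesFullyScheduled(e.getKey())) {
                    launchOrder.add(queue.poll());
                    scheduledTasks.merge(e.getKey(), 1, Integer::sum);
                    progressed = true;
                }
            }
        }
    }

    List<String> launchOrder() {
        return launchOrder;
    }

    public static void main(String[] args) {
        SourceFirstSchedulerSketch s = new SourceFirstSchedulerSketch();
        s.addVertex("map", 2);
        s.addVertex("reduce", 1, "map");
        s.requestTask("reduce", "reduce_0"); // held: map tasks not yet scheduled
        s.requestTask("map", "map_0");
        s.requestTask("map", "map_1");       // releases reduce_0 as well
        System.out.println(s.launchOrder()); // prints [map_0, map_1, reduce_0]
    }
}
```

In this sketch the downstream request arrives first but is held until both upstream tasks have been scheduled, so the sources cannot be starved by it. A real scheduler would also have to deal with priorities, re-execution and parallelism changes, which are left out here.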



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
