[ 
https://issues.apache.org/jira/browse/FLINK-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14805292#comment-14805292
 ] 

ASF GitHub Bot commented on FLINK-2119:
---------------------------------------

Github user uce closed the pull request at:

    https://github.com/apache/flink/pull/754


> Add ExecutionGraph support for leg-wise scheduling
> --------------------------------------------------
>
>                 Key: FLINK-2119
>                 URL: https://issues.apache.org/jira/browse/FLINK-2119
>             Project: Flink
>          Issue Type: Improvement
>          Components: Scheduler
>    Affects Versions: master
>            Reporter: Ufuk Celebi
>            Assignee: Ufuk Celebi
>
> Scheduling currently happens by lazily unrolling the ExecutionGraph from the 
> sources.
> 1. All sources are scheduled for execution.
> 2. Their results trigger scheduling and deployment of the receiving tasks 
> (either on the first available buffer or when all are produced [pipelined vs. 
> blocking exchange]).
> For certain batch jobs this can be problematic, as many tasks will be running 
> at the same time and consume task manager resources like execution slots and 
> memory. For these jobs, it is desirable to schedule the ExecutionGraph 
> with different strategies.
> With respect to the ExecutionGraph, the current limitation is that data 
> availability for a result always triggers scheduling of the consuming tasks. 
> This needs to be more general to allow different scheduling strategies.
> Consider the following example:
> {code}
>           [ union ]
>          /         \
>         /           \
>   [ source 1 ]  [ source 2 ]
> {code}
> Currently, both sources are scheduled concurrently and the "faster" one 
> triggers scheduling of the union. It is desirable to first allow source 1 to 
> completely produce its result, then trigger scheduling of source 2, and only 
> then schedule the union.
> The required changes in the ExecutionGraph are conceptually straightforward: 
> instead of going through the list of result consumers and scheduling them, we 
> need to be able to run a more general action. For normal operation, this will 
> still schedule the consumer task, but we can also configure it to kick off the 
> next source, etc.
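
The "more general action" described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not Flink's ExecutionGraph code: `ResultAvailableAction`, `Scheduler`, `Task`, and `finish` are hypothetical names, results are treated as blocking (fully produced before the action runs), and "scheduling" merely records the order in which tasks would be started.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; none of these types exist in Flink.
final class LegWiseSchedulingSketch {

    /** Hypothetical hook that runs once a task's result is available. */
    interface ResultAvailableAction {
        void onResultAvailable(Scheduler scheduler);
    }

    /** Hypothetical scheduler that records the scheduling order instead of deploying. */
    static final class Scheduler {
        final List<String> scheduled = new ArrayList<>();

        void schedule(Task task) {
            scheduled.add(task.name);
        }
    }

    /** Hypothetical task with a configurable action for its finished result. */
    static final class Task {
        final String name;
        // Default: do nothing. A real default would schedule the consumers.
        ResultAvailableAction onFinished = scheduler -> { };

        Task(String name) {
            this.name = name;
        }

        /** Simulates the task having completely produced its (blocking) result. */
        void finish(Scheduler scheduler) {
            onFinished.onResultAvailable(scheduler);
        }
    }

    /** Runs the union example from the description and returns the scheduling order. */
    static List<String> runExample() {
        Scheduler scheduler = new Scheduler();
        Task source1 = new Task("source 1");
        Task source2 = new Task("source 2");
        Task union = new Task("union");

        // Leg-wise strategy: source 1 kicks off source 2 instead of the union.
        source1.onFinished = s -> s.schedule(source2);
        // Normal operation: the finished result schedules its consumer.
        source2.onFinished = s -> s.schedule(union);

        scheduler.schedule(source1); // only the first leg is scheduled up front
        source1.finish(scheduler);
        source2.finish(scheduler);
        return scheduler.scheduled;
    }

    public static void main(String[] args) {
        System.out.println(runExample()); // prints [source 1, source 2, union]
    }
}
```

In this sketch, the existing behavior corresponds to every action scheduling the consumer task; a leg-wise strategy only swaps the action attached to source 1.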



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
