[
https://issues.apache.org/jira/browse/FLINK-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14736826#comment-14736826
]
ASF GitHub Bot commented on FLINK-2119:
---------------------------------------
Github user uce commented on the pull request:
https://github.com/apache/flink/pull/754#issuecomment-138907431
It's done on my side. The question is whether it makes sense to merge this
w/o any plans to continue on the batch scheduling. My personal answer is no. I
would close this and then re-open it when there is a need for it.
> Add ExecutionGraph support for leg-wise scheduling
> --------------------------------------------------
>
> Key: FLINK-2119
> URL: https://issues.apache.org/jira/browse/FLINK-2119
> Project: Flink
> Issue Type: Improvement
> Components: Scheduler
> Affects Versions: master
> Reporter: Ufuk Celebi
> Assignee: Ufuk Celebi
>
> Scheduling currently happens by lazily unrolling the ExecutionGraph from the
> sources.
> 1. All sources are scheduled for execution.
> 2. Their results trigger scheduling and deployment of the receiving tasks
> (either on the first available buffer or when all are produced [pipelined vs.
> blocking exchange]).
> For certain batch jobs this can be problematic as many tasks will be running
> at the same time and consume task manager resources like executionslots and
> memory. For these jobs, it is desirable to schedule the ExecutionGraph in
> with different strategies.
> With respect to the ExecutionGraph, the current limitation is that data
> availability for a result always triggers scheduling of the consuming tasks.
> This needs to be more general to allow different scheduling strategies.
> Consider the following example:
> {code}
> [ union ]
> / \
> / \
> [ source 1 ] [ source 2 ]
> {code}
> Currently, both sources are scheduled concurrently and the "faster" one
> triggers scheduling of the union. It is desirable to first allow source 1 to
> completly produce its result, then trigger scheduling of source 2, and only
> then schedule the union.
> The required changes in the ExecutionGraph are conceptually straight-forward:
> instead of going through the list of result consumers and scheduling them, we
> need to be able to run a more general action. For normal operation, this will
> still schedule the consumer task, but we can also configure it to kick of the
> next source etc.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)