[
https://issues.apache.org/jira/browse/FLINK-12138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Flink Jira Bot updated FLINK-12138:
-----------------------------------
Labels: auto-unassigned stale-major (was: auto-unassigned)
I am the [Flink Jira Bot|https://github.com/apache/flink-jira-bot/] and I help
the community manage its development. I see this issues has been marked as
Major but is unassigned and neither itself nor its Sub-Tasks have been updated
for 30 days. I have gone ahead and added a "stale-major" to the issue". If this
ticket is a Major, please either assign yourself or give an update. Afterwards,
please remove the label or in 7 days the issue will be deprioritized.
> Limit input split count of each source task for better failover experience
> --------------------------------------------------------------------------
>
> Key: FLINK-12138
> URL: https://issues.apache.org/jira/browse/FLINK-12138
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Task
> Affects Versions: 1.9.0
> Reporter: Zhu Zhu
> Priority: Major
> Labels: auto-unassigned, stale-major
>
> Flink currently use an InputSplitAssigner to dynamically assign input splits
> to source tasks. A task requests a new split after finishes processing the
> previous one. Thus to achieve a better load balance.
> However, in cases that the slots are fewer than the source tasks, only the
> first launched source tasks can request splits and it will last till all the
> splits are consumed. This is not failover friendly, as users sometimes
> intentionally set a larger parallelism to reduce the failover impact.
> For example, a job runs in an 10 slots session and it has an 1000 parallelism
> source vertex to consume 10000 splits, all vertices are not connected to
> others. Currently, 10 of 1000 source task will be launched and will only
> finish after all the input splits are consumed. If a task fails, at most
> ~1000 splits need to be re-processed. While if 1000 tasks can run at once,
> only ~10 splits needs to be re-processed.
>
> We's propose add a cap for the input splits count that each source task shall
> process. Once the cap is reached, the task cannot get any more split from the
> InputSplitAssigner and finishes then. Thus slot space can be made for other
> source tasks.
> Theoretically, it would be proper to set the cap to be max(Input split
> size)/avg(input split size).
--
This message was sent by Atlassian Jira
(v8.3.4#803005)