[
https://issues.apache.org/jira/browse/FLINK-33968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Zhu Zhu closed FLINK-33968.
---------------------------
Fix Version/s: 1.19.0
Resolution: Done
master/1.19: b25dfaee80727d6662a5fd445fe51cc139a8b9eb
> Compute the number of subpartitions when initializing executon job vertices
> ---------------------------------------------------------------------------
>
> Key: FLINK-33968
> URL: https://issues.apache.org/jira/browse/FLINK-33968
> Project: Flink
> Issue Type: Sub-task
> Components: Runtime / Coordination
> Reporter: Lijie Wang
> Assignee: Lijie Wang
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.19.0
>
>
> Currently, when using dynamic graphs, the subpartition-num of a task is
> lazily calculated until the task deployment moment, this may lead to some
> uncertainties in job recovery scenarios:
> Before jm crashs, when deploying upstream tasks, the parallelism of
> downstream vertex may be unknown, so the subpartiton-num will be the max
> parallelism of downstream job vertex. However, after jm restarts, when
> deploying upstream tasks, the parallelism of downstream job vertex may be
> known(has been calculated before jm crashs and been recovered after jm
> restarts), so the subpartiton-num will be the actual parallelism of
> downstream job vertex. The difference of calculated subpartition-num will
> lead to the partitions generated before jm crashs cannot be reused after jm
> restarts.
> We will solve this problem by advancing the calculation of subpartitoin-num
> to the moment of initializing executon job vertex (in ctor of
> IntermediateResultPartition)
--
This message was sent by Atlassian Jira
(v8.20.10#820010)