Lijie Wang created FLINK-33968:
----------------------------------

             Summary: Compute the number of subpartitions when initializing 
executon job vertices
                 Key: FLINK-33968
                 URL: https://issues.apache.org/jira/browse/FLINK-33968
             Project: Flink
          Issue Type: Sub-task
          Components: Runtime / Coordination
            Reporter: Lijie Wang
            Assignee: Lijie Wang


Currently, when using dynamic graphs, the subpartition-num of a task is lazily 
calculated until the task deployment moment, this may lead to some 
uncertainties in job recovery scenarios.

Before jm crashs, when deploying upstream tasks, the parallelism of downstream 
vertex may be unknown, so the subpartiton-num will be the max parallelism of 
downstream job vertex. However, after jm restarts, when deploying upstream 
tasks, the parallelism of downstream job vertex may be known(has been 
calculated before jm crashs and been recovered after jm restarts), so the 
subpartiton-num will be the actual parallelism of downstream job vertex.
 
The difference of calculated subpartition-num will lead to the partitions 
generated before jm crashs cannot be reused after jm restarts.

We will solve this problem by advancing the calculation of subpartitoin-num to 
the moment of initializing executon job vertex (in ctor of 
IntermediateResultPartition)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to