dawidwys commented on a change in pull request #16655:
URL: https://github.com/apache/flink/pull/16655#discussion_r683172235



##########
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/FinishedOperatorSubtaskState.java
##########
@@ -0,0 +1,52 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.checkpoint;
+
+/**
+ * A specialized {@link OperatorSubtaskState} used to mark the finished subtasks in the snapshot of
+ * this operator.
+ */
+public class FinishedOperatorSubtaskState extends OperatorSubtaskState {
+
+    private static final long serialVersionUID = 7206415348825695023L;

Review comment:
       Just use `1L`. Please take a look at our coding guidelines for an explanation.
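
For illustration, a minimal sketch of the convention this comment asks for, assuming the guideline is to start the serial version UID at `1L` rather than use a generated value. `FinishedMarker` and `declaredUid` are hypothetical names, not part of the PR.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

// Hypothetical stand-in for a serializable state class; not the class from the PR.
final class FinishedMarker implements Serializable {

    // Start at 1L instead of an IDE-generated hash value.
    private static final long serialVersionUID = 1L;

    // Returns the UID that the Java serialization runtime sees for this class.
    static long declaredUid() {
        return ObjectStreamClass.lookup(FinishedMarker.class).getSerialVersionUID();
    }
}
```

A small, hand-written value makes it obvious when the UID was changed on purpose.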

##########
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/metadata/MetadataV3Serializer.java
##########
@@ -114,8 +115,14 @@ protected void serializeOperatorState(OperatorState operatorState, DataOutputStr
                     operatorState.getSubtaskStates();
             dos.writeInt(subtaskStateMap.size());
             for (Map.Entry<Integer, OperatorSubtaskState> entry : subtaskStateMap.entrySet()) {
-                dos.writeInt(entry.getKey());
-                serializeSubtaskState(entry.getValue(), dos);
+                if (entry.getValue().isFinished()) {
+                    // We store a negative index for the finished subtask. In consideration
+                    // of the index 0, the negative index would start from -1.
+                    dos.writeInt(-(entry.getKey() + 1));

Review comment:
       That's a bit too much magic for my taste :(

   It kind of worked for `Operator`, where it meant the number of enclosed states. Here I find it too complex, especially with the offsetting logic. Unfortunately, I think we need to adjust the metadata version.
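
To make the concern concrete, here is a small sketch of the offset arithmetic in the quoted diff: a finished subtask is written as `-(index + 1)`, so every reader has to test the sign and undo the offset. The class and method names are hypothetical and only illustrate the encoding.

```java
// Illustrative only: mirrors the arithmetic of the quoted diff, not Flink's serializer.
final class NegativeIndexEncoding {

    // Finished subtasks are stored as -(index + 1), so index 0 maps to -1.
    static int encode(int subtaskIndex, boolean finished) {
        return finished ? -(subtaskIndex + 1) : subtaskIndex;
    }

    // A negative stored value marks a finished subtask.
    static boolean isFinished(int stored) {
        return stored < 0;
    }

    // Undoes the offset for finished subtasks; running ones are stored as-is.
    static int decodeIndex(int stored) {
        return stored < 0 ? -stored - 1 : stored;
    }
}
```

The index field carries two meanings at once, which is the "magic" being objected to.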

##########
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/FinishedOperatorSubtaskState.java
##########
@@ -0,0 +1,52 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.checkpoint;
+
+/**
+ * A specialized {@link OperatorSubtaskState} used to mark the finished subtasks in the snapshot of
+ * this operator.
+ */
+public class FinishedOperatorSubtaskState extends OperatorSubtaskState {

Review comment:
       Do we need a separate class? Could we just add a flag to the current 
class?
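
A rough sketch of what the flag alternative could look like, using a simplified, hypothetical `SubtaskStateSketch` rather than the real `OperatorSubtaskState` (whose fields and constructors are not shown in this diff):

```java
// Hypothetical, simplified stand-in; not Flink's OperatorSubtaskState.
final class SubtaskStateSketch {

    // The flag replaces the dedicated subclass.
    private final boolean finished;

    private SubtaskStateSketch(boolean finished) {
        this.finished = finished;
    }

    // State of a subtask that is still running.
    static SubtaskStateSketch running() {
        return new SubtaskStateSketch(false);
    }

    // State of a subtask that has already finished.
    static SubtaskStateSketch finished() {
        return new SubtaskStateSketch(true);
    }

    boolean isFinished() {
        return finished;
    }
}
```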

##########
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/PendingCheckpoint.java
##########
@@ -347,6 +347,7 @@ public CompletedCheckpoint finalizeCheckpoint(
                 }
 
                 fulfillFullyFinishedOperatorStates();
+                fulfillSubtaskStateForPartlyFinishedOperators();

Review comment:
       `Partly` -> `Partially`

##########
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/VertexFinishedStateChecker.java
##########
@@ -0,0 +1,216 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.checkpoint;
+
+import org.apache.flink.annotation.VisibleForTesting;
+import org.apache.flink.runtime.OperatorIDPair;
+import org.apache.flink.runtime.executiongraph.ExecutionJobVertex;
+import org.apache.flink.runtime.executiongraph.IntermediateResult;
+import org.apache.flink.runtime.jobgraph.DistributionPattern;
+import org.apache.flink.runtime.jobgraph.JobEdge;
+import org.apache.flink.runtime.jobgraph.JobVertexID;
+import org.apache.flink.runtime.jobgraph.OperatorID;
+import org.apache.flink.util.FlinkRuntimeException;
+
+import java.util.HashMap;
+import java.util.Map;
+import java.util.Optional;
+import java.util.Set;
+import java.util.stream.Collectors;
+
+/**
+ * This class encapsulates the operation that checks if there are illegal modification to the
+ * JobGraph when restoring from a checkpoint with partly or fully finished operator states.
+ *
+ * <p>As a whole, it ensures
+ *
+ * <ol>
+ *   <li>All the operators inside a JobVertex have the same finished state.
+ *   <li>The predecessors of a fully finished vertex must also be fully finished.
+ *   <li>The processors of a partly finished vertex
+ *       <ul>
+ *         <li>If connected via ALL_TO_ALL edge, the predecessor must be fully finished.
+ *         <li>If connected via POINTWISE edge, the predecessor must be partly finished or fully
+ *             finished.
+ *       </ul>
+ * </ol>
+ */
+public class VertexFinishedStateChecker {
+
+    private final Set<ExecutionJobVertex> vertices;
+
+    private final Map<OperatorID, OperatorState> operatorStates;
+
+    public VertexFinishedStateChecker(
+            Set<ExecutionJobVertex> vertices, Map<OperatorID, OperatorState> operatorStates) {
+        this.vertices = vertices;
+        this.operatorStates = operatorStates;
+    }
+
+    public void validateOperatorsFinishedState() {
+        VerticesFinishedStatusCache verticesFinishedCache =
+                new VerticesFinishedStatusCache(operatorStates);
+        for (ExecutionJobVertex vertex : vertices) {
+            VertexFinishedState vertexFinishedState = verticesFinishedCache.getOrUpdate(vertex);
+
+            if (vertexFinishedState == VertexFinishedState.FULLY_FINISHED) {
+                checkProcessorsOfFullyFinishedVertex(vertex, verticesFinishedCache);

Review comment:
       Could you check the naming? What does the `processors` mean? Did you 
mean `predecessors`?

##########
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/VertexFinishedStateChecker.java
##########
@@ -0,0 +1,216 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.checkpoint;
+
+import org.apache.flink.annotation.VisibleForTesting;
+import org.apache.flink.runtime.OperatorIDPair;
+import org.apache.flink.runtime.executiongraph.ExecutionJobVertex;
+import org.apache.flink.runtime.executiongraph.IntermediateResult;
+import org.apache.flink.runtime.jobgraph.DistributionPattern;
+import org.apache.flink.runtime.jobgraph.JobEdge;
+import org.apache.flink.runtime.jobgraph.JobVertexID;
+import org.apache.flink.runtime.jobgraph.OperatorID;
+import org.apache.flink.util.FlinkRuntimeException;
+
+import java.util.HashMap;
+import java.util.Map;
+import java.util.Optional;
+import java.util.Set;
+import java.util.stream.Collectors;
+
+/**
+ * This class encapsulates the operation that checks if there are illegal modification to the
+ * JobGraph when restoring from a checkpoint with partly or fully finished operator states.
+ *
+ * <p>As a whole, it ensures
+ *
+ * <ol>
+ *   <li>All the operators inside a JobVertex have the same finished state.
+ *   <li>The predecessors of a fully finished vertex must also be fully finished.
+ *   <li>The processors of a partly finished vertex
+ *       <ul>
+ *         <li>If connected via ALL_TO_ALL edge, the predecessor must be fully finished.
+ *         <li>If connected via POINTWISE edge, the predecessor must be partly finished or fully
+ *             finished.
+ *       </ul>
+ * </ol>
+ */
+public class VertexFinishedStateChecker {
+
+    private final Set<ExecutionJobVertex> vertices;
+
+    private final Map<OperatorID, OperatorState> operatorStates;
+
+    public VertexFinishedStateChecker(
+            Set<ExecutionJobVertex> vertices, Map<OperatorID, OperatorState> operatorStates) {
+        this.vertices = vertices;
+        this.operatorStates = operatorStates;
+    }
+
+    public void validateOperatorsFinishedState() {
+        VerticesFinishedStatusCache verticesFinishedCache =
+                new VerticesFinishedStatusCache(operatorStates);
+        for (ExecutionJobVertex vertex : vertices) {
+            VertexFinishedState vertexFinishedState = verticesFinishedCache.getOrUpdate(vertex);
+
+            if (vertexFinishedState == VertexFinishedState.FULLY_FINISHED) {
+                checkProcessorsOfFullyFinishedVertex(vertex, verticesFinishedCache);
+            } else if (vertexFinishedState == VertexFinishedState.PARTLY_FINISHED) {
+                checkProcessorsOfPartlyFinishedVertex(vertex, verticesFinishedCache);
+            }
+        }
+    }
+
+    private void checkProcessorsOfFullyFinishedVertex(
+            ExecutionJobVertex vertex, VerticesFinishedStatusCache verticesFinishedStatusCache) {
+        boolean allPredecessorsFinished =
+                vertex.getInputs().stream()
+                        .map(IntermediateResult::getProducer)
+                        .allMatch(
+                                jobVertex ->
+                                        verticesFinishedStatusCache.getOrUpdate(jobVertex)
+                                                == VertexFinishedState.FULLY_FINISHED);
+
+        if (!allPredecessorsFinished) {
+            throw new FlinkRuntimeException(
+                    "Illegal JobGraph modification. Cannot run a program with fully finished"
+                            + " vertices predeceased with the ones not fully finished. Task vertex "
+                            + vertex.getName()
+                            + "("
+                            + vertex.getJobVertexId()
+                            + ")"
+                            + " has a predecessor not fully finished");
+        }
+    }
+
+    private void checkProcessorsOfPartlyFinishedVertex(
+            ExecutionJobVertex vertex, VerticesFinishedStatusCache verticesFinishedStatusCache) {
+        // Computes the distribution pattern from the predecessors. If there are multiple edges,
+        // ALL_TO_ALL edges would have a higher priority.
+        Map<JobVertexID, DistributionPattern> predecessorDistribution = new HashMap<>();
+        for (JobEdge jobEdge : vertex.getJobVertex().getInputs()) {
+            predecessorDistribution.compute(
+                    jobEdge.getSource().getProducer().getID(),
+                    (k, v) ->
+                            v == DistributionPattern.ALL_TO_ALL
+                                    ? v
+                                    : jobEdge.getDistributionPattern());
+        }
+
+        for (IntermediateResult dataset : vertex.getInputs()) {
+            ExecutionJobVertex predecessor = dataset.getProducer();
+            VertexFinishedState predecessorState =
+                    verticesFinishedStatusCache.getOrUpdate(predecessor);
+            DistributionPattern distribution =
+                    predecessorDistribution.get(predecessor.getJobVertexId());
+
+            if (distribution == DistributionPattern.ALL_TO_ALL
+                    && predecessorState != VertexFinishedState.FULLY_FINISHED) {
+                throw new FlinkRuntimeException(
+                        "Illegal JobGraph modification. Cannot run a program with partly finished"
+                                + " vertices predeceased with running or partly finished ones and"
+                                + " connected via the ALL_TO_ALL edges. Task vertex "
+                                + vertex.getName()
+                                + "("
+                                + vertex.getJobVertexId()
+                                + ")"
+                                + " has a "
+                                + (predecessorState == VertexFinishedState.ALL_RUNNING
+                                        ? "all running"
+                                        : "partly finished")
+                                + " predecessor");
+            } else if (distribution == DistributionPattern.POINTWISE
+                    && predecessorState == VertexFinishedState.ALL_RUNNING) {
+                throw new FlinkRuntimeException(
+                        "Illegal JobGraph modification. Cannot run a program with partly finished"
+                                + " vertices predeceased with all running ones. Task vertex "
+                                + vertex.getName()
+                                + "("
+                                + vertex.getJobVertexId()
+                                + ")"
+                                + " has a all running predecessor");
+            }
+        }
+    }
+
+    @VisibleForTesting
+    enum VertexFinishedState {
+        ALL_RUNNING,
+        PARTLY_FINISHED,
+        FULLY_FINISHED
+    }
+
+    private static class VerticesFinishedStatusCache {
+        private final Map<OperatorID, OperatorState> operatorStates;
+        private final Map<JobVertexID, VertexFinishedState> finishedCache = new HashMap<>();
+
+        private VerticesFinishedStatusCache(Map<OperatorID, OperatorState> operatorStates) {
+            this.operatorStates = operatorStates;
+        }
+
+        public VertexFinishedState getOrUpdate(ExecutionJobVertex vertex) {
+            return finishedCache.computeIfAbsent(
+                    vertex.getJobVertexId(),
+                    ignored -> calculateFinishedState(vertex, operatorStates));
+        }
+
+        private VertexFinishedState calculateFinishedState(
+                ExecutionJobVertex vertex, Map<OperatorID, OperatorState> operatorStates) {
+            Set<VertexFinishedState> operatorFinishedStates =
+                    vertex.getOperatorIDs().stream()
+                            .map(idPair -> checkOperatorFinishedStatus(operatorStates, idPair))
+                            .collect(Collectors.toSet());
+            if (operatorFinishedStates.size() > 1) {

Review comment:
       nit: `!= 1`?
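
A short sketch of the difference between the two conditions, using a hypothetical `State` enum in place of `VertexFinishedState`. Whether the empty set can actually occur in the PR depends on whether a vertex can have no operator IDs, which is an assumption here.

```java
import java.util.Set;

// Illustrative only: compares the two guard conditions on the collected state set.
final class FinishedStateSetCheck {

    enum State { ALL_RUNNING, PARTLY_FINISHED, FULLY_FINISHED }

    // Guard written as `size() > 1`: only a mixed set is rejected, an empty set slips through.
    static boolean rejectedByGreaterThanOne(Set<State> states) {
        return states.size() > 1;
    }

    // Guard written as `size() != 1`: both a mixed set and an empty set are rejected.
    static boolean rejectedByNotEqualOne(Set<State> states) {
        return states.size() != 1;
    }
}
```

With `!= 1`, the code after the guard can safely take the single element of the set.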

##########
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/metadata/MetadataV3Serializer.java
##########
@@ -114,8 +115,14 @@ protected void serializeOperatorState(OperatorState operatorState, DataOutputStr
                     operatorState.getSubtaskStates();
             dos.writeInt(subtaskStateMap.size());
             for (Map.Entry<Integer, OperatorSubtaskState> entry : subtaskStateMap.entrySet()) {
-                dos.writeInt(entry.getKey());
-                serializeSubtaskState(entry.getValue(), dos);
+                if (entry.getValue().isFinished()) {
+                    // We store a negative index for the finished subtask. In consideration
+                    // of the index 0, the negative index would start from -1.
+                    dos.writeInt(-(entry.getKey() + 1));

Review comment:
       @pnowojski @StephanEwen What do you think about adjusting the metadata 
format to include the finished flag?
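
As a rough idea of what an explicit flag could look like on the wire, here is a sketch that writes the subtask index followed by a boolean. The layout `[int index][boolean finished]` and all names are assumptions for illustration; this is not Flink's actual metadata format or a proposal for the exact field order.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical layout: [int subtaskIndex][boolean finished].
final class FinishedFlagFormat {

    // Writes the index unchanged and the finished state as its own field.
    static byte[] write(int subtaskIndex, boolean finished) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream dos = new DataOutputStream(bytes)) {
            dos.writeInt(subtaskIndex);
            dos.writeBoolean(finished);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads the index; no sign test or offset is needed.
    static int readIndex(byte[] data) {
        try (DataInputStream dis = new DataInputStream(new ByteArrayInputStream(data))) {
            return dis.readInt();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Skips the index and reads the explicit flag.
    static boolean readFinished(byte[] data) {
        try (DataInputStream dis = new DataInputStream(new ByteArrayInputStream(data))) {
            dis.readInt();
            return dis.readBoolean();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The cost is one extra byte per subtask entry and a new format version, in exchange for an index field that always means the index.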

##########
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/VertexFinishedStateChecker.java
##########
@@ -0,0 +1,216 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.checkpoint;
+
+import org.apache.flink.annotation.VisibleForTesting;
+import org.apache.flink.runtime.OperatorIDPair;
+import org.apache.flink.runtime.executiongraph.ExecutionJobVertex;
+import org.apache.flink.runtime.executiongraph.IntermediateResult;
+import org.apache.flink.runtime.jobgraph.DistributionPattern;
+import org.apache.flink.runtime.jobgraph.JobEdge;
+import org.apache.flink.runtime.jobgraph.JobVertexID;
+import org.apache.flink.runtime.jobgraph.OperatorID;
+import org.apache.flink.util.FlinkRuntimeException;
+
+import java.util.HashMap;
+import java.util.Map;
+import java.util.Optional;
+import java.util.Set;
+import java.util.stream.Collectors;
+
+/**
+ * This class encapsulates the operation that checks if there are illegal modification to the
+ * JobGraph when restoring from a checkpoint with partly or fully finished operator states.
+ *
+ * <p>As a whole, it ensures
+ *
+ * <ol>
+ *   <li>All the operators inside a JobVertex have the same finished state.
+ *   <li>The predecessors of a fully finished vertex must also be fully finished.
+ *   <li>The processors of a partly finished vertex
+ *       <ul>
+ *         <li>If connected via ALL_TO_ALL edge, the predecessor must be fully finished.
+ *         <li>If connected via POINTWISE edge, the predecessor must be partly finished or fully
+ *             finished.
+ *       </ul>
+ * </ol>
+ */
+public class VertexFinishedStateChecker {
+
+    private final Set<ExecutionJobVertex> vertices;
+
+    private final Map<OperatorID, OperatorState> operatorStates;
+
+    public VertexFinishedStateChecker(
+            Set<ExecutionJobVertex> vertices, Map<OperatorID, OperatorState> operatorStates) {
+        this.vertices = vertices;
+        this.operatorStates = operatorStates;
+    }
+
+    public void validateOperatorsFinishedState() {
+        VerticesFinishedStatusCache verticesFinishedCache =
+                new VerticesFinishedStatusCache(operatorStates);
+        for (ExecutionJobVertex vertex : vertices) {
+            VertexFinishedState vertexFinishedState = verticesFinishedCache.getOrUpdate(vertex);
+
+            if (vertexFinishedState == VertexFinishedState.FULLY_FINISHED) {
+                checkProcessorsOfFullyFinishedVertex(vertex, verticesFinishedCache);
+            } else if (vertexFinishedState == VertexFinishedState.PARTLY_FINISHED) {
+                checkProcessorsOfPartlyFinishedVertex(vertex, verticesFinishedCache);
+            }
+        }
+    }
+
+    private void checkProcessorsOfFullyFinishedVertex(
+            ExecutionJobVertex vertex, VerticesFinishedStatusCache verticesFinishedStatusCache) {
+        boolean allPredecessorsFinished =
+                vertex.getInputs().stream()
+                        .map(IntermediateResult::getProducer)
+                        .allMatch(
+                                jobVertex ->
+                                        verticesFinishedStatusCache.getOrUpdate(jobVertex)
+                                                == VertexFinishedState.FULLY_FINISHED);
+
+        if (!allPredecessorsFinished) {
+            throw new FlinkRuntimeException(
+                    "Illegal JobGraph modification. Cannot run a program with fully finished"
+                            + " vertices predeceased with the ones not fully finished. Task vertex "
+                            + vertex.getName()
+                            + "("
+                            + vertex.getJobVertexId()
+                            + ")"
+                            + " has a predecessor not fully finished");
+        }
+    }
+
+    private void checkProcessorsOfPartlyFinishedVertex(
+            ExecutionJobVertex vertex, VerticesFinishedStatusCache verticesFinishedStatusCache) {
+        // Computes the distribution pattern from the predecessors. If there are multiple edges,
+        // ALL_TO_ALL edges would have a higher priority.
+        Map<JobVertexID, DistributionPattern> predecessorDistribution = new HashMap<>();
+        for (JobEdge jobEdge : vertex.getJobVertex().getInputs()) {
+            predecessorDistribution.compute(
+                    jobEdge.getSource().getProducer().getID(),
+                    (k, v) ->
+                            v == DistributionPattern.ALL_TO_ALL

Review comment:
       Is this for the case where a vertex is connected twice to the current vertex? If so, could you add a comment saying so here?
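
For reference, a self-contained sketch of how the `compute` call in the quoted diff behaves when the same producer appears on two edges: once `ALL_TO_ALL` has been recorded for a producer it is kept, otherwise the current edge's pattern is stored. `Edge`, `Pattern`, and the string ids are hypothetical stand-ins for `JobEdge`, `DistributionPattern`, and `JobVertexID`.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: reproduces the merge rule with simplified types.
final class EdgePatternMerge {

    enum Pattern { ALL_TO_ALL, POINTWISE }

    // Hypothetical edge: the producing vertex id and the edge's distribution pattern.
    static final class Edge {
        final String producerId;
        final Pattern pattern;

        Edge(String producerId, Pattern pattern) {
            this.producerId = producerId;
            this.pattern = pattern;
        }
    }

    // Collapses multiple edges from the same producer; ALL_TO_ALL wins regardless of order.
    static Map<String, Pattern> merge(List<Edge> edges) {
        Map<String, Pattern> result = new HashMap<>();
        for (Edge edge : edges) {
            result.compute(
                    edge.producerId,
                    (k, v) -> v == Pattern.ALL_TO_ALL ? v : edge.pattern);
        }
        return result;
    }
}
```

A producer connected through one `POINTWISE` and one `ALL_TO_ALL` edge ends up as `ALL_TO_ALL` in either edge order, so the stricter check applies to it.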
   

##########
File path: 
flink-runtime/src/test/java/org/apache/flink/runtime/checkpoint/CheckpointCoordinatorRestoringTest.java
##########
@@ -1213,31 +1213,153 @@ public void testRestoringPartiallyFinishedChainsFails() throws Exception {
                         + "anon("
                         + jobVertexID1
                         + ")"
-                        + " which contain both finished and unfinished operators");
+                        + " which contain mixed operator finished state: [ALL_RUNNING, FULLY_FINISHED]");
         coord.restoreLatestCheckpointedStateToAll(vertices, false);
     }
 
     @Test
     public void testAddingRunningOperatorBeforeFinishedOneFails() throws Exception {

Review comment:
       Could we also have a test case for the situation where a single operator is connected via two different distribution patterns? That is quite an uncommon and very specific scenario, which in my mind is worth checking.



