[ https://issues.apache.org/jira/browse/FLINK-6742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16062282#comment-16062282 ]
ASF GitHub Bot commented on FLINK-6742: --------------------------------------- Github user gyfora commented on a diff in the pull request: https://github.com/apache/flink/pull/4083#discussion_r123895607 --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/savepoint/SavepointV2.java --- @@ -168,10 +168,27 @@ public static Savepoint convertToOperatorStateSavepointV2( expandedToLegacyIds = true; } + if (jobVertex == null) { + throw new IllegalStateException( + "Could not find task for state with ID " + taskState.getJobVertexID() + ". " + + "When migrating a savepoint from a version < 1.3 please make sure that the topology was not " + + "changed through removal of a stateful operator or modification of a chain containing a stateful " + + "operator."); + } + List<OperatorID> operatorIDs = jobVertex.getOperatorIDs(); for (int subtaskIndex = 0; subtaskIndex < jobVertex.getParallelism(); subtaskIndex++) { - SubtaskState subtaskState = taskState.getState(subtaskIndex); + SubtaskState subtaskState; + try { + subtaskState = taskState.getState(subtaskIndex); --- End diff -- Sorry for commenting late on this but I have had some major migration issues in the last few days :D I think we should explicitly compare parallelism instead of relying on the error: if (taskState.getStates().size() != jobVertex.getParallelism()) --> error Otherwise this will not fail on lower parallelism. > Improve error message when savepoint migration fails due to task removal > ------------------------------------------------------------------------ > > Key: FLINK-6742 > URL: https://issues.apache.org/jira/browse/FLINK-6742 > Project: Flink > Issue Type: Bug > Components: State Backends, Checkpointing > Affects Versions: 1.3.0 > Reporter: Gyula Fora > Assignee: Chesnay Schepler > Priority: Minor > Labels: flink-rel-1.3.1-blockers > > Caused by: java.lang.NullPointerException > at > org.apache.flink.runtime.checkpoint.savepoint.SavepointV2.convertToOperatorStateSavepointV2(SavepointV2.java:171) > at > org.apache.flink.runtime.checkpoint.savepoint.SavepointLoader.loadAndValidateSavepoint(SavepointLoader.java:75) > at > org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreSavepoint(CheckpointCoordinator.java:1090) -- This message was sent by Atlassian JIRA (v6.4.14#64029)