fredia commented on code in PR #20420:
URL: https://github.com/apache/flink/pull/20420#discussion_r937394800
##########
flink-tests/src/test/java/org/apache/flink/test/checkpointing/RescaleCheckpointManuallyITCase.java:
##########
@@ -150,54 +137,33 @@ public void testCheckpointRescalingKeyedState(boolean
scaleOut) throws Exception
numberKeys,
numberElements2,
numberElements + numberElements2,
- client,
+ miniCluster,
checkpointPath);
}
- private static String runJobAndGetCheckpoint(
+ private String runJobAndGetCheckpoint(
int numberKeys,
int numberElements,
int parallelism,
int maxParallelism,
- ClusterClient<?> client,
- File checkpointDir)
+ MiniCluster miniCluster)
throws Exception {
try {
- Duration timeout = Duration.ofMinutes(5);
- Deadline deadline = Deadline.now().plus(timeout);
-
JobGraph jobGraph =
createJobGraphWithKeyedState(
- parallelism, maxParallelism, numberKeys,
numberElements, false, 100);
- client.submitJob(jobGraph).get();
-
- assertTrue(
- SubtaskIndexFlatMapper.workCompletedLatch.await(
- deadline.timeLeft().toMillis(),
TimeUnit.MILLISECONDS));
-
- // verify the current state
- Set<Tuple2<Integer, Integer>> actualResult =
CollectionSink.getElementsSet();
-
- Set<Tuple2<Integer, Integer>> expectedResult = new HashSet<>();
-
- for (int key = 0; key < numberKeys; key++) {
- int keyGroupIndex =
KeyGroupRangeAssignment.assignToKeyGroup(key, maxParallelism);
- expectedResult.add(
- Tuple2.of(
-
KeyGroupRangeAssignment.computeOperatorIndexForKeyGroup(
- maxParallelism, parallelism,
keyGroupIndex),
- numberElements * key));
- }
-
- assertEquals(expectedResult, actualResult);
-
- // ensure contents of state within SubtaskIndexFlatMapper are all
included
- // in the last checkpoint (refer to FLINK-26882 for more details).
-
cluster.getMiniCluster().triggerCheckpoint(jobGraph.getJobID()).get();
-
- client.cancel(jobGraph.getJobID()).get();
- TestUtils.waitUntilJobCanceled(jobGraph.getJobID(), client);
- return
TestUtils.getMostRecentCompletedCheckpoint(checkpointDir).getAbsolutePath();
+ parallelism,
+ maxParallelism,
+ numberKeys,
+ numberElements,
+ numberElements,
+ true,
+ 100);
+ miniCluster.submitJob(jobGraph).get();
+ miniCluster.requestJobResult(jobGraph.getJobID()).get();
+ // The elements may not all be sent to sink when unaligned
checkpoints enabled(refer to
+ // FLINK-26882 for more details).
+ // Don't verify current state here.
+ return getLatestCompletedCheckpointPath(jobGraph.getJobID(),
miniCluster).get();
Review Comment:
> Thus, we might not ensure all the data stored in the RocksDB
state-backend, some data might still stay in the channel
Yes, so I delete the verification before restoring.
> the returned result of getLatestCompletedCheckpointPath might be empty.
If the result is empty, it would throw an exception here. And I think the
probability is very small. Because:
1. `NotifyingDefiniteKeySource` wouldn't terminate until one checkpoint is
completed.
1. `getLatestCompletedCheckpointPath` uses
`CheckpointStatsSnapshot.getHistory()` to get checkpoint path, instead of
listing files under checkpointDir, as long as checkpoint is put into
`CheckpointStatsHistory` after completion, `getLatestCompletedCheckpointPath`
can get it.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]