Rui Fan created FLINK-39140:
-------------------------------
Summary: Enhance Unaligned Checkpoint ITCases to perform
checkpointing during recovery
Key: FLINK-39140
URL: https://issues.apache.org/jira/browse/FLINK-39140
Project: Flink
Issue Type: Sub-task
Reporter: Rui Fan
Assignee: Rui Fan
The current Unaligned Checkpoint ITCases restart only once, from a normal
checkpoint. They do not cover restoring from a checkpoint produced during the
recovery phase, which is the key scenario for checkpointing during recovery.
*Proposed mechanism:* After restoring from a checkpoint, wait for the first new
checkpoint to be produced, then immediately trigger a restart from it. Repeat
for a configurable number of rounds (≥ 2). Whether to rescale depends on the
specific test case.
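The loop described above can be sketched roughly as follows. This is only an illustration of the control flow, not the actual test code: {{JobHarness}}, {{FakeHarness}}, {{runRounds}} and the {{chk-N}} paths are hypothetical names invented for this sketch and are not Flink APIs. A real ITCase would drive a MiniCluster instead of the fake harness.

```java
import java.util.ArrayList;
import java.util.List;

class RepeatedRestoreSketch {

    /** Hypothetical stand-in for the test harness; not a Flink API. */
    interface JobHarness {
        /** Starts the job, restoring from the given checkpoint path. */
        void start(String restorePath, int parallelism);

        /** Waits for the first checkpoint completed after the latest start. */
        String awaitFirstNewCheckpoint();

        /** Stops the running job so the next round can restore. */
        void stop();
    }

    /**
     * Restores from a checkpoint, waits for the first new checkpoint, then
     * restarts from it, for the given number of rounds. Returns the checkpoint
     * each round restored from. Passing a different parallelism per round
     * models a test case that rescales; a constant one models no rescaling.
     */
    static List<String> runRounds(
            JobHarness harness, String initialCheckpoint, int rounds, int[] parallelismPerRound) {
        if (rounds < 2) {
            throw new IllegalArgumentException("rounds must be >= 2");
        }
        if (parallelismPerRound.length != rounds) {
            throw new IllegalArgumentException("need one parallelism per round");
        }
        List<String> restoredFrom = new ArrayList<>();
        String checkpoint = initialCheckpoint;
        for (int round = 0; round < rounds; round++) {
            harness.start(checkpoint, parallelismPerRound[round]);
            restoredFrom.add(checkpoint);
            // The first checkpoint after a restore is the one that may have
            // been taken while recovery was still in progress.
            checkpoint = harness.awaitFirstNewCheckpoint();
            harness.stop();
        }
        return restoredFrom;
    }

    /** Fake harness that hands out chk-1, chk-2, ... and records calls. */
    static final class FakeHarness implements JobHarness {
        private int checkpointCounter = 0;
        final List<String> log = new ArrayList<>();

        @Override
        public void start(String restorePath, int parallelism) {
            log.add("start(restore=" + restorePath + ", p=" + parallelism + ")");
        }

        @Override
        public String awaitFirstNewCheckpoint() {
            checkpointCounter++;
            return "chk-" + checkpointCounter;
        }

        @Override
        public void stop() {
            log.add("stop");
        }
    }

    public static void main(String[] args) {
        FakeHarness harness = new FakeHarness();
        // Three rounds, rescaling 2 -> 4 -> 2 between restores.
        System.out.println(runRounds(harness, "chk-0", 3, new int[] {2, 4, 2}));
    }
}
```

In this sketch every round after the first restores from a checkpoint that was itself taken right after a restore, which is the scenario the sub-task wants the ITCases to exercise.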
This mechanism already works on the current master, where it validates recovery
from normal checkpoints. Once checkpointing during recovery is enabled, the same
tests automatically cover recovery-phase checkpoint scenarios.
h2. Affected ITCases
* UnalignedCheckpointRescaleITCase
* UnalignedCheckpointRescaleWithMixedExchangesITCase
* UnalignedCheckpointITCase
* UnalignedCheckpointCompatibilityITCase
* UnalignedCheckpointStressITCase
* UnalignedCheckpointFailureHandlingITCase