klion26 opened a new pull request #8617: [FLINK-12619][StateBackend]Support 
TERMINATE/SUSPEND Job with Checkpoint
URL: https://github.com/apache/flink/pull/8617
 
 
   ## What is the purpose of the change
   
   Inspired by the idea of 
[FLINK-11458](https://issues.apache.org/jira/browse/FLINK-11458), we propose to 
support terminating/suspending a job with a checkpoint. This improvement works 
together with the incremental and externalized checkpoint features: if 
checkpoints are retained and this feature is configured, we trigger a 
checkpoint before the job stops. This could accelerate job recovery 
considerably because:
   1. No source rewinding is required any more.
   2. It is much faster than taking a savepoint, since incremental 
checkpointing is enabled.
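   
   As a rough illustration of the prerequisites named above (not part of this 
PR), a cluster set up for incremental checkpoints might carry the following 
`flink-conf.yaml` entries. The checkpoint directory is a placeholder, retention 
of externalized checkpoints is assumed to be enabled in the job itself through 
`CheckpointConfig`, and the switch that turns on this PR's feature is not shown 
because the description does not name it.

```yaml
# Sketch only: prerequisites assumed by the description, not settings added by this PR.
state.backend: rocksdb                             # incremental checkpoints require the RocksDB backend
state.backend.incremental: true                    # upload only the changes since the previous checkpoint
state.checkpoints.dir: hdfs:///flink/checkpoints   # placeholder path for retained checkpoint data
```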
   
   Please note that savepoints are conceptually different from checkpoints, in 
much the same way that backups differ from recovery logs in traditional 
database systems. We therefore suggest using this feature only for job 
recovery, and sticking with FLINK-11458 for the 
upgrade/cross-cluster-job-migration/state-backend-switch cases.
   
   The current commit does not include the REST API; that will be done in the 
follow-up issue [FLINK-12733](https://issues.apache.org/jira/browse/FLINK-12733).
   
   ## Verifying this change
   
   This change added tests and can be verified as follows:
     - CliFrontendStopTest.java#{stopWithCheckpoint, 
testStopWithCheckpointWithMaxWM, testStopWithCheckpointAndSavepoint}
     - CheckpointPropertiesTest.java#testSyncCheckpoint
     - CheckpointTypeTest.java#testOrdinalsAreConstant
     - PendingCheckpointTest.java#testSyncCheckpointCannotBeSubsumed
     - CheckpointSerializationTest.java
     - EventSerializerTest.java#testIsEvent
     - SynchronousCheckpointITCase.java
     - SynchronousCheckpointTest.java
    
   ## Does this pull request potentially affect one of the following parts:
   
     - Dependencies (does it add or upgrade a dependency): (no)
     - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
     - The serializers: (no)
     - The runtime per-record code paths (performance sensitive): (no)
     - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
     - The S3 file system connector: (no)
   
   ## Documentation
   
     - Does this pull request introduce a new feature? (yes)
     - If yes, how is the feature documented? (JavaDocs)
   @aljoscha @StefanRRichter 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services
