StephanEwen opened a new pull request #14186:
URL: https://github.com/apache/flink/pull/14186


   ## What is the purpose of the change
   
   Previously, an `OperatorCoordinator` (and thus also a Source 
`SplitEnumerator`) got no signal after a global recovery in these cases:
     - The failure happened before the first successful checkpoint, so there 
was no checkpoint to restore to.
     - The coordinator had not stored any state in the checkpoint.
   
   However, a signal is important in those cases, because the coordinators 
might need to reset their state.
   
   Now, the coordinators are reset to an empty state in these cases after a 
failover.
   
   ## Brief change log
   
     - Minor adjustment in `CheckpointCoordinator` to differentiate between 
checkpoint/savepoint restore during JobManager startup (do nothing if no 
checkpoint or operator state is there) and checkpoint restore after failover 
(reset to empty state if no checkpoint or operator state is available).
     - `CheckpointCoordinator` restores empty state to `OperatorCoordinators` 
in the case of a restore after a failure.
     - Sources ignore empty restore and let a new `SplitEnumerator` be created. 
The `RecreateOnResetCoordinator` already handles the creation of a new 
`SplitEnumerator` on each restore.
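
The distinction between the two restore paths in the change log can be illustrated with a minimal sketch. Everything in it is hypothetical: `RestoreSketch`, `Coordinator`, `RecordingCoordinator`, and `restore` are stand-ins written for this illustration, not Flink classes, and using an empty byte array to mean "empty state" is an assumption rather than the encoding the PR uses.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the restore decision; none of these types are Flink classes. */
final class RestoreSketch {

    /** Stand-in for a coordinator that needs a signal when it must reset. */
    interface Coordinator {
        void resetToCheckpoint(byte[] state);
    }

    /** Records which kind of reset it received, so the behavior can be inspected. */
    static final class RecordingCoordinator implements Coordinator {
        final List<String> events = new ArrayList<>();

        @Override
        public void resetToCheckpoint(byte[] state) {
            // Assumption: an empty array stands in for "no state available".
            events.add(state.length == 0 ? "reset-empty" : "reset-with-state");
        }
    }

    /**
     * @param checkpointedState coordinator state from the latest checkpoint, or null if
     *     there is no checkpoint or the checkpoint holds no state for this coordinator
     * @param afterFailover true for a restore after a failure, false for a restore
     *     during JobManager startup
     */
    static void restore(Coordinator coordinator, byte[] checkpointedState, boolean afterFailover) {
        if (checkpointedState != null) {
            // State is available: restore it in both situations.
            coordinator.resetToCheckpoint(checkpointedState);
        } else if (afterFailover) {
            // Failover without state: still signal the coordinator, with empty state.
            coordinator.resetToCheckpoint(new byte[0]);
        }
        // Startup without state: do nothing.
    }

    public static void main(String[] args) {
        RecordingCoordinator coordinator = new RecordingCoordinator();
        restore(coordinator, null, false);              // startup, nothing to restore
        restore(coordinator, null, true);               // failover, nothing to restore
        restore(coordinator, new byte[] {1, 2}, true);  // failover with state
        System.out.println(coordinator.events);         // prints [reset-empty, reset-with-state]
    }
}
```

In this sketch the startup call leaves the coordinator untouched, while the failover call without state still produces a reset. That reset is the signal that was missing before this change.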
   
   ## Verifying this change
   
   This adds a unit test to `OperatorCoordinatorSchedulerTest` to verify the 
changed behavior.
   
   ## Does this pull request potentially affect one of the following parts:
   
     - Dependencies (does it add or upgrade a dependency): **no**
     - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: **no**
     - The serializers: **no**
     - The runtime per-record code paths (performance sensitive): **no**
     - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: **yes**
     - The S3 file system connector: **no**
   
   ## Documentation
   
     - Does this pull request introduce a new feature? **no**
     - If yes, how is the feature documented? **JavaDocs**
   



