Myasuka commented on issue #7009: [FLINK-10712] Support to restore state when using RestartPipelinedRegionStrategy URL: https://github.com/apache/flink/pull/7009#issuecomment-455623323 @StefanRRichter I have went through all production code using union state as below: For `ArtificalOperatorStateMapper`, `SimpleEndlessSourceWithBloatedState `, and `StateCreatingFlatMap`, they just use union state for end-to-end test verifying. For `FlinkKafkaProducer` and `FlinkKafkaProducer011`, the union `nextTransactionalIdHintState` is just the same for all sub-tasks. For `KafkaConsumer`, since Kafaka does support to decrease partition counts, the kafka partition is sticky to the subtask when parallslism not changed. I think above operators would not meet conflict case. For `SequenceGeneratorSource`, it would initialize state with max event time. However, `monotonousEventTime` only increase by fixed step of `eventTimeClockProgressPerEvent`, which should be the same for all sub-tasks. For `StreamingFileSink`, it also get the max counter, but from the defination of `StreamingFileSink`, when restoring from checkpoint, the restored files in `pending` state are transferred into `finished` state, while `in-progress` files are rolled back. In other words from my point, partial recovery is not suitable for `StreamingFileSink`. From my point of view, all operators in production code with union state, except `StreamingFileSink`, should work fine for partial restore. However, since the `getUnionList` API is public for users, we cannot control users' behavior. In a nutshell, if we support to restore state when using `RestartPipelinedRegionStrategy`, we should add limitation for union state. I plan to add another parameter, which might be `RecoveryMode`, in the `restoreLatestCheckpointedState` method. For `RestartAllStrategy` and overall only one region's `RestartPipelinedRegionStrategy`, it's `RecoveryMode.ALL`; for other failover strategies, it's `RecoveryMode.PARTIAL`. By means of this, when assigning state, if we found `RecoveryMode.PARTIAL` and union state existed, unsupported exception could be thrown out. What do you think?
---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected] With regards, Apache Git Services
