[
https://issues.apache.org/jira/browse/FLINK-22684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17346027#comment-17346027
]
Anton Kalashnikov commented on FLINK-22684:
-------------------------------------------
[~pnowojski], [~roman_khachatryan] A couple of questions:
* Should the new parameter be part of CheckpointConfig or SavepointConfig?
According to the initial problem, it should be SavepointConfig, but I see a
naming problem here: a savepoint does not actually contain any in-flight data,
so it would be strange for SavepointConfig to have an ignoreInFlightData property.
* Should this new property be something more complicated than just a boolean?
For example, it could be a more complex property that allows ignoring in-flight
data only for specific operators/subtasks. Initially, though, we could implement
only two options: NONE or ALL.
* How expensive is it in general to load metadata of the in-flight data? I
mean, initially, I thought it would make sense to load all the metadata as
usual and then, inside the CheckpointCoordinator, do some transformations as
needed. But now I think it might be expensive and it might be better to move
this logic deeper and not even load it from the storage.
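
A minimal sketch of the NONE/ALL idea from the second question, assuming
hypothetical names throughout: IgnoreInFlightData, SubtaskState and
applyOnRecovery are illustrative only and are not existing Flink classes or
methods. It shows the simpler variant from the third question, where everything
is loaded as usual and the in-flight data is dropped afterwards; the "deeper"
variant that never loads it from storage is not shown.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class IgnoreInFlightDataSketch {

    // Hypothetical option: NONE keeps everything, ALL drops all in-flight data.
    enum IgnoreInFlightData { NONE, ALL }

    // Hypothetical stand-in for one subtask's restored state; not a Flink class.
    static final class SubtaskState {
        final String operatorState;
        final List<String> inFlightData;

        SubtaskState(String operatorState, List<String> inFlightData) {
            this.operatorState = operatorState;
            this.inFlightData = inFlightData;
        }
    }

    // Returns the states to restore: unchanged for NONE, with the in-flight
    // data replaced by an empty list for ALL. Operator state is always kept.
    static List<SubtaskState> applyOnRecovery(
            List<SubtaskState> restored, IgnoreInFlightData mode) {
        if (mode == IgnoreInFlightData.NONE) {
            return restored;
        }
        List<SubtaskState> result = new ArrayList<>();
        for (SubtaskState state : restored) {
            result.add(new SubtaskState(
                    state.operatorState, Collections.<String>emptyList()));
        }
        return result;
    }

    public static void main(String[] args) {
        List<SubtaskState> restored = new ArrayList<>();
        restored.add(new SubtaskState("counter=42", Arrays.asList("buf-1", "buf-2")));

        for (IgnoreInFlightData mode : IgnoreInFlightData.values()) {
            SubtaskState s = applyOnRecovery(restored, mode).get(0);
            System.out.println(mode + ": state=" + s.operatorState
                    + ", inFlightBuffers=" + s.inFlightData.size());
        }
    }
}
```

An enum rather than a boolean leaves room to add a per-operator/subtask option
later without changing the type of the property.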
> Add the ability to ignore in-flight data on recovery
> ----------------------------------------------------
>
> Key: FLINK-22684
> URL: https://issues.apache.org/jira/browse/FLINK-22684
> Project: Flink
> Issue Type: Improvement
> Reporter: Anton Kalashnikov
> Priority: Major
>
> The main case:
> * We want to restore the last unaligned checkpoint.
> * In-flight data of this checkpoint is corrupted.
> * We want to ignore this corrupted data and restore only states.
> The idea is to add a new configuration parameter ('ignoreInFlightDataOnRecovery'
> or similar). If it is set to true, the metadata of in-flight data is ignored on
> the Checkpoint Coordinator side.