[ https://issues.apache.org/jira/browse/FLINK-26577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17511705#comment-17511705 ]
Yang Wang commented on FLINK-26577:
-----------------------------------
Now the question is what the correct behavior is when the upgrade mode changes.
Stateless -> Savepoint: trigger a savepoint and redeploy
Stateless -> LastState: trigger a savepoint and redeploy; HA should be enabled
Savepoint -> LastState: trigger a savepoint and redeploy; HA should be enabled
Savepoint -> Stateless: is the state loss expected? Or should we trigger a
  savepoint and restore from it?
LastState -> Stateless: is the state loss expected? Or should we keep the HA
  ConfigMap and restore? Or trigger a savepoint and restore from it?
LastState -> Savepoint: trigger a savepoint and redeploy, since HA might be
  disabled
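
To make the table above easier to discuss, here is a rough sketch of it as a
lookup. This is only an illustration: the class, enum and method names are
made up for this comment and are not the operator's actual API. The two
"-> Stateless" transitions are left as UNDECIDED because they are still open
questions.

```java
// Hypothetical sketch of the transition table; none of these names exist in
// the Flink Kubernetes Operator code base.
final class UpgradeModeTransitions {

    enum UpgradeMode { STATELESS, SAVEPOINT, LAST_STATE }

    enum SuspendAction {
        TRIGGER_SAVEPOINT_AND_REDEPLOY,
        UNDECIDED // open question: accept state loss, or preserve state somehow
    }

    // Returns the proposed suspend action when the upgrade mode changes.
    static SuspendAction suspendAction(UpgradeMode from, UpgradeMode to) {
        if (from == to) {
            throw new IllegalArgumentException("Upgrade mode did not change: " + from);
        }
        // Savepoint -> Stateless and LastState -> Stateless are not settled yet.
        if (to == UpgradeMode.STATELESS) {
            return SuspendAction.UNDECIDED;
        }
        // Every other transition in the table takes a savepoint and redeploys.
        return SuspendAction.TRIGGER_SAVEPOINT_AND_REDEPLOY;
    }

    // Per the table, HA should be enabled whenever the target mode is LastState.
    static boolean targetRequiresHa(UpgradeMode to) {
        return to == UpgradeMode.LAST_STATE;
    }

    public static void main(String[] args) {
        // Print the whole table for every pair of distinct modes.
        for (UpgradeMode from : UpgradeMode.values()) {
            for (UpgradeMode to : UpgradeMode.values()) {
                if (from != to) {
                    System.out.println(from + " -> " + to + ": "
                            + suspendAction(from, to)
                            + (targetRequiresHa(to) ? " (HA should be enabled)" : ""));
                }
            }
        }
    }
}
```

Keying the decision on both the previous and the target mode keeps the open
cases explicit, so they can be filled in once we agree on the behavior.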
> Avoid state loss when switching to last-state upgrade mode
> ----------------------------------------------------------
>
> Key: FLINK-26577
> URL: https://issues.apache.org/jira/browse/FLINK-26577
> Project: Flink
> Issue Type: Sub-task
> Components: Kubernetes Operator
> Reporter: Gyula Fora
> Assignee: Yang Wang
> Priority: Major
>
> At the moment there are several corner cases which can lead to accidental
> state loss (or at least weird behaviour) when switching to last-state upgrade
> mode from other modes.
> 2 cases that immediately come to mind:
> savepoint to last-state:
> When the new upgrade mode is last-state, the job deployment will simply be
> deleted. If HA was not enabled previously, the last savepoint might be very
> far back in time.
> stateless to last-state:
> If checkpointing and HA are not enabled, the deployment will simply be killed
> as before, and we might start the job from empty state. Maybe taking a
> savepoint would be the right approach in this case, continuing from there.
> Maybe when switching between modes we should consider the previous mode as
> well as the target mode when deciding on the suspend strategy. We could
> also simply disallow switching to last-state if HA was not enabled previously,
> but that might be too restrictive.