[
https://issues.apache.org/jira/browse/FLINK-38033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17988097#comment-17988097
]
Salva edited comment on FLINK-38033 at 7/3/25 6:41 AM:
-------------------------------------------------------
[~gyfora] Yep, I have had the new snapshot CRs enabled since v1.10, as noted before
(as well as periodic savepoints). Most probably the bug was introduced
[here|https://flink.apache.org/2025/06/03/apache-flink-kubernetes-operator-1.12.0-release-announcement/#bug-fixes-and-stability-enhancements]:
* *Savepoint Information Update*: Fixed a bug where upgrade savepoints
were not added to the deprecated {{savepointInfo}}, ensuring accurate
tracking of savepoints during upgrades.
I just saw your change. I couldn't test it, but it looks good. I just left this minor
[comment|https://github.com/apache/flink-kubernetes-operator/pull/987/commits/da8745f409babe0f6538d55fbac3f4340b96f59a#r2181918327]
(after the fact, sorry!).
Anyway, thanks for the quick fix in the absence of complete information! :)
> Job upgrades fail for operator v1.12
> ------------------------------------
>
> Key: FLINK-38033
> URL: https://issues.apache.org/jira/browse/FLINK-38033
> Project: Flink
> Issue Type: Bug
> Components: Autoscaler, Kubernetes Operator
> Affects Versions: kubernetes-operator-1.12.0
> Environment: * Flink 1.18.1
> * Flink Kubernetes Operator 1.12
> Reporter: Salva
> Assignee: Gyula Fora
> Priority: Critical
> Labels: pull-request-available
> Fix For: kubernetes-operator-1.13.0, kubernetes-operator-1.12.1
>
>
> I was running Flink Kubernetes Operator 1.11 and everything was fine.
> However, after upgrading to 1.12, the operator is misbehaving like this
> during job upgrades:
> * First, it takes a savepoint as usual
> * Then it uploads it to S3 as usual too
> * ...but after that, for some reason, it disposes/deletes it!
> * This makes the job upgrade fail
> I downgraded the operator to 1.11 and job upgrades started to work again.