[ 
https://issues.apache.org/jira/browse/FLINK-38033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17988097#comment-17988097
 ] 

Salva edited comment on FLINK-38033 at 7/3/25 6:41 AM:
-------------------------------------------------------

[~gyfora] Yep, I have had the new snapshot CRs enabled since v1.10, as noted 
before (along with periodic savepoints). The bug was most probably introduced 
[here|https://flink.apache.org/2025/06/03/apache-flink-kubernetes-operator-1.12.0-release-announcement/#bug-fixes-and-stability-enhancements]:
 * *Savepoint Information Update*: Fixed a bug where upgrade savepoints 
were not added to the deprecated {{savepointInfo}}, ensuring accurate 
tracking of savepoints during upgrades.

I just saw your change; I couldn't test it, but it looks good. I just left this minor 
[comment|https://github.com/apache/flink-kubernetes-operator/pull/987/commits/da8745f409babe0f6538d55fbac3f4340b96f59a#r2181918327]
 (after the fact, sorry!).

Anyway, thanks for the quick fix in the absence of complete information! :)


> Job upgrades fail for operator v1.12
> ------------------------------------
>
>                 Key: FLINK-38033
>                 URL: https://issues.apache.org/jira/browse/FLINK-38033
>             Project: Flink
>          Issue Type: Bug
>          Components: Autoscaler, Kubernetes Operator
>    Affects Versions: kubernetes-operator-1.12.0
>         Environment: * Flink 1.18.1
>  * Flink Kubernetes Operator 1.12 
>            Reporter: Salva
>            Assignee: Gyula Fora
>            Priority: Critical
>              Labels: pull-request-available
>             Fix For: kubernetes-operator-1.13.0, kubernetes-operator-1.12.1
>
>
> I was running Flink Kubernetes Operator 1.11 and everything was fine. 
> However, after upgrading to 1.12, the operator is misbehaving like this 
> during job upgrades:
>  * First, it takes a savepoint as usual
>  * Then it uploads it to S3 as usual too
>  * ...but after that, for some reason, it disposes/deletes it! 
>  * This makes the job upgrade fail
> I downgraded the operator to 1.11 and job upgrades started to work again.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
