[ https://issues.apache.org/jira/browse/FLINK-32111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17723377#comment-17723377 ]
Tamir Sagi commented on FLINK-32111:
------------------------------------
Sure, I have already opened a PR:
[https://github.com/apache/flink-kubernetes-operator/pull/603|https://github.com/apache/flink-kubernetes-operator/pull/603/commits]
I tested it locally and it worked.
> Replacing cluster in failed state with a new one failed
> -------------------------------------------------------
>
> Key: FLINK-32111
> URL: https://issues.apache.org/jira/browse/FLINK-32111
> Project: Flink
> Issue Type: Bug
> Components: Kubernetes Operator
> Affects Versions: kubernetes-operator-1.5.0, kubernetes-operator-1.6.0
> Reporter: Tamir Sagi
> Priority: Major
> Attachments: operator-error.txt
>
>
> I deployed a problematic cluster (HA enabled, with 3 JMs) to check the cluster
> update process. The cluster was stuck in a restart loop. I then applied a newer
> CRD (with several configurations updated) and expected the cluster to be
> redeployed. However, the following exception was thrown:
>
> Caused by: java.lang.NullPointerException
>     at org.apache.flink.kubernetes.operator.service.CheckpointHistoryWrapper.getInProgressCheckpoint(CheckpointHistoryWrapper.java:60)
>     at org.apache.flink.kubernetes.operator.service.AbstractFlinkService.getCheckpointInfo(AbstractFlinkService.java:564)
>     at org.apache.flink.kubernetes.operator.service.AbstractFlinkService.getLastCheckpoint(AbstractFlinkService.java:520)
>     at org.apache.flink.kubernetes.operator.observer.SavepointObserver.observeLatestSavepoint(SavepointObserver.java:209)
>     at org.apache.flink.kubernetes.operator.observer.SavepointObserver.observeSavepointStatus(SavepointObserver.java:73)
>     at org.apache.flink.kubernetes.operator.observer.deployment.ApplicationObserver.observeFlinkCluster(ApplicationObserver.java:61)
>     at org.apache.flink.kubernetes.operator.observer.deployment.AbstractFlinkDeploymentObserver.observeInternal(AbstractFlinkDeploymentObserver.java:73)
>     at org.apache.flink.kubernetes.operator.observer.AbstractFlinkResourceObserver.observe(AbstractFlinkResourceObserver.java:53)
>     at org.apache.flink.kubernetes.operator.controller.FlinkDeploymentController.reconcile(FlinkDeploymentController.java:134)
>
>
> upgradeMode was initially `last-state`; I then changed it to `stateless`, but
> the new cluster still was not deployed.
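
For illustration only, the kind of null guard that would avoid an NPE like the one in the trace above can be sketched as follows. The trace shows only that something was dereferenced while null inside getInProgressCheckpoint; this sketch assumes it was the checkpoint history itself (plausible for a cluster that never completed a checkpoint because it kept restarting). The class CheckpointHistorySketch, the Checkpoint type, its fields, and the "IN_PROGRESS" status string are all hypothetical stand-ins, not the operator's actual code or the content of the PR.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical stand-in, NOT the operator's real CheckpointHistoryWrapper.
class CheckpointHistorySketch {

    // Minimal illustrative checkpoint entry; the field names are assumptions.
    static final class Checkpoint {
        final long id;
        final String status;

        Checkpoint(long id, String status) {
            this.id = id;
            this.status = status;
        }
    }

    // Returns the first in-progress checkpoint, or Optional.empty() when the
    // history is missing (null) or has no such entry, instead of throwing.
    static Optional<Checkpoint> findInProgress(List<Checkpoint> history) {
        if (history == null) {
            return Optional.empty();
        }
        return history.stream()
                .filter(c -> c != null && "IN_PROGRESS".equals(c.status))
                .findFirst();
    }

    public static void main(String[] args) {
        // A cluster with no checkpoint history yet: no exception, just empty.
        System.out.println(findInProgress(null).isPresent());

        List<Checkpoint> history = Arrays.asList(
                new Checkpoint(1, "COMPLETED"),
                new Checkpoint(2, "IN_PROGRESS"));
        System.out.println(findInProgress(history).map(c -> c.id).orElse(-1L));
    }
}
```

Returning an empty Optional lets the caller treat "no history yet" the same way as "nothing in progress", so the observe step could continue to reconciliation rather than abort.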