[jira] [Commented] (FLINK-29609) Clean up jobmanager deployment on suspend after recording savepoint info

Sriram Ganesh (Jira) Fri, 28 Oct 2022 02:46:19 -0700


    [ 
https://issues.apache.org/jira/browse/FLINK-29609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17625561#comment-17625561
 ]


Sriram Ganesh commented on FLINK-29609:
---------------------------------------

I found the place where we are not removing the JM pod:
https://github.com/apache/flink-kubernetes-operator/blob/main/flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/service/AbstractFlinkService.java#L350

But we can't simply remove the JM pod at that point as the code stands, 
because pod upgrades and rollbacks would also be affected.

Could we add a condition so that the pod is removed only when no further 
action is pending? Any better suggestions would be appreciated. Thanks in 
advance.
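
One way to read the condition asked about above is as a guard that allows the JM cleanup only once the savepoint has been recorded, the job is shut down, and no upgrade or rollback is still pending. The following is a minimal sketch of that idea, not the operator's actual code: the class name, the method name, and all four flags are hypothetical and do not correspond to anything in AbstractFlinkService.

```java
// Hypothetical sketch: none of these names exist in flink-kubernetes-operator.
final class SuspendCleanupSketch {

    private SuspendCleanupSketch() {}

    /**
     * Decides whether the JobManager deployment may be deleted after a suspend.
     *
     * @param cleanupEnabled           assumed config switch (the issue says this could be configurable)
     * @param savepointRecorded        savepoint info has been persisted by the operator
     * @param jobShutDown              the job has been cancelled and is no longer running
     * @param upgradeOrRollbackPending an upgrade or rollback still needs the JM
     */
    static boolean shouldDeleteJobManager(
            boolean cleanupEnabled,
            boolean savepointRecorded,
            boolean jobShutDown,
            boolean upgradeOrRollbackPending) {
        // Delete only when every precondition holds and nothing else still needs the pod.
        return cleanupEnabled
                && savepointRecorded
                && jobShutDown
                && !upgradeOrRollbackPending;
    }

    public static void main(String[] args) {
        // Plain suspend with a recorded savepoint and nothing pending.
        System.out.println(shouldDeleteJobManager(true, true, true, false));
        // Same state, but an upgrade or rollback is still in flight.
        System.out.println(shouldDeleteJobManager(true, true, true, true));
    }
}
```

Keeping the decision in a single side-effect-free predicate would make it easy to unit test separately from the Kubernetes calls that actually delete the deployment.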

> Clean up jobmanager deployment on suspend after recording savepoint info
> ------------------------------------------------------------------------
>
>                 Key: FLINK-29609
>                 URL: https://issues.apache.org/jira/browse/FLINK-29609
>             Project: Flink
>          Issue Type: Improvement
>          Components: Kubernetes Operator
>            Reporter: Gyula Fora
>            Assignee: Sriram Ganesh
>            Priority: Major
>             Fix For: kubernetes-operator-1.3.0
>
>
> Currently, when suspending with a savepoint, the jobmanager pod lingers 
> forever after the job is cancelled.
> This is currently used to ensure consistency in case the 
> operator/cancel-with-savepoint operation fails.
> However, once we are sure that the savepoint has been recorded and the job 
> is shut down, we should clean up all the resources. Optionally, we can make 
> this configurable.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
