[ https://issues.apache.org/jira/browse/FLINK-22262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17321885#comment-17321885 ]
Yang Wang commented on FLINK-22262:
-----------------------------------

Could you please share the JobManager logs from a run where you cancel the Flink job successfully and still have residual ConfigMaps? You can use {{kubectl logs podname}} to get the logs.

I have used the following steps to start/stop Flink applications on K8s with HA enabled in my minikube, and it works well.

1. Start the native Flink K8s application
{code:java}
$FLINK_HOME/bin/flink run-application -d -t kubernetes-application \
    -Dkubernetes.cluster-id=$CLUSTER_ID \
    -Dkubernetes.namespace=$NAMESPACE \
    -Dkubernetes.container.image=wangyang09180523/flink:1.13.0-rc0 \
    -Dkubernetes.container.image.pull-policy=Always \
    -Dkubernetes.rest-service.exposed.type=NodePort \
    -Dkubernetes.jobmanager.cpu=0.5 -Djobmanager.memory.process.size=1700m \
    -Dkubernetes.jobmanager.service-account=default \
    -Dkubernetes.taskmanager.cpu=0.5 -Dtaskmanager.memory.process.size=1500m -Dtaskmanager.numberOfTaskSlots=4 \
    -Dstate.checkpoints.dir=$HA_STORAGE \
    -Dhigh-availability=org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory \
    -Dhigh-availability.storageDir=$HA_STORAGE \
    -Drestart-strategy=fixed-delay -Drestart-strategy.fixed-delay.attempts=1 \
    -Dcontainerized.master.env.ENABLE_BUILT_IN_PLUGINS=flink-oss-fs-hadoop-1.13.0.jar -Dcontainerized.taskmanager.env.ENABLE_BUILT_IN_PLUGINS=flink-oss-fs-hadoop-1.13.0.jar \
    -Dstate.savepoints.dir=$HA_STORAGE \
    local:///opt/flink/examples/streaming/StateMachineExample.jar
{code}

2. Cancel the Flink job with a savepoint. All the K8s resources will be deleted. I did not find any residual HA ConfigMaps after the job was cancelled successfully.
{code:java}
./bin/flink cancel --target kubernetes-application --withSavepoint -Dkubernetes.cluster-id=k8s-app-ha-1-113-rc1 -Dkubernetes.namespace=default 00000000000000000000000000000000
... ...
Cancelled job 00000000000000000000000000000000.
Savepoint stored in oss://flink-debug-yiqi/flink-ha/savepoint-000000-8741523cb1d1.
{code}

3. Optionally change the user code and resubmit the Flink application from the stored savepoint
{code:java}
$FLINK_HOME/bin/flink run-application -d -t kubernetes-application \
    --fromSavepoint oss://flink-debug-yiqi/flink-ha/savepoint-000000-8741523cb1d1 \
    ... ...
    local:///opt/flink/examples/streaming/StateMachineExample.jar
{code}

> Flink on Kubernetes ConfigMaps are created without OwnerReference
> -----------------------------------------------------------------
>
> Key: FLINK-22262
> URL: https://issues.apache.org/jira/browse/FLINK-22262
> Project: Flink
> Issue Type: Bug
> Affects Versions: 1.13.0
> Reporter: Andrea Peruffo
> Priority: Major
>
> According to the documentation:
> [https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/resource-providers/native_kubernetes/#manual-resource-cleanup]
> the ConfigMaps created along with the Flink deployment are supposed to have an
> OwnerReference pointing to the Deployment itself. Unfortunately, this doesn't
> happen, which causes all sorts of issues when the classpath and the jars of the
> job are updated.
> For example:
> Without manually removing the ConfigMap of the Job I cannot update the Jars
> of the Job.
> Can you please give guidance if there are additional caveats on manually
> removing the ConfigMap? Is there any other workaround that can be used?
> Thanks in advance.
> Example ConfigMap:
> {{apiVersion: v1}}
> {{data:}}
> {{  address: akka.tcp://flink@10.0.2.13:6123/user/rpc/jobmanager_2}}
> {{  checkpointID-0000000000000000049: rO0ABXNyADtvcmcuYXBhY2hlLmZsaW5rLnJ1bnRpbWUuc3RhdGUuUmV0cmlldmFibGVTdHJlYW1TdGF0ZUhhbmRsZQABHhjxVZcrAgABTAAYd3JhcHBlZFN0cmVhbVN0YXRlSGFuZGxldAAyTG9yZy9hcGFjaGUvZmxpbmsvcnVudGltZS9zdGF0ZS9TdHJlYW1TdGF0ZUhhbmRsZTt4cHNyADlvcmcuYXBhY2hlLmZsaW5rLnJ1bnRpbWUuc3RhdGUuZmlsZXN5c3RlbS5GaWxlU3RhdGVIYW5kbGUE3HXYYr0bswIAAkoACXN0YXRlU2l6ZUwACGZpbGVQYXRodAAfTG9yZy9hcGFjaGUvZmxpbmsvY29yZS9mcy9QYXRoO3hwAAAAAAABOEtzcgAdb3JnLmFwYWNoZS5mbGluay5jb3JlLmZzLlBhdGgAAAAAAAAAAQIAAUwAA3VyaXQADkxqYXZhL25ldC9VUkk7eHBzcgAMamF2YS5uZXQuVVJJrAF4LkOeSasDAAFMAAZzdHJpbmd0ABJMamF2YS9sYW5nL1N0cmluZzt4cHQAUC9tbnQvZmxpbmsvc3RvcmFnZS9rc2hhL3RheGktcmlkZS1mYXJlLXByb2Nlc3Nvci9jb21wbGV0ZWRDaGVja3BvaW50MDQ0YTc2OWRkNDgxeA==}}
> {{  counter: "50"}}
> {{  sessionId: 0c2b69ee-6b41-48d3-b7fd-1bf2eda94f0f}}
> {{kind: ConfigMap}}
> {{metadata:}}
> {{  annotations:}}
> {{    control-plane.alpha.kubernetes.io/leader: '\{"holderIdentity":"0f25a2cc-e212-46b0-8ba9-faac0732a316","leaseDuration":15.000000000,"acquireTime":"2021-04-13T14:30:51.439000Z","renewTime":"2021-04-13T14:39:32.011000Z","leaderTransitions":105}'}}
> {{  creationTimestamp: "2021-04-13T14:30:51Z"}}
> {{  labels:}}
> {{    app: taxi-ride-fare-processor}}
> {{    configmap-type: high-availability}}
> {{    type: flink-native-kubernetes}}
> {{  name: taxi-ride-fare-processor-00000000000000000000000000000000-jobmanager-leader}}
> {{  namespace: taxi-ride-fare}}
> {{  resourceVersion: "64100"}}
> {{  selfLink: /api/v1/namespaces/taxi-ride-fare/configmaps/taxi-ride-fare-processor-00000000000000000000000000000000-jobmanager-leader}}
> {{  uid: 9f912495-382a-45de-a789-fd5ad2a2459d}}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)