[ 
https://issues.apache.org/jira/browse/FLINK-22262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17628320#comment-17628320
 ] 

Yun Tang commented on FLINK-22262:
----------------------------------

[~wangyang0918] I come across a rare case:
When the jobmanager pod deletes the deployment on job cancelation, it suddenly 
restarts due to some reason. Thus the job would submit again from previous 
savepoint and create new HA related configmaps with the restoring savepoint 
just as the job started again. After a while, since the deployment has been 
deleted, the job manager would finally be deleted and no taskmanagers could be 
created. However, those HA related configmaps left behind due to not having 
OwnerReference.
Then user submit the job again, however, since the left HA related configmaps, 
the job would resume from previous savepoints, which leads to incorrect job 
state.
I think offering options to let HA related configmaps have OwnerReference with 
deployment is reasonable in some cases.
Or do you have some suggestions to walk around this problem?

> Flink on Kubernetes ConfigMaps are created without OwnerReference
> -----------------------------------------------------------------
>
>                 Key: FLINK-22262
>                 URL: https://issues.apache.org/jira/browse/FLINK-22262
>             Project: Flink
>          Issue Type: Bug
>          Components: Deployment / Kubernetes
>    Affects Versions: 1.13.0
>            Reporter: Andrea Peruffo
>            Priority: Not a Priority
>              Labels: auto-deprioritized-major, auto-deprioritized-minor
>         Attachments: jm.log
>
>
> According to the documentation:
> [https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/resource-providers/native_kubernetes/#manual-resource-cleanup]
> The ConfigMaps created along with the Flink deployment is supposed to have an 
> OwnerReference pointing to the Deployment itself, unfortunately, this doesn't 
> happen and causes all sorts of issues when the classpath and the jars of the 
> job are updated.
> i.e.:
> Without manually removing the ConfigMap of the Job I cannot update the Jars 
> of the Job.
> Can you please give guidance if there are additional caveats on manually 
> removing the ConfigMap? Any other workaround that can be used?
> Thanks in advance.
> Example ConfigMap:
> {{apiVersion: v1}}
> {{data:}}
> {{ address: akka.tcp://flink@10.0.2.13:6123/user/rpc/jobmanager_2}}
> {{ checkpointID-0000000000000000049: 
> rO0ABXNyADtvcmcuYXBhY2hlLmZsaW5rLnJ1bnRpbWUuc3RhdGUuUmV0cmlldmFibGVTdHJlYW1TdGF0ZUhhbmRsZQABHhjxVZcrAgABTAAYd3JhcHBlZFN0cmVhbVN0YXRlSGFuZGxldAAyTG9yZy9hcGFjaGUvZmxpbmsvcnVudGltZS9zdGF0ZS9TdHJlYW1TdGF0ZUhhbmRsZTt4cHNyADlvcmcuYXBhY2hlLmZsaW5rLnJ1bnRpbWUuc3RhdGUuZmlsZXN5c3RlbS5GaWxlU3RhdGVIYW5kbGUE3HXYYr0bswIAAkoACXN0YXRlU2l6ZUwACGZpbGVQYXRodAAfTG9yZy9hcGFjaGUvZmxpbmsvY29yZS9mcy9QYXRoO3hwAAAAAAABOEtzcgAdb3JnLmFwYWNoZS5mbGluay5jb3JlLmZzLlBhdGgAAAAAAAAAAQIAAUwAA3VyaXQADkxqYXZhL25ldC9VUkk7eHBzcgAMamF2YS5uZXQuVVJJrAF4LkOeSasDAAFMAAZzdHJpbmd0ABJMamF2YS9sYW5nL1N0cmluZzt4cHQAUC9tbnQvZmxpbmsvc3RvcmFnZS9rc2hhL3RheGktcmlkZS1mYXJlLXByb2Nlc3Nvci9jb21wbGV0ZWRDaGVja3BvaW50MDQ0YTc2OWRkNDgxeA==}}
> {{ counter: "50"}}
> {{ sessionId: 0c2b69ee-6b41-48d3-b7fd-1bf2eda94f0f}}
> {{kind: ConfigMap}}
> {{metadata:}}
> {{ annotations:}}
> {{ control-plane.alpha.kubernetes.io/leader: 
> '\{"holderIdentity":"0f25a2cc-e212-46b0-8ba9-faac0732a316","leaseDuration":15.000000000,"acquireTime":"2021-04-13T14:30:51.439000Z","renewTime":"2021-04-13T14:39:32.011000Z","leaderTransitions":105}'}}
> {{ creationTimestamp: "2021-04-13T14:30:51Z"}}
> {{ labels:}}
> {{ app: taxi-ride-fare-processor}}
> {{ configmap-type: high-availability}}
> {{ type: flink-native-kubernetes}}
> {{ name: 
> taxi-ride-fare-processor-00000000000000000000000000000000-jobmanager-leader}}
> {{ namespace: taxi-ride-fare}}
> {{ resourceVersion: "64100"}}
> {{ selfLink: 
> /api/v1/namespaces/taxi-ride-fare/configmaps/taxi-ride-fare-processor-00000000000000000000000000000000-jobmanager-leader}}
> {{ uid: 9f912495-382a-45de-a789-fd5ad2a2459d}}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to