[ 
https://issues.apache.org/jira/browse/FLINK-22262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17321885#comment-17321885
 ] 

Yang Wang commented on FLINK-22262:
-----------------------------------

Could you please share the JobManager logs when you cancel the Flink job 
successfully and still have the residual ConfigMaps? I think you could use 
{{kubectl logs podname}} to get the logs.
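
Since the JobManager pod is removed once the cancel succeeds, it may be easier to start streaming its logs into a file before running the cancel. The following is only a sketch: the function name {{save_jm_logs}} is made up for illustration, and it assumes the JobManager pod carries the labels {{app=<cluster-id>}} and {{component=jobmanager}}. If your pods are labelled differently, use the pod name directly as suggested above.

```shell
# Hypothetical helper: stream all JobManager logs of one cluster into a file.
# Assumption: the JobManager pod is labelled app=<cluster-id>,component=jobmanager.
# $1 = cluster-id, $2 = namespace, $3 = output file
save_jm_logs() {
  # --tail=-1 prints the whole log (with a selector, kubectl defaults to 10 lines);
  # --follow keeps streaming until the pod goes away.
  kubectl logs --namespace "$2" \
    --selector "app=$1,component=jobmanager" \
    --tail=-1 --follow > "$3"
}

# Usage (run in a separate terminal before cancelling the job):
#   save_jm_logs "$CLUSTER_ID" "$NAMESPACE" jobmanager.log
```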

 

I used the following steps to start and stop Flink applications on K8s with HA 
enabled in my minikube, and it works well.

1. Start the native Flink K8s application 

 
{code:java}
$FLINK_HOME/bin/flink run-application -d -t kubernetes-application \
-Dkubernetes.cluster-id=$CLUSTER_ID \
-Dkubernetes.namespace=$NAMESPACE \
-Dkubernetes.container.image=wangyang09180523/flink:1.13.0-rc0 \
-Dkubernetes.container.image.pull-policy=Always \
-Dkubernetes.rest-service.exposed.type=NodePort \
-Dkubernetes.jobmanager.cpu=0.5 -Djobmanager.memory.process.size=1700m \
-Dkubernetes.jobmanager.service-account=default \
-Dkubernetes.taskmanager.cpu=0.5 -Dtaskmanager.memory.process.size=1500m \
-Dtaskmanager.numberOfTaskSlots=4 \
-Dstate.checkpoints.dir=$HA_STORAGE \
-Dhigh-availability=org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory \
-Dhigh-availability.storageDir=$HA_STORAGE \
-Drestart-strategy=fixed-delay -Drestart-strategy.fixed-delay.attempts=1 \
-Dcontainerized.master.env.ENABLE_BUILT_IN_PLUGINS=flink-oss-fs-hadoop-1.13.0.jar \
-Dcontainerized.taskmanager.env.ENABLE_BUILT_IN_PLUGINS=flink-oss-fs-hadoop-1.13.0.jar \
-Dstate.savepoints.dir=$HA_STORAGE \
local:///opt/flink/examples/streaming/StateMachineExample.jar
{code}
 

2. Cancel the Flink job with a savepoint. All the K8s resources will be deleted. 
I did not find any residual HA ConfigMaps after the job was cancelled successfully.

 
{code:java}
./bin/flink cancel --target kubernetes-application --withSavepoint \
-Dkubernetes.cluster-id=k8s-app-ha-1-113-rc1 -Dkubernetes.namespace=default \
00000000000000000000000000000000
... ...
Cancelled job 00000000000000000000000000000000. Savepoint stored in oss://flink-debug-yiqi/flink-ha/savepoint-000000-8741523cb1d1.
{code}
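
To double-check for residual HA ConfigMaps after the cancel, one option is to list them by label. This is a sketch under assumptions rather than an official procedure: the function name {{list_ha_configmaps}} is invented for illustration, and the selector relies on the labels shown in the example ConfigMap of the issue description ({{app=<cluster-id>}} and {{configmap-type: high-availability}}).

```shell
# Hypothetical helper: list HA ConfigMaps that still exist for a cluster-id.
# The label keys are taken from the example ConfigMap in the issue description.
# $1 = cluster-id, $2 = namespace
list_ha_configmaps() {
  kubectl get configmaps --namespace "$2" \
    --selector "app=$1,configmap-type=high-availability"
}

# Usage (needs kubectl access to the cluster):
#   list_ha_configmaps "$CLUSTER_ID" "$NAMESPACE"
# If no matching ConfigMaps are listed, nothing was left behind.
```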
3. Optionally change the user code, then resubmit the Flink application from 
the stored savepoint

 
{code:java}
$FLINK_HOME/bin/flink run-application -d -t kubernetes-application \
--fromSavepoint oss://flink-debug-yiqi/flink-ha/savepoint-000000-8741523cb1d1 \
... ...
local:///opt/flink/examples/streaming/StateMachineExample.jar
{code}
 

 

> Flink on Kubernetes ConfigMaps are created without OwnerReference
> -----------------------------------------------------------------
>
>                 Key: FLINK-22262
>                 URL: https://issues.apache.org/jira/browse/FLINK-22262
>             Project: Flink
>          Issue Type: Bug
>    Affects Versions: 1.13.0
>            Reporter: Andrea Peruffo
>            Priority: Major
>
> According to the documentation:
> [https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/resource-providers/native_kubernetes/#manual-resource-cleanup]
> The ConfigMaps created along with the Flink deployment are supposed to have an 
> OwnerReference pointing to the Deployment itself. Unfortunately, this doesn't 
> happen, which causes all sorts of issues when the classpath and the jars of the 
> job are updated.
> i.e.:
> Without manually removing the ConfigMap of the Job I cannot update the Jars 
> of the Job.
> Can you please give guidance if there are additional caveats on manually 
> removing the ConfigMap? Any other workaround that can be used?
> Thanks in advance.
> Example ConfigMap:
> {{apiVersion: v1}}
> {{data:}}
> {{ address: akka.tcp://flink@10.0.2.13:6123/user/rpc/jobmanager_2}}
> {{ checkpointID-0000000000000000049: 
> rO0ABXNyADtvcmcuYXBhY2hlLmZsaW5rLnJ1bnRpbWUuc3RhdGUuUmV0cmlldmFibGVTdHJlYW1TdGF0ZUhhbmRsZQABHhjxVZcrAgABTAAYd3JhcHBlZFN0cmVhbVN0YXRlSGFuZGxldAAyTG9yZy9hcGFjaGUvZmxpbmsvcnVudGltZS9zdGF0ZS9TdHJlYW1TdGF0ZUhhbmRsZTt4cHNyADlvcmcuYXBhY2hlLmZsaW5rLnJ1bnRpbWUuc3RhdGUuZmlsZXN5c3RlbS5GaWxlU3RhdGVIYW5kbGUE3HXYYr0bswIAAkoACXN0YXRlU2l6ZUwACGZpbGVQYXRodAAfTG9yZy9hcGFjaGUvZmxpbmsvY29yZS9mcy9QYXRoO3hwAAAAAAABOEtzcgAdb3JnLmFwYWNoZS5mbGluay5jb3JlLmZzLlBhdGgAAAAAAAAAAQIAAUwAA3VyaXQADkxqYXZhL25ldC9VUkk7eHBzcgAMamF2YS5uZXQuVVJJrAF4LkOeSasDAAFMAAZzdHJpbmd0ABJMamF2YS9sYW5nL1N0cmluZzt4cHQAUC9tbnQvZmxpbmsvc3RvcmFnZS9rc2hhL3RheGktcmlkZS1mYXJlLXByb2Nlc3Nvci9jb21wbGV0ZWRDaGVja3BvaW50MDQ0YTc2OWRkNDgxeA==}}
> {{ counter: "50"}}
> {{ sessionId: 0c2b69ee-6b41-48d3-b7fd-1bf2eda94f0f}}
> {{kind: ConfigMap}}
> {{metadata:}}
> {{ annotations:}}
> {{ control-plane.alpha.kubernetes.io/leader: 
> '\{"holderIdentity":"0f25a2cc-e212-46b0-8ba9-faac0732a316","leaseDuration":15.000000000,"acquireTime":"2021-04-13T14:30:51.439000Z","renewTime":"2021-04-13T14:39:32.011000Z","leaderTransitions":105}'}}
> {{ creationTimestamp: "2021-04-13T14:30:51Z"}}
> {{ labels:}}
> {{ app: taxi-ride-fare-processor}}
> {{ configmap-type: high-availability}}
> {{ type: flink-native-kubernetes}}
> {{ name: 
> taxi-ride-fare-processor-00000000000000000000000000000000-jobmanager-leader}}
> {{ namespace: taxi-ride-fare}}
> {{ resourceVersion: "64100"}}
> {{ selfLink: 
> /api/v1/namespaces/taxi-ride-fare/configmaps/taxi-ride-fare-processor-00000000000000000000000000000000-jobmanager-leader}}
> {{ uid: 9f912495-382a-45de-a789-fd5ad2a2459d}}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
