[ https://issues.apache.org/jira/browse/FLINK-30244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17641162#comment-17641162 ]
Gyula Fora commented on FLINK-30244:
------------------------------------

When using the native (active) Kubernetes integration, session cluster TaskManagers will be shut down after a certain timeout if the job running on them has failed or finished. So if you submit a new job within that time it should reuse the cluster, but after that a new TaskManager will be created.

> When task using udf/udtf with jni, on k8s session the old TM will shut down
> and create new TM or the task will fail
> -------------------------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-30244
>                 URL: https://issues.apache.org/jira/browse/FLINK-30244
>             Project: Flink
>          Issue Type: Bug
>          Components: Deployment / Kubernetes, Kubernetes Operator, Runtime / Task
>    Affects Versions: 1.15.3
>            Reporter: AlexHu
>            Priority: Major
>         Attachments: image-2022-11-30-14-47-50-923.png, image-2022-11-30-15-00-06-710.png, image-2022-11-30-15-04-45-696.png, image-2022-11-30-15-05-29-120.png
>
> We face a problem when we try to use Flink on Kubernetes to execute a task that uses a UDF/UDTF with JNI. When we finish or cancel a job and then submit the same job again, the old TM becomes unreachable and restarts. Why does the TM have to restart? In session mode, the TM should be reused by the JM. Moreover, if we turn off the restart strategy, the task fails.
> !image-2022-11-30-14-47-50-923.png!
>
> On the first submission the job runs:
> !image-2022-11-30-15-00-06-710.png!
>
> But after cancelling it and submitting the same job again:
> !image-2022-11-30-15-04-45-696.png!
> Internal server error, yet in Kubernetes the pod is still running:
> !image-2022-11-30-15-05-29-120.png!
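The idle-TaskManager release that the comment above describes is governed by the resource manager's idle timeout. A minimal sketch of how it could be tuned in flink-conf.yaml follows; the value shown is illustrative and not taken from this ticket:

    # flink-conf.yaml -- illustrative value only
    # How long an idle TaskManager is kept before the active resource manager
    # releases it (default 30000 ms).
    resourcemanager.taskmanager-timeout: 300000

With a larger timeout, resubmitting a cancelled job within the window should land on the existing session TaskManagers instead of waiting for new pods to be created.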