[
https://issues.apache.org/jira/browse/FLINK-25865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17531499#comment-17531499
]
Aitozi commented on FLINK-25865:
--------------------------------
Hi [~wangyang0918], are you working on this now? If not, I would like to work
on this.
> Support to set restart policy of TaskManager pod for native K8s integration
> ---------------------------------------------------------------------------
>
> Key: FLINK-25865
> URL: https://issues.apache.org/jira/browse/FLINK-25865
> Project: Flink
> Issue Type: Improvement
> Components: Deployment / Kubernetes
> Reporter: Yang Wang
> Priority: Major
>
> After FLIP-201, Flink's TaskManagers will be able to restart without
> losing their local state. So it is reasonable to make the restart policy[1] of
> the TaskManager pod configurable.
> The current restart policy is {{Never}}. Flink will always delete the
> failed TaskManager pod directly and create a new one instead. This ticket
> could help decrease the recovery time after a TaskManager failure.
>
> Please note that the working directory needs to be located in an
> emptyDir[2], which is retained across restarts.
>
> [1]. https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#restart-policy
> [2]. https://kubernetes.io/docs/concepts/storage/volumes/#emptydir
--
This message was sent by Atlassian Jira
(v8.20.7#820007)