[ 
https://issues.apache.org/jira/browse/FLINK-18828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17186327#comment-17186327
 ] 

Ufuk Celebi commented on FLINK-18828:
-------------------------------------

[~fly_in_gis] I think it makes sense to keep a non-zero exit code for failed 
jobs. How would users figure out whether the job has succeeded or not if we 
change the exit code?

Regarding the unexpected restarts: What about updating the {{restartPolicy}} to 
{{Never}} in the spec of the Kubernetes Job 
(https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#restart-policy)?
That way, we would still have the information from the exit code and we 
wouldn't see any restarts by default. Users would also have the flexibility to 
change the behaviour depending on their use case by setting {{restartPolicy: 
OnFailure}} again.
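
To make the suggestion concrete, here is a minimal sketch of a Job spec with {{restartPolicy: Never}}, built as a plain Python dict and printed as JSON (which {{kubectl apply -f}} also accepts). The function name {{jobmanager_job}}, the Job name and the image tag are hypothetical placeholders rather than Flink's shipped manifests, and the container arguments are left out on purpose.

```python
import json

# Kubernetes only accepts these two restart policies for Pods created by a Job.
JOB_RESTART_POLICIES = ("Never", "OnFailure")


def jobmanager_job(restart_policy="Never"):
    """Build a Job manifest for the jobmanager as a plain dict (illustrative only)."""
    if restart_policy not in JOB_RESTART_POLICIES:
        raise ValueError(
            f"restartPolicy for a Job must be one of {JOB_RESTART_POLICIES}"
        )
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": "flink-jobmanager"},  # hypothetical name
        "spec": {
            "template": {
                "spec": {
                    # Never: a failed container is not restarted in place,
                    # so its non-zero exit code stays visible on the Pod.
                    "restartPolicy": restart_policy,
                    "containers": [
                        {
                            "name": "jobmanager",
                            "image": "flink:latest",  # placeholder image tag
                            # container args intentionally omitted in this sketch
                        }
                    ],
                }
            }
        },
    }


if __name__ == "__main__":
    # Default proposed above; pass "OnFailure" to restore the old behaviour.
    print(json.dumps(jobmanager_job(), indent=2))
```

Users who want the restarts back would call {{jobmanager_job("OnFailure")}} instead. Whether the Job controller then creates a replacement Pod after a failure is governed separately by the Job's {{backoffLimit}}, which this sketch leaves at its default.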

> Terminate jobmanager process with zero exit code to avoid unexpected 
> restarting by K8s
> --------------------------------------------------------------------------------------
>
>                 Key: FLINK-18828
>                 URL: https://issues.apache.org/jira/browse/FLINK-18828
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Coordination
>    Affects Versions: 1.10.1, 1.12.0, 1.11.1
>            Reporter: Yang Wang
>            Priority: Major
>             Fix For: 1.12.0, 1.11.2, 1.10.3
>
>
> Currently, the Flink jobmanager process terminates with a non-zero exit code if 
> the job reaches {{ApplicationStatus.FAILED}}. This is not ideal in a K8s 
> deployment, since a non-zero exit code causes unexpected restarts. Also, 
> from a framework's perspective, a FAILED job does not mean that Flink has 
> failed and, hence, the return code could still be 0.
>
> Note:
> This is a special case for standalone K8s deployment. For 
> standalone/Yarn/Mesos/native K8s, terminating with a non-zero exit code is 
> harmless, and a non-zero exit code can help to check the job result quickly.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
