[jira] [Commented] (FLINK-26139) Improve JobStatus tracking and handle different job states

Gyula Fora (Jira) Sun, 27 Feb 2022 06:06:07 -0800


    [ 
https://issues.apache.org/jira/browse/FLINK-26139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17498596#comment-17498596
 ]


Gyula Fora commented on FLINK-26139:
------------------------------------

CANCELLED: I think the only question here is what to do when a user manually 
cancels a job. This will mostly manifest itself as a missing deployment 
(the cluster currently self-destructs, though that might change in the future). 
Cancellation also removes the checkpoint metadata, so I think it is very tricky 
to restore a cancelled job in the general case while respecting the upgrade 
policy. The more I look at it, the more I think this should be a terminal ERROR 
state (no upgrade possible).

FINISHED: In theory jobs can actually finish (bounded sources). The problem 
is that we may not be able to tell whether a job finished or was cancelled 
because, just as with cancellation, the cluster will self-destruct, HA metadata 
will be deleted, etc.

So to recap, since terminal job states will shut down the cluster and delete HA 
data, we don't really have any choice for now other than to move to ERROR, 
delete the resource, or introduce an UNKNOWN state. If we later force the 
cluster to stay alive (https://issues.apache.org/jira/browse/FLINK-24113), we 
will need to revisit this.
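
To make the recap concrete, here is a minimal sketch of the "move to ERROR" 
option. Everything in it (TerminalStateSketch, ResourceState, resolve, 
upgradePossible) is a hypothetical name for illustration and not existing 
operator code. It assumes the only signal the operator has is whether the 
deployment still exists.

```java
/** Hypothetical sketch only; none of these names exist in the operator. */
final class TerminalStateSketch {

    /** Resource-level states the operator could report (assumed names). */
    enum ResourceState { DEPLOYED, ERROR }

    private TerminalStateSketch() {}

    /**
     * Maps the one observable signal to a resource state. A missing
     * deployment means the job was cancelled or finished; the two cases
     * cannot be told apart because the cluster and HA metadata are gone.
     */
    static ResourceState resolve(boolean deploymentExists) {
        return deploymentExists ? ResourceState.DEPLOYED : ResourceState.ERROR;
    }

    /** ERROR is terminal in this sketch: no upgrade is attempted from it. */
    static boolean upgradePossible(ResourceState state) {
        return state != ResourceState.ERROR;
    }
}
```

Introducing an UNKNOWN state instead would only change the value returned for 
the missing-deployment branch; the upgrade check would stay the same.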

> Improve JobStatus tracking and handle different job states
> ----------------------------------------------------------
>
>                 Key: FLINK-26139
>                 URL: https://issues.apache.org/jira/browse/FLINK-26139
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Kubernetes Operator
>            Reporter: Gyula Fora
>            Priority: Major
>         Attachments: image-2022-02-25-21-22-08-636.png
>
>
> Currently we do not handle any job status changes such as cancellations, 
> errors or job completions.
> We should introduce some mechanism to react and deal with these changes and 
> expose them in the status as they can potentially affect upgrades.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
