[ 
https://issues.apache.org/jira/browse/FLINK-27802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gyula Fora updated FLINK-27802:
-------------------------------
    Priority: Critical  (was: Blocker)

> Savepoint restore errors are swallowed for Flink 1.15
> -----------------------------------------------------
>
>                 Key: FLINK-27802
>                 URL: https://issues.apache.org/jira/browse/FLINK-27802
>             Project: Flink
>          Issue Type: Improvement
>          Components: Kubernetes Operator
>    Affects Versions: kubernetes-operator-1.0.0
>            Reporter: Gyula Fora
>            Priority: Critical
>
> We are currently setting both a result store and the 
> "execution.submit-failed-job-on-application-error" config for HA jobs.
> This causes job submission errors to be swallowed: they only show up in the 
> result store, and the Flink job is never actually reported in a failed state:
> 2022-05-26 12:34:43,497 WARN org.apache.flink.runtime.dispatcher.StandaloneDispatcher [] - Ignoring JobGraph submission 'State machine job' (00000000000000000000000000000000) because the job already reached a globally-terminal state (i.e. FAILED, CANCELED, FINISHED) in a previous execution.
> 2022-05-26 12:34:43,552 INFO org.apache.flink.client.deployment.application.ApplicationDispatcherBootstrap [] - Application completed SUCCESSFULLY
> The easiest way to reproduce this is to create a new deployment and set 
> initialSavepointPath to a nonexistent path.
> I consider this a bug in Flink itself, but in the operator we should simply 
> disable the "execution.submit-failed-job-on-application-error" config.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
