Paul Lin created FLINK-22506:
--------------------------------
Summary: YARN job cluster stuck in retrying creating JobManager if
savepoint is corrupted
Key: FLINK-22506
URL: https://issues.apache.org/jira/browse/FLINK-22506
Project: Flink
Issue Type: Bug
Components: Deployment / YARN
Affects Versions: 1.11.3
Reporter: Paul Lin
If a non-retryable error (e.g. the savepoint is corrupted or inaccessible)
occurs during the initialization of the JobManager, the job cluster exits with
an error code. But since it does not mark the attempt as failed, the attempt is
not counted as a failed attempt, and YARN keeps retrying indefinitely.
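
To illustrate the behavior described above, here is a toy model of a retry
loop with a failed-attempt budget. It is not Flink or YARN code: the names
AttemptResult, marked_failed, run_with_retries and safety_cap are hypothetical,
and the assumption that only attempts explicitly marked as failed count against
the budget is taken from the description in this issue, not from YARN's
implementation.

```python
from dataclasses import dataclass


@dataclass
class AttemptResult:
    exit_code: int       # process exit code of the attempt
    marked_failed: bool  # hypothetical flag: was the attempt reported as failed?


def run_with_retries(run_attempt, max_failed_attempts, safety_cap=20):
    """Toy restart loop: relaunch until success or the failure budget is used up.

    safety_cap exists only so the simulation terminates; it stands in for
    "retrying forever".
    """
    failed = 0
    launched = 0
    while launched < safety_cap:
        launched += 1
        result = run_attempt()
        if result.exit_code == 0:
            return launched, "SUCCEEDED"
        # Assumption from the issue text: only marked attempts are counted.
        if result.marked_failed:
            failed += 1
            if failed >= max_failed_attempts:
                return launched, "FAILED"
    return launched, "STILL_RETRYING"


def corrupted_savepoint_unmarked():
    # Exits with an error code but never marks the attempt as failed.
    return AttemptResult(exit_code=1, marked_failed=False)


def corrupted_savepoint_marked():
    # Exits with an error code and marks the attempt as failed.
    return AttemptResult(exit_code=1, marked_failed=True)


if __name__ == "__main__":
    print(run_with_retries(corrupted_savepoint_unmarked, max_failed_attempts=2))
    print(run_with_retries(corrupted_savepoint_marked, max_failed_attempts=2))
```

In this model the unmarked case never consumes the failure budget and only
stops at the artificial safety cap, while the marked case stops once the budget
is exhausted.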
--
This message was sent by Atlassian Jira
(v8.3.4#803005)