[
https://issues.apache.org/jira/browse/FLINK-26772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17525443#comment-17525443
]
Matthias Pohl commented on FLINK-26772:
---------------------------------------
The problem is that {{Dispatcher#onStop}} waits for the jobs (and their
cleanup) to terminate, but the cluster shutdown does not wait for that
termination to complete. The logs of the standalone run reveal this through the
"{{Stopped dispatcher [...]}}" log message, which is only triggered after the
termination is completed.
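
To illustrate the ordering problem, here is a minimal sketch using plain
{{CompletableFuture}}s. It is not Flink's actual code: {{SketchDispatcher}},
{{shutDownWithoutWaiting}}, {{shutDownAfterWaiting}} and the event strings are
hypothetical names made up for this example. It only assumes that stopping the
dispatcher yields a future that completes once job cleanup is done.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

class ShutdownOrderingSketch {

    /** Hypothetical stand-in for a dispatcher; NOT Flink's actual class. */
    static final class SketchDispatcher {
        private final CompletableFuture<Void> jobCleanup;
        private final List<String> events;

        SketchDispatcher(CompletableFuture<Void> jobCleanup, List<String> events) {
            this.jobCleanup = jobCleanup;
            this.events = events;
        }

        /** The returned future completes only after job cleanup has finished. */
        CompletableFuture<Void> onStop() {
            return jobCleanup.thenRun(() -> events.add("Stopped dispatcher"));
        }
    }

    /** Problematic ordering: the termination future is ignored during shutdown. */
    static List<String> shutDownWithoutWaiting() {
        List<String> events = new ArrayList<>();
        CompletableFuture<Void> cleanup = new CompletableFuture<>();
        SketchDispatcher dispatcher = new SketchDispatcher(cleanup, events);

        dispatcher.onStop();              // result is dropped, nobody waits on it
        events.add("Cluster shut down");  // shutdown proceeds immediately
        cleanup.complete(null);           // cleanup only finishes afterwards
        return events;
    }

    /** Intended ordering: shutdown is chained onto the termination future. */
    static List<String> shutDownAfterWaiting() {
        List<String> events = new ArrayList<>();
        CompletableFuture<Void> cleanup = new CompletableFuture<>();
        SketchDispatcher dispatcher = new SketchDispatcher(cleanup, events);

        CompletableFuture<Void> shutdown =
                dispatcher.onStop().thenRun(() -> events.add("Cluster shut down"));
        cleanup.complete(null);           // cleanup finishes, then shutdown runs
        shutdown.join();
        return events;
    }

    public static void main(String[] args) {
        System.out.println(shutDownWithoutWaiting());
        System.out.println(shutDownAfterWaiting());
    }
}
```

In the first variant the shutdown event is recorded before the dispatcher
reports that it has stopped; in the second it is recorded only afterwards,
which is the behaviour we would want for the cluster shutdown.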
> Application Mode does not wait for job cleanup during shutdown
> --------------------------------------------------------------
>
> Key: FLINK-26772
> URL: https://issues.apache.org/jira/browse/FLINK-26772
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Coordination
> Affects Versions: 1.15.0
> Reporter: Mika Naylor
> Assignee: Matthias Pohl
> Priority: Critical
> Labels: pull-request-available
> Attachments: FLINK-26772.standalone-job.log,
> testcluster-599f4d476b-bghw5_log.txt
>
>
> We discovered that in Application Mode, when the application has completed,
> the cluster is shut down even if resource cleanup is still running in the
> background. For example, if HA cleanup fails, further retries are not
> attempted because the cluster is shut down before they can happen.
>
> We should also add a flag for the shutdown that will prevent further jobs
> from being submitted.
--
This message was sent by Atlassian Jira
(v8.20.7#820007)