Till Rohrmann created FLINK-24038:
-------------------------------------
Summary: DispatcherResourceManagerComponent fails to deregister
application if no leading ResourceManager
Key: FLINK-24038
URL: https://issues.apache.org/jira/browse/FLINK-24038
Project: Flink
Issue Type: Bug
Components: Runtime / Coordination
Affects Versions: 1.14.0
Reporter: Till Rohrmann
Fix For: 1.14.0
FLINK-21667 introduced a change that can cause the
{{DispatcherResourceManagerComponent}} to fail when it tries to stop the
application. The {{DispatcherResourceManagerComponent}} needs a leading
{{ResourceManager}} in order to execute the stop/deregister application call
successfully. If there is no leading {{ResourceManager}}, the call fails
fatally. With multiple standby JobManager processes, the leading
{{ResourceManager}} can be running in a different process.
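To illustrate the failure mode described above, here is a minimal sketch in plain Java. It is a simplified model and not Flink code: {{ResourceManagerSketch}} and {{stopApplication}} are hypothetical names, and an empty {{Optional}} stands in for a process whose {{ResourceManager}} is not the leader.

```java
import java.util.Optional;
import java.util.concurrent.CompletableFuture;

/** Simplified model of the failure mode; none of these types are Flink classes. */
final class DeregistrationFailureSketch {

    /** Hypothetical stand-in for the ResourceManager's deregistration call. */
    interface ResourceManagerSketch {
        CompletableFuture<String> deregisterApplication(String finalStatus);
    }

    /**
     * Models a stop path that needs a leading ResourceManager in the local process.
     * An empty Optional represents a process whose ResourceManager is not the leader.
     */
    static CompletableFuture<String> stopApplication(
            Optional<ResourceManagerSketch> localLeader, String finalStatus) {
        if (!localLeader.isPresent()) {
            // Assumption for this sketch: a missing leader surfaces as a failed future.
            CompletableFuture<String> failed = new CompletableFuture<>();
            failed.completeExceptionally(
                    new IllegalStateException("No leading ResourceManager in this process"));
            return failed;
        }
        return localLeader.get().deregisterApplication(finalStatus);
    }

    public static void main(String[] args) {
        ResourceManagerSketch leader =
                status -> CompletableFuture.completedFuture("deregistered with " + status);

        // The leader is local: deregistration succeeds.
        System.out.println(stopApplication(Optional.of(leader), "SUCCEEDED").join());

        // The leader runs in another process: the stop call completes exceptionally.
        System.out.println(
                stopApplication(Optional.empty(), "SUCCEEDED").isCompletedExceptionally());
    }
}
```

In this model the second call fails only because of where the leader happens to run, which is the situation that can arise with standby JobManager processes.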
I see two possible solutions:
1. Run the leader election process for the whole JobManager process
2. Move the registration/deregistration of the application out of the
{{ResourceManager}} so that it can be executed without a leader
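As a rough sketch of what option 2 could look like, the deregistration call could sit behind its own small interface that the stop path invokes directly. This is only an illustration under assumptions: {{ApplicationDeregistrar}}, {{RecordingDeregistrar}} and the method signatures are hypothetical and do not correspond to existing Flink APIs.

```java
import java.util.concurrent.CompletableFuture;

/** Sketch of option 2; all names are hypothetical and not part of Flink. */
final class LeaderIndependentDeregistrationSketch {

    /** Hypothetical service that deregisters the application without requiring leadership. */
    interface ApplicationDeregistrar {
        CompletableFuture<Void> deregisterApplication(String finalStatus, String diagnostics);
    }

    /** Records the call in memory so that the sketch runs without a cluster. */
    static final class RecordingDeregistrar implements ApplicationDeregistrar {
        volatile String lastStatus;

        @Override
        public CompletableFuture<Void> deregisterApplication(
                String finalStatus, String diagnostics) {
            lastStatus = finalStatus;
            return CompletableFuture.completedFuture(null);
        }
    }

    /** The stop path delegates to the deregistrar and never consults a leader. */
    static CompletableFuture<Void> stopApplication(
            ApplicationDeregistrar deregistrar, String finalStatus, String diagnostics) {
        return deregistrar.deregisterApplication(finalStatus, diagnostics);
    }

    public static void main(String[] args) {
        RecordingDeregistrar deregistrar = new RecordingDeregistrar();
        stopApplication(deregistrar, "SUCCEEDED", null).join();
        System.out.println("Deregistered with status " + deregistrar.lastStatus);
    }
}
```

The point of the separation is that the stop path no longer depends on which process currently holds {{ResourceManager}} leadership.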