[jira] [Commented] (FLINK-14010) Dispatcher & JobManagers don't give up leadership when AM is shut down

2019-10-10 Thread Till Rohrmann (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16948730#comment-16948730
 ] 

Till Rohrmann commented on FLINK-14010:
---

I think we introduced a Yarn test instability with this fix. FLINK-14347 looks 
as if we are reacting to an {{onShutdownRequest}} during the test clean up 
phase. Since we are calling the {{FatalExceptionHandler}} we terminate the 
process and log a fatal error in the logs which fails the test.

> Dispatcher & JobManagers don't give up leadership when AM is shut down
> --
>
> Key: FLINK-14010
> URL: https://issues.apache.org/jira/browse/FLINK-14010
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / YARN, Runtime / Coordination
>Affects Versions: 1.7.2, 1.8.2, 1.9.0, 1.10.0
>Reporter: Zili Chen
>Assignee: Zili Chen
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.10.0, 1.9.1, 1.8.3
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> In YARN deployment scenario, YARN RM possibly launches a new AM for the job 
> even if the previous AM does not terminated, for example, when AMRM heartbeat 
> timeout. This is a common case that RM will send a shutdown request to the 
> previous AM and expect the AM shutdown properly.
> However, currently in {{YARNResourceManager}}, we handle this request in 
> {{onShutdownRequest}} which simply close the {{YARNResourceManager}} *but not 
> Dispatcher and JobManagers*. Thus, Dispatcher and JobManager launched in new 
> AM cannot be granted leadership properly. Visually,
> on previous AM: Dispatcher leader, JM leaders
> on new AM: ResourceManager leader
> since on client side or in per-job mode, JobManager address and port are 
> configured as the new AM, the whole cluster goes into an unrecoverable 
> inconsistent status: client all queries the dispatcher on new AM who is now 
> the leader. Briefly, Dispatcher and JobManagers on previous AM do not give up 
> their leadership properly.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-14010) Dispatcher & JobManagers don't give up leadership when AM is shut down

2019-09-19 Thread TisonKun (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16933140#comment-16933140
 ] 

TisonKun commented on FLINK-14010:
--

Will send a pull request in hours :-)

> Dispatcher & JobManagers don't give up leadership when AM is shut down
> --
>
> Key: FLINK-14010
> URL: https://issues.apache.org/jira/browse/FLINK-14010
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / YARN, Runtime / Coordination
>Affects Versions: 1.7.2, 1.8.1, 1.9.0, 1.10.0
>Reporter: TisonKun
>Priority: Critical
>
> In YARN deployment scenario, YARN RM possibly launches a new AM for the job 
> even if the previous AM does not terminated, for example, when AMRM heartbeat 
> timeout. This is a common case that RM will send a shutdown request to the 
> previous AM and expect the AM shutdown properly.
> However, currently in {{YARNResourceManager}}, we handle this request in 
> {{onShutdownRequest}} which simply close the {{YARNResourceManager}} *but not 
> Dispatcher and JobManagers*. Thus, Dispatcher and JobManager launched in new 
> AM cannot be granted leadership properly. Visually,
> on previous AM: Dispatcher leader, JM leaders
> on new AM: ResourceManager leader
> since on client side or in per-job mode, JobManager address and port are 
> configured as the new AM, the whole cluster goes into an unrecoverable 
> inconsistent status: client all queries the dispatcher on new AM who is now 
> the leader. Briefly, Dispatcher and JobManagers on previous AM do not give up 
> their leadership properly.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-14010) Dispatcher & JobManagers don't give up leadership when AM is shut down

2019-09-19 Thread Till Rohrmann (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16933139#comment-16933139
 ] 

Till Rohrmann commented on FLINK-14010:
---

Ok, then let's do it like this. Do you have to time to work on this issue 
[~Tison]?

> Dispatcher & JobManagers don't give up leadership when AM is shut down
> --
>
> Key: FLINK-14010
> URL: https://issues.apache.org/jira/browse/FLINK-14010
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / YARN, Runtime / Coordination
>Affects Versions: 1.7.2, 1.8.1, 1.9.0, 1.10.0
>Reporter: TisonKun
>Priority: Critical
>
> In YARN deployment scenario, YARN RM possibly launches a new AM for the job 
> even if the previous AM does not terminated, for example, when AMRM heartbeat 
> timeout. This is a common case that RM will send a shutdown request to the 
> previous AM and expect the AM shutdown properly.
> However, currently in {{YARNResourceManager}}, we handle this request in 
> {{onShutdownRequest}} which simply close the {{YARNResourceManager}} *but not 
> Dispatcher and JobManagers*. Thus, Dispatcher and JobManager launched in new 
> AM cannot be granted leadership properly. Visually,
> on previous AM: Dispatcher leader, JM leaders
> on new AM: ResourceManager leader
> since on client side or in per-job mode, JobManager address and port are 
> configured as the new AM, the whole cluster goes into an unrecoverable 
> inconsistent status: client all queries the dispatcher on new AM who is now 
> the leader. Briefly, Dispatcher and JobManagers on previous AM do not give up 
> their leadership properly.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-14010) Dispatcher & JobManagers don't give up leadership when AM is shut down

2019-09-18 Thread TisonKun (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16933030#comment-16933030
 ] 

TisonKun commented on FLINK-14010:
--

[~trohrmann] Technically I agree that it is a valid solution. Give it another 
look I think we can complete shutdown future exceptionally "ResourceManager got 
closed when DispatcherResourceManagerComponent is running". It infers that the 
application goes into an UNKNOWN state so that the semantic is also correct.

> Dispatcher & JobManagers don't give up leadership when AM is shut down
> --
>
> Key: FLINK-14010
> URL: https://issues.apache.org/jira/browse/FLINK-14010
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / YARN, Runtime / Coordination
>Affects Versions: 1.7.2, 1.8.1, 1.9.0, 1.10.0
>Reporter: TisonKun
>Priority: Critical
>
> In YARN deployment scenario, YARN RM possibly launches a new AM for the job 
> even if the previous AM does not terminated, for example, when AMRM heartbeat 
> timeout. This is a common case that RM will send a shutdown request to the 
> previous AM and expect the AM shutdown properly.
> However, currently in {{YARNResourceManager}}, we handle this request in 
> {{onShutdownRequest}} which simply close the {{YARNResourceManager}} *but not 
> Dispatcher and JobManagers*. Thus, Dispatcher and JobManager launched in new 
> AM cannot be granted leadership properly. Visually,
> on previous AM: Dispatcher leader, JM leaders
> on new AM: ResourceManager leader
> since on client side or in per-job mode, JobManager address and port are 
> configured as the new AM, the whole cluster goes into an unrecoverable 
> inconsistent status: client all queries the dispatcher on new AM who is now 
> the leader. Briefly, Dispatcher and JobManagers on previous AM do not give up 
> their leadership properly.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-14010) Dispatcher & JobManagers don't give up leadership when AM is shut down

2019-09-18 Thread Till Rohrmann (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16932676#comment-16932676
 ] 

Till Rohrmann commented on FLINK-14010:
---

Can't we say that we always complete 
{{DispatcherResourceManagerComponent#shutDownFuture}} exceptionally if 
{{ResourceManager.getTerminationFuture()}} terminates while 
{{DispatcherResourceManagerComponent#isRunning}} is {{true}}? The contract 
could be that the {{ResourceManager}} always needs to run and if it stops, then 
this is an indicator that something went wrong and we should stop the 
{{ClusterEntrypoint}}. We could do this by completing 
{{DispatcherResourceManagerComponent#shutDownFuture}} exceptionally if 
{{DispatcherResourceManagerComponent#isRunning}} is {{true}}. However, one 
could similarly also simply call {{onFatalError}} from within the 
{{ResourceManager}} as you've initially proposed.

> Dispatcher & JobManagers don't give up leadership when AM is shut down
> --
>
> Key: FLINK-14010
> URL: https://issues.apache.org/jira/browse/FLINK-14010
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / YARN, Runtime / Coordination
>Affects Versions: 1.7.2, 1.8.1, 1.9.0, 1.10.0
>Reporter: TisonKun
>Priority: Critical
>
> In YARN deployment scenario, YARN RM possibly launches a new AM for the job 
> even if the previous AM does not terminated, for example, when AMRM heartbeat 
> timeout. This is a common case that RM will send a shutdown request to the 
> previous AM and expect the AM shutdown properly.
> However, currently in {{YARNResourceManager}}, we handle this request in 
> {{onShutdownRequest}} which simply close the {{YARNResourceManager}} *but not 
> Dispatcher and JobManagers*. Thus, Dispatcher and JobManager launched in new 
> AM cannot be granted leadership properly. Visually,
> on previous AM: Dispatcher leader, JM leaders
> on new AM: ResourceManager leader
> since on client side or in per-job mode, JobManager address and port are 
> configured as the new AM, the whole cluster goes into an unrecoverable 
> inconsistent status: client all queries the dispatcher on new AM who is now 
> the leader. Briefly, Dispatcher and JobManagers on previous AM do not give up 
> their leadership properly.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-14010) Dispatcher & JobManagers don't give up leadership when AM is shut down

2019-09-18 Thread TisonKun (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16932423#comment-16932423
 ] 

TisonKun commented on FLINK-14010:
--

I have thought of this. The problem is that when the situation described here 
happens, we actually complete {{ResourceManager#getTerminationFuture}} 
normally, which cannot be sourced that it comes from 
{{YarnResourceManager#onShutdownRequest}}.

If we achieve the function by using {{ResourceManager#getTerminationFuture}} to 
trigger the shut down of the {{DispatcherResourceManagerComponent}}, the 
assumption is:

If ResourceManager is closed first(since termination future completes normally 
in both cases, we cannot distinguish by {{whenComplete}}), it infers an 
exceptionally status so that we should complete 
{{DispatcherResourceManagerComponent#getShutDownFuture}} exceptionally. 
Otherwise ResourceManager closes normally by other triggers, and the either 
{{DispatcherResourceManagerComponent#getShutDownFuture}} is already completed 
or {{ClusterEntrypoint#shutdownAsync}} is guarded to be executed once.

I think this assumption is counter-intuitive that ResourceManager terminates 
"normally" but we complete shutdownFuture exceptionally.

> Dispatcher & JobManagers don't give up leadership when AM is shut down
> --
>
> Key: FLINK-14010
> URL: https://issues.apache.org/jira/browse/FLINK-14010
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / YARN, Runtime / Coordination
>Affects Versions: 1.7.2, 1.8.1, 1.9.0, 1.10.0
>Reporter: TisonKun
>Priority: Critical
>
> In YARN deployment scenario, YARN RM possibly launches a new AM for the job 
> even if the previous AM does not terminated, for example, when AMRM heartbeat 
> timeout. This is a common case that RM will send a shutdown request to the 
> previous AM and expect the AM shutdown properly.
> However, currently in {{YARNResourceManager}}, we handle this request in 
> {{onShutdownRequest}} which simply close the {{YARNResourceManager}} *but not 
> Dispatcher and JobManagers*. Thus, Dispatcher and JobManager launched in new 
> AM cannot be granted leadership properly. Visually,
> on previous AM: Dispatcher leader, JM leaders
> on new AM: ResourceManager leader
> since on client side or in per-job mode, JobManager address and port are 
> configured as the new AM, the whole cluster goes into an unrecoverable 
> inconsistent status: client all queries the dispatcher on new AM who is now 
> the leader. Briefly, Dispatcher and JobManagers on previous AM do not give up 
> their leadership properly.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-14010) Dispatcher & JobManagers don't give up leadership when AM is shut down

2019-09-18 Thread Till Rohrmann (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16932396#comment-16932396
 ] 

Till Rohrmann commented on FLINK-14010:
---

Couldn't we simply use {{ResourceManager#getTerminationFuture}} to trigger the 
shut down of the {{DispatcherResourceManagerComponent}} [~Tison]?

> Dispatcher & JobManagers don't give up leadership when AM is shut down
> --
>
> Key: FLINK-14010
> URL: https://issues.apache.org/jira/browse/FLINK-14010
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / YARN, Runtime / Coordination
>Affects Versions: 1.7.2, 1.8.1, 1.9.0, 1.10.0
>Reporter: TisonKun
>Priority: Critical
>
> In YARN deployment scenario, YARN RM possibly launches a new AM for the job 
> even if the previous AM does not terminated, for example, when AMRM heartbeat 
> timeout. This is a common case that RM will send a shutdown request to the 
> previous AM and expect the AM shutdown properly.
> However, currently in {{YARNResourceManager}}, we handle this request in 
> {{onShutdownRequest}} which simply close the {{YARNResourceManager}} *but not 
> Dispatcher and JobManagers*. Thus, Dispatcher and JobManager launched in new 
> AM cannot be granted leadership properly. Visually,
> on previous AM: Dispatcher leader, JM leaders
> on new AM: ResourceManager leader
> since on client side or in per-job mode, JobManager address and port are 
> configured as the new AM, the whole cluster goes into an unrecoverable 
> inconsistent status: client all queries the dispatcher on new AM who is now 
> the leader. Briefly, Dispatcher and JobManagers on previous AM do not give up 
> their leadership properly.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-14010) Dispatcher & JobManagers don't give up leadership when AM is shut down

2019-09-17 Thread TisonKun (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16931490#comment-16931490
 ] 

TisonKun commented on FLINK-14010:
--

Well, it's reasonable we try to gracefully shut down. I start to work on it but 
I'm not sure about what the future should look like.

There are two options in my mind, both of which introduce a {{shutdownFuture}} 
in {{ResourceManager}}.

1. {{ResourceManager#shutdownFuture}} is completed on 
{{YarnResourceManager#onShutdownRequest}} gets called. And we register callback 
in {{DispatcherResourceManagerComponent#registerShutDownFuture}}, when 
{{ResourceManager#shutdownFuture}} complete, we complete 
{{DispatcherResourceManagerComponent#shutDownFuture}} exceptionally. Concern 
here is that {{ResourceManager#shutdownFuture}} is never completed if 
{{YarnResourceManager#onShutdownRequest}} never gets called. I'm not sure if it 
is well.

2. {{ResourceManager#shutdownFuture}} is completed normally on 
{{ResourceManager#stopResourceManagerServices}} gets called, while completed 
exceptionally on {{YarnResourceManager#onShutdownRequest}} gets called. Also we 
register callback in 
{{DispatcherResourceManagerComponent#registerShutDownFuture}}, when 
{{ResourceManager#shutdownFuture}} complete exceptionally, we complete 
{{DispatcherResourceManagerComponent#shutDownFuture}} exceptionally; when when 
{{ResourceManager#shutdownFuture}} complete normally we do nothing. It might be 
a bit more complex than 1 and we should ensure that codepaths 
{{ResourceManager}} exit are all covered.

WDYT [~till.rohrmann]?

> Dispatcher & JobManagers don't give up leadership when AM is shut down
> --
>
> Key: FLINK-14010
> URL: https://issues.apache.org/jira/browse/FLINK-14010
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / YARN, Runtime / Coordination
>Affects Versions: 1.7.2, 1.8.1, 1.9.0, 1.10.0
>Reporter: TisonKun
>Priority: Critical
>
> In YARN deployment scenario, YARN RM possibly launches a new AM for the job 
> even if the previous AM does not terminated, for example, when AMRM heartbeat 
> timeout. This is a common case that RM will send a shutdown request to the 
> previous AM and expect the AM shutdown properly.
> However, currently in {{YARNResourceManager}}, we handle this request in 
> {{onShutdownRequest}} which simply close the {{YARNResourceManager}} *but not 
> Dispatcher and JobManagers*. Thus, Dispatcher and JobManager launched in new 
> AM cannot be granted leadership properly. Visually,
> on previous AM: Dispatcher leader, JM leaders
> on new AM: ResourceManager leader
> since on client side or in per-job mode, JobManager address and port are 
> configured as the new AM, the whole cluster goes into an unrecoverable 
> inconsistent status: client all queries the dispatcher on new AM who is now 
> the leader. Briefly, Dispatcher and JobManagers on previous AM do not give up 
> their leadership properly.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (FLINK-14010) Dispatcher & JobManagers don't give up leadership when AM is shut down

2019-09-17 Thread Till Rohrmann (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16931386#comment-16931386
 ] 

Till Rohrmann commented on FLINK-14010:
---

{{#onFatalError}} could also be an option but I would prefer to distinguish 
here. I would consider {{#onShutdownRequest}} as request and not an error case. 
Hence, I would suggest to try to gracefully shut down. If this does not work, 
then we could fail fatally.

> Dispatcher & JobManagers don't give up leadership when AM is shut down
> --
>
> Key: FLINK-14010
> URL: https://issues.apache.org/jira/browse/FLINK-14010
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / YARN, Runtime / Coordination
>Affects Versions: 1.7.2, 1.8.1, 1.9.0, 1.10.0
>Reporter: TisonKun
>Priority: Critical
>
> In YARN deployment scenario, YARN RM possibly launches a new AM for the job 
> even if the previous AM does not terminated, for example, when AMRM heartbeat 
> timeout. This is a common case that RM will send a shutdown request to the 
> previous AM and expect the AM shutdown properly.
> However, currently in {{YARNResourceManager}}, we handle this request in 
> {{onShutdownRequest}} which simply close the {{YARNResourceManager}} *but not 
> Dispatcher and JobManagers*. Thus, Dispatcher and JobManager launched in new 
> AM cannot be granted leadership properly. Visually,
> on previous AM: Dispatcher leader, JM leaders
> on new AM: ResourceManager leader
> since on client side or in per-job mode, JobManager address and port are 
> configured as the new AM, the whole cluster goes into an unrecoverable 
> inconsistent status: client all queries the dispatcher on new AM who is now 
> the leader. Briefly, Dispatcher and JobManagers on previous AM do not give up 
> their leadership properly.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (FLINK-14010) Dispatcher & JobManagers don't give up leadership when AM is shut down

2019-09-17 Thread TisonKun (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16931287#comment-16931287
 ] 

TisonKun commented on FLINK-14010:
--

[~till.rohrmann] Thanks to your explanation, I learn where the components 
layout comes from.

Back to this issue, what if we call {{#onFatalError}} in 
{{YarnResourceManager#onShutdownRequest}}? {{#onShutdownRequest}} is only 
called when AM exceptionally switched and we can regard it as a fatal error. 
For implementation details, it calls {{System.exit}} that correctly shutdown 
the AM and release leadership.

> Dispatcher & JobManagers don't give up leadership when AM is shut down
> --
>
> Key: FLINK-14010
> URL: https://issues.apache.org/jira/browse/FLINK-14010
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / YARN, Runtime / Coordination
>Affects Versions: 1.7.2, 1.8.1, 1.9.0, 1.10.0
>Reporter: TisonKun
>Priority: Critical
>
> In YARN deployment scenario, YARN RM possibly launches a new AM for the job 
> even if the previous AM does not terminated, for example, when AMRM heartbeat 
> timeout. This is a common case that RM will send a shutdown request to the 
> previous AM and expect the AM shutdown properly.
> However, currently in {{YARNResourceManager}}, we handle this request in 
> {{onShutdownRequest}} which simply close the {{YARNResourceManager}} *but not 
> Dispatcher and JobManagers*. Thus, Dispatcher and JobManager launched in new 
> AM cannot be granted leadership properly. Visually,
> on previous AM: Dispatcher leader, JM leaders
> on new AM: ResourceManager leader
> since on client side or in per-job mode, JobManager address and port are 
> configured as the new AM, the whole cluster goes into an unrecoverable 
> inconsistent status: client all queries the dispatcher on new AM who is now 
> the leader. Briefly, Dispatcher and JobManagers on previous AM do not give up 
> their leadership properly.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (FLINK-14010) Dispatcher & JobManagers don't give up leadership when AM is shut down

2019-09-09 Thread Till Rohrmann (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16925778#comment-16925778
 ] 

Till Rohrmann commented on FLINK-14010:
---

Thanks for reporting this issue [~Tison]. I think your analysis is correct that 
we currently do not react properly to {{onShutdownRequest}} signals. What 
happens is that we only terminate the {{ResourceManager}} but not the other 
Flink components ({{Dispatcher}} and {{JobMasters}}).

Before looking into a potential solutions let's answer your questions because 
they are important for understanding the bigger picture.

The two components {{Dispatcher}} and {{ResourceManager}} have been designed to 
run independently of the other component, potentially also in a separate 
process. The current implementation couples the {{Dispatcher}} with the 
{{JobMasters}} but there should be nothing fundamental preventing from 
decoupling them. The main thing one needs to do is to add logic to start 
{{JobMaster}} processes somewhere. So long story short, the idea is that all 
components can run independently (modulo the existing {{Dispatcher}} 
implementation).

At the moment we run all these components in one process because it was easier 
and less work to implement it this way. This is reflected in the 
{{DispatcherResourceManagerComponent}}. Hence, for the current implementation 
we can assume that if the {{ResourceManager}} terminates, then the other 
components should shut down as well. This could be implemented by registering 
in the {{DispatcherResourceManagerComponent}} a callback on the 
{{ResourceManager's}} termination future which triggers the closing of the 
{{DispatcherResourceManagerComponent}}.

Concerning using a single leader election service, I would say that this is not 
feasible if we still want to keep the option to deploy the components in 
separate processes eventually.

Would you have time to work on the fix for this problem [~Tison]?

> Dispatcher & JobManagers don't give up leadership when AM is shut down
> --
>
> Key: FLINK-14010
> URL: https://issues.apache.org/jira/browse/FLINK-14010
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / YARN, Runtime / Coordination
>Affects Versions: 1.7.2, 1.8.1, 1.9.0, 1.10.0
>Reporter: TisonKun
>Priority: Critical
>
> In YARN deployment scenario, YARN RM possibly launches a new AM for the job 
> even if the previous AM does not terminated, for example, when AMRM heartbeat 
> timeout. This is a common case that RM will send a shutdown request to the 
> previous AM and expect the AM shutdown properly.
> However, currently in {{YARNResourceManager}}, we handle this request in 
> {{onShutdownRequest}} which simply close the {{YARNResourceManager}} *but not 
> Dispatcher and JobManagers*. Thus, Dispatcher and JobManager launched in new 
> AM cannot be granted leadership properly. Visually,
> on previous AM: Dispatcher leader, JM leaders
> on new AM: ResourceManager leader
> since on client side or in per-job mode, JobManager address and port are 
> configured as the new AM, the whole cluster goes into an unrecoverable 
> inconsistent status: client all queries the dispatcher on new AM who is now 
> the leader. Briefly, Dispatcher and JobManagers on previous AM do not give up 
> their leadership properly.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (FLINK-14010) Dispatcher & JobManagers don't give up leadership when AM is shut down

2019-09-09 Thread TisonKun (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16925460#comment-16925460
 ] 

TisonKun commented on FLINK-14010:
--

CC [~StephanEwen] [~till.rohrmann] [~xiaogang.shi]

Here comes a high-level problem, do we explicitly constrain Dispatcher, 
ResourceManager and JobManagers run on one process?

1. the usage of reference to {{JobManagerGateway}} in Dispatcher already infers 
that we require this.
2. back to the design of FLIP-6, we have a global singleton of Dispatcher, and 
for each job, launch a JobManager and ResourceManager. The implementation 
diverges quite a lot. Could you please provide any background?
3. if we explicitly constrain as above, we actually need not to start leader 
election services per components, actually, we can use the abstraction and 
layout as below:

- start a leader election service per dispatcher-resource-manager component, in 
cluster entrypoint level. It will participant the election and all metadata 
commits are delegate to this service.
- all cluster level components that need to publish their address, such as 
Dispatcher, ResourceManager and WebMonitor publish their address via this 
leader election service.
- Actors can be started as {{PermanentlyFencedRpcEndpoint}} and thus we survive 
from handling a lot of mutable shared state among leadership epoch. 
Specifically, cluster entrypoint acts as DispatcherRunner and so on, like 
JobManagerRunner to JobMaster. See also [this 
branch|https://github.com/tillrohrmann/flink/commits/removeSuspendFromJobMaster].

- back to this issue, cluster entrypoint({{YARNClusterEntrypoint}} maybe) 
reacts to AMRM request and thus all components can be required to shutdown 
properly.

> Dispatcher & JobManagers don't give up leadership when AM is shut down
> --
>
> Key: FLINK-14010
> URL: https://issues.apache.org/jira/browse/FLINK-14010
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / YARN, Runtime / Coordination
>Affects Versions: 1.7.2, 1.8.1, 1.9.0, 1.10.0
>Reporter: TisonKun
>Priority: Critical
>
> In YARN deployment scenario, YARN RM possibly launches a new AM for the job 
> even if the previous AM does not terminated, for example, when AMRM heartbeat 
> timeout. This is a common case that RM will send a shutdown request to the 
> previous AM and expect the AM shutdown properly.
> However, currently in {{YARNResourceManager}}, we handle this request in 
> {{onShutdownRequest}} which simply close the {{YARNResourceManager}} *but not 
> Dispatcher and JobManagers*. Thus, Dispatcher and JobManager launched in new 
> AM cannot be granted leadership properly. Visually,
> on previous AM: Dispatcher leader, JM leaders
> on new AM: ResourceManager leader
> since on client side or in per-job mode, JobManager address and port are 
> configured as the new AM, the whole cluster goes into an unrecoverable 
> inconsistent status: client all queries the dispatcher on new AM who is now 
> the leader. Briefly, Dispatcher and JobManagers on previous AM do not give up 
> their leadership properly.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)