[jira] [Comment Edited] (FLINK-16279) Per job Yarn application leak in normal execution mode.

Till Rohrmann (Jira) Fri, 28 Feb 2020 04:43:40 -0800


    [ 
https://issues.apache.org/jira/browse/FLINK-16279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17046100#comment-17046100
 ]


Till Rohrmann edited comment on FLINK-16279 at 2/28/20 12:42 PM:
-----------------------------------------------------------------

[~wenlong.lwl] Do you start the Yarn per-job cluster in attached/detached mode?

If it is detached mode, I think the Flink cluster will always be destroyed 
after the job reaches a terminal state. In attached mode, when the client 
exits, I don't think we can guarantee that the Flink cluster will always be 
cleaned up successfully. The CLI option {{\-sae,\-\-shutdownOnAttachedExit}} 
can be used to make a best-effort clean-up.


was (Author: fly_in_gis):
[~wenlong.lwl] Do you start the Yarn per-job cluster in attached/detached mode?

If it is detached mode, I think the Flink cluster will always be destroyed 
after the job reaches a terminal state. In attached mode, when the client 
exits, I don't think we can guarantee that the Flink cluster will always be 
cleaned up successfully. The CLI option {{-sae,--shutdownOnAttachedExit}} can 
be used to make a best-effort clean-up.

> Per job Yarn application leak in normal execution mode.
> -------------------------------------------------------
>
>                 Key: FLINK-16279
>                 URL: https://issues.apache.org/jira/browse/FLINK-16279
>             Project: Flink
>          Issue Type: Bug
>          Components: Client / Job Submission, Runtime / Coordination
>    Affects Versions: 1.10.0
>            Reporter: Wenlong Lyu
>            Priority: Major
>
> I ran a job in YARN per-job mode using {{env.executeAsync}}; the job failed, 
> but the YARN cluster was not destroyed.
> After some research on the code, I found that when running in attached mode, 
> MiniDispatcher never completes {{shutDownFuture}} until it has received a 
> request from the job client:
> {code}
> if (executionMode == ClusterEntrypoint.ExecutionMode.NORMAL) {
>     // terminate the MiniDispatcher once we served the first JobResult successfully
>     jobResultFuture.thenAccept((JobResult result) -> {
>         ApplicationStatus status = result.getSerializedThrowable().isPresent() ?
>                 ApplicationStatus.FAILED : ApplicationStatus.SUCCEEDED;
>         LOG.debug("Shutting down per-job cluster because someone retrieved the job result.");
>         shutDownFuture.complete(status);
>     });
> }
> {code}
> However, when running in async mode (job submitted via env.executeAsync), 
> there may be no such request from the job client: once a user learns from 
> the job client that the job has failed, they may never request the result 
> again.
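
The behavior described in the report can be reproduced with a small stand-alone model. This is only a sketch under stated assumptions: the class MiniDispatcherSketch, its methods, and the boolean "failed" flag are hypothetical simplifications and are not Flink's actual MiniDispatcher. It assumes that in detached mode the shutdown future is completed as soon as the job reaches a terminal state, while in NORMAL (attached) mode it is completed only after a client has requested the job result.

```java
import java.util.concurrent.CompletableFuture;

/** Simplified, hypothetical model of the shutdown logic; not Flink's real class. */
class MiniDispatcherSketch {
    enum ExecutionMode { NORMAL, DETACHED }
    enum ApplicationStatus { SUCCEEDED, FAILED }

    private final ExecutionMode executionMode;
    // Completed when the job reaches a terminal state; true means the job failed.
    private final CompletableFuture<Boolean> jobFailedFuture = new CompletableFuture<>();
    // Completed when the cluster is allowed to shut down.
    final CompletableFuture<ApplicationStatus> shutDownFuture = new CompletableFuture<>();

    MiniDispatcherSketch(ExecutionMode executionMode) {
        this.executionMode = executionMode;
    }

    /** Called when the job finishes or fails. */
    void jobReachedTerminalState(boolean failed) {
        jobFailedFuture.complete(failed);
        if (executionMode == ExecutionMode.DETACHED) {
            // Assumption: in detached mode nobody fetches the result, so shut down now.
            shutDownFuture.complete(toStatus(failed));
        }
    }

    /** Called when a client asks for the job result. */
    CompletableFuture<Boolean> requestJobResult() {
        if (executionMode == ExecutionMode.NORMAL) {
            // Attached mode: shut down only once a result has been served.
            jobFailedFuture.thenAccept(failed -> shutDownFuture.complete(toStatus(failed)));
        }
        return jobFailedFuture;
    }

    private static ApplicationStatus toStatus(boolean failed) {
        return failed ? ApplicationStatus.FAILED : ApplicationStatus.SUCCEEDED;
    }

    public static void main(String[] args) {
        MiniDispatcherSketch attached = new MiniDispatcherSketch(ExecutionMode.NORMAL);
        attached.jobReachedTerminalState(true);
        System.out.println("attached, no result request: shut down = "
                + attached.shutDownFuture.isDone());

        attached.requestJobResult();
        System.out.println("attached, after result request: shut down = "
                + attached.shutDownFuture.isDone());

        MiniDispatcherSketch detached = new MiniDispatcherSketch(ExecutionMode.DETACHED);
        detached.jobReachedTerminalState(true);
        System.out.println("detached, no result request: shut down = "
                + detached.shutDownFuture.isDone());
    }
}
```

In this model the attached instance stays alive after the job has failed until requestJobResult() is called, which mirrors the leak: a client that submits asynchronously and never asks for the result leaves the shutdown future incomplete.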



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
