[jira] [Commented] (FLINK-8673) Don't let JobManagerRunner shut down itself

2018-02-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16368492#comment-16368492
 ] 

ASF GitHub Bot commented on FLINK-8673:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/5510


> Don't let JobManagerRunner shut down itself
> ---
>
> Key: FLINK-8673
> URL: https://issues.apache.org/jira/browse/FLINK-8673
> Project: Flink
> Issue Type: Improvement
> Components: Distributed Coordination
> Affects Versions: 1.5.0
> Reporter: Till Rohrmann
> Assignee: Till Rohrmann
> Priority: Major
> Labels: flip-6
> Fix For: 1.5.0
>
>
> Currently, the {{JobManagerRunner}} is allowed to shut itself down when a 
> job completes. This can cause problems when the {{Dispatcher}} receives a 
> request for a {{JobMaster}}: if the {{Dispatcher}} is not told that the 
> {{JobMaster}} has shut down, it might still try to send requests to it, 
> which will lead to timeouts.
> It would be better not to let the {{JobManagerRunner}} shut itself down 
> and to defer the shutdown to its owner (the {{Dispatcher}}). We can do this 
> by listening on the {{JobManagerRunner#resultFuture}}, which the 
> {{JobManagerRunner}} completes on successful job completion or on failure. 
> That way we could also get rid of the {{OnCompletionActions}} and the 
> {{FatalErrorHandler}}.
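
The owner-driven shutdown described in the issue can be sketched roughly as follows. This is a minimal illustration built on plain java.util.concurrent, not Flink's actual code: SketchRunner, SketchDispatcher, register and knows are hypothetical names, and the String job result is a placeholder type.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a runner; not Flink's JobManagerRunner.
class SketchRunner {
    // Completed with the job result, or exceptionally on failure.
    final CompletableFuture<String> resultFuture = new CompletableFuture<>();
    volatile boolean isShutDown = false;

    // Only the owner calls this; the runner never calls it on itself.
    void shutDown() {
        isShutDown = true;
    }
}

// Hypothetical stand-in for the owner; not Flink's Dispatcher.
class SketchDispatcher {
    private final Map<String, SketchRunner> runners = new ConcurrentHashMap<>();

    void register(String jobId, SketchRunner runner) {
        runners.put(jobId, runner);
        // React to success and failure alike: stop routing requests to the
        // runner first, then shut it down.
        runner.resultFuture.whenComplete((result, failure) -> {
            runners.remove(jobId);
            runner.shutDown();
        });
    }

    // True while requests for this job may still be forwarded.
    boolean knows(String jobId) {
        return runners.containsKey(jobId);
    }
}
```

Because the owner removes the runner from its own bookkeeping before shutting it down, it never forwards a request to a runner it has already stopped, which is the timeout scenario the issue describes.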



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-8673) Don't let JobManagerRunner shut down itself

2018-02-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16367391#comment-16367391
 ] 

ASF GitHub Bot commented on FLINK-8673:
---

GitHub user tillrohrmann opened a pull request:

https://github.com/apache/flink/pull/5510

[FLINK-8673] [flip6] Use JobManagerRunner#resultFuture for success and 
failure communication

## What is the purpose of the change

This commit removes the OnCompletionActions and FatalErrorHandler from the
JobManagerRunner. Instead, it communicates a successful job execution or the
failure case through the JobManagerRunner#resultFuture.

Furthermore, this commit no longer allows the JobManagerRunner to shut itself 
down. All shutdown logic must be triggered by the owner of the JobManagerRunner.
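
As a rough illustration of reporting both outcomes through a single future instead of separate completion and fatal-error callbacks, consider the sketch below. It is not the code of this PR: ResultFutureSketch, Runner, jobFinished, jobFailed and describe are hypothetical names, and the String result is a placeholder type.

```java
import java.util.concurrent.CompletableFuture;

class ResultFutureSketch {
    // Hypothetical runner: success and failure both travel through one future.
    static class Runner {
        private final CompletableFuture<String> resultFuture = new CompletableFuture<>();

        CompletableFuture<String> getResultFuture() {
            return resultFuture;
        }

        // Success path: complete the future with the job result.
        void jobFinished(String result) {
            resultFuture.complete(result);
        }

        // Failure path: complete the same future exceptionally.
        void jobFailed(Throwable cause) {
            resultFuture.completeExceptionally(cause);
        }
    }

    // Owner-side view: a single handler distinguishes the two outcomes.
    // join() blocks until the runner has reported an outcome.
    static String describe(Runner runner) {
        return runner.getResultFuture()
            .handle((result, failure) -> failure == null
                ? "finished: " + result
                : "failed: " + failure.getMessage())
            .join();
    }
}
```

With this shape the runner needs no reference to its owner; the owner decides what to do with each outcome, including when to trigger the shutdown.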

This PR is based on #5494.

## Verifying this change

This change added tests and can be verified as follows: 
`JobManagerRunnerTest`

## Does this pull request potentially affect one of the following parts:

  - Dependencies (does it add or upgrade a dependency): (no)
  - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
  - The serializers: (no)
  - The runtime per-record code paths (performance sensitive): (no)
  - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
  - The S3 file system connector: (no)

## Documentation

  - Does this pull request introduce a new feature? (no)
  - If yes, how is the feature documented? (not applicable)


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tillrohrmann/flink jobManagerRunnerShutdown

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/5510.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #5510


commit d07221f9c918757a301c37f86ceb72bf5bb2dd0a
Author: gyao 
Date:   2018-02-14T19:47:11Z

[FLINK-7711][flip6] Implement JarListHandler

This closes #5209.
This closes #5455.

commit 4a5dad9388d3ea655e045ec91aa9e1d60774100c
Author: zjureel 
Date:   2017-12-19T09:07:56Z

[FLINK-7857][flip6] Port JobVertexDetailsHandler to REST endpoint

commit 1fdb138fbf8057853008cf80d1ce44acf3af98b6
Author: Till Rohrmann 
Date:   2018-02-15T10:16:12Z

[FLINK-8612] [flip6] Enable non-detached job mode

The non-detached job mode waits until the JobResult of a completed job
has been served at least once before it terminates.

This closes #5435.

commit b53051a89e1c03396db25a38e7fe6fb3cb8bf16b
Author: gyao 
Date:   2018-02-15T10:32:15Z

[FLINK-7857][flip6] Return status 404 if JobVertex is unknown

This closes #5493.
This closes #5035.

commit dc98857e06dd70df97e86bd606da2121a6ff21e4
Author: Till Rohrmann 
Date:   2018-02-15T10:37:58Z

[FLINK-8662] [tests] Harden FutureUtilsTest#testRetryWithDelay

This commit moves the start of the time measurement before the triggering of
the retry with delay operation.

This closes #5494.

commit 1ad474f6820f729a9bc7bcdad26a41fd178c025e
Author: Till Rohrmann 
Date:   2018-02-16T14:04:32Z

[FLINK-8673] [flip6] Use JobManagerRunner#resultFuture for success and 
failure communication

This commit removes the OnCompletionActions and FatalErrorHandler from the
JobManagerRunner. Instead, it communicates a successful job execution or the
failure case through the JobManagerRunner#resultFuture.

Furthermore, this commit no longer allows the JobManagerRunner to shut itself 
down. All shutdown logic must be triggered by the owner of the JobManagerRunner.




> Don't let JobManagerRunner shut down itself
> ---
>
> Key: FLINK-8673
> URL: https://issues.apache.org/jira/browse/FLINK-8673
> Project: Flink
> Issue Type: Improvement
> Components: Distributed Coordination
> Affects Versions: 1.5.0
> Reporter: Till Rohrmann
> Assignee: Till Rohrmann
> Priority: Major
> Labels: flip-6
> Fix For: 1.5.0
>
>
> Currently, the {{JobManagerRunner}} is allowed to shut itself down when a 
> job completes. This can cause problems when the {{Dispatcher}} receives a 
> request for a {{JobMaster}}: if the {{Dispatcher}} is not told that the 
> {{JobMaster}} has shut down, it might still try to send requests to it, 
> which will lead to timeouts.
> It would be better not to let the {{JobManagerRunner}} shut itself down 
> and to defer the shutdown to its owner (the {{Dispatcher}}). We can do this by 
> listening on the