[
https://issues.apache.org/jira/browse/FLINK-5971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15899461#comment-15899461
]
ASF GitHub Bot commented on FLINK-5971:
---------------------------------------
GitHub user tillrohrmann opened a pull request:
https://github.com/apache/flink/pull/3488
[FLINK-5971] [flip-6] Add timeout for registered jobs on the ResourceManager
This PR introduces a timeout for inactive jobs on the ResourceManager. A job is
inactive if no active leader is known for it. If a job times out, it is removed
from the ResourceManager. Additionally, this PR removes the dependency of the
JobLeaderIdService on the RunningJobsRegistry.
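The timeout behaviour described above could be sketched roughly as follows.
This is only an illustration under assumptions, not the code in this PR: the
class InactiveJobTimeouts, its methods leaderLost/leaderFound, and the use of
plain String job IDs are hypothetical names chosen for the example. Each
scheduled timeout carries a token so that a timeout that was superseded (the
job regained a leader in the meantime) is ignored when it fires.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical sketch; not the actual JobLeaderIdService API. */
class InactiveJobTimeouts implements AutoCloseable {

    /** A scheduled timeout plus the token that identifies it. */
    private static final class Pending {
        final UUID token;
        final ScheduledFuture<?> future;

        Pending(UUID token, ScheduledFuture<?> future) {
            this.token = token;
            this.future = future;
        }
    }

    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
    private final Map<String, Pending> pending = new HashMap<>();
    private final long timeoutMillis;
    private final Consumer<String> onJobTimeout;

    InactiveJobTimeouts(long timeoutMillis, Consumer<String> onJobTimeout) {
        this.timeoutMillis = timeoutMillis;
        this.onJobTimeout = onJobTimeout;
    }

    /** The job has no known leader: (re)start its inactivity timeout. */
    synchronized void leaderLost(String jobId) {
        cancelPending(jobId);
        UUID token = UUID.randomUUID();
        ScheduledFuture<?> future = scheduler.schedule(
                () -> fire(jobId, token), timeoutMillis, TimeUnit.MILLISECONDS);
        pending.put(jobId, new Pending(token, future));
    }

    /** The job has an active leader again: it is no longer inactive. */
    synchronized void leaderFound(String jobId) {
        cancelPending(jobId);
    }

    private void cancelPending(String jobId) {
        Pending old = pending.remove(jobId);
        if (old != null) {
            old.future.cancel(false);
        }
    }

    private void fire(String jobId, UUID token) {
        synchronized (this) {
            Pending current = pending.get(jobId);
            if (current == null || !current.token.equals(token)) {
                return; // stale timeout: the job regained a leader or was re-armed
            }
            pending.remove(jobId);
        }
        // Called outside the lock; a real callback would remove the job.
        onJobTimeout.accept(jobId);
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

In this sketch the callback passed to the constructor stands in for removing
the job from the ResourceManager. Validating the token inside fire() guards
against a timeout task that had already started running when leaderFound()
was called.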
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/tillrohrmann/flink jobLeaderIdServiceTimeoutJobs
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/flink/pull/3488.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #3488
----
commit 6fb0e239921ca10bac218d62155b08fb96e3725d
Author: Till Rohrmann <[email protected]>
Date: 2017-03-06T15:57:43Z
[FLINK-5971] [flip-6] Add timeout for registered jobs on the ResourceManager
This PR introduces a timeout for inactive jobs on the ResourceManager. A job is
inactive if there is no active leader known for this job. In case that a job
times out, it will be removed from the ResourceManager. Additionally, this PR
removes the dependency of the JobLeaderIdService on the RunningJobsRegistry.
----
> JobLeaderIdService should time out registered jobs
> --------------------------------------------------
>
> Key: FLINK-5971
> URL: https://issues.apache.org/jira/browse/FLINK-5971
> Project: Flink
> Issue Type: Bug
> Components: Distributed Coordination
> Affects Versions: 1.3.0
> Reporter: Till Rohrmann
> Assignee: Till Rohrmann
> Labels: flip-6
>
> The {{JobLeaderIdService}} has no mechanism to time out inactive jobs. At the
> moment it relies on the {{RunningJobsRegistry}}, which only gives a heuristic
> answer.
> We should remove the {{RunningJobsRegistry}} and instead register a timeout
> for each job that has no job leader associated with it.