[
https://issues.apache.org/jira/browse/FLINK-1629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353475#comment-14353475
]
ASF GitHub Bot commented on FLINK-1629:
---------------------------------------
GitHub user rmetzger opened a pull request:
https://github.com/apache/flink/pull/468
[FLINK-1629][FLINK-1630][FLINK-1547] Rework Flink on YARN
The main change here is a reworked container scheduling logic in the YARN
ApplicationMaster.
This commit changes:
[FLINK-1629]: Users can now "fire and forget" jobs or YARN sessions to YARN
(detached mode).
[FLINK-1630]: Failed YARN containers are now reallocated during the
lifetime of a YARN session.
[FLINK-1547]: Users can now specify whether the ApplicationMaster (= the
JobManager = the entire YARN session) should restart on failure, and how often.
After the first restart, the session will behave like a detached session. There
is now backup of state between the old and the new AM.
The whole resource negotiation process between the RM and the AM has been
reworked.
Flink is now much more flexible when requesting new containers and when
giving back unneeded containers.
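As an illustration of what requesting replacements and giving back surplus containers can look like, here is a minimal sketch in plain Java. It is not taken from the Flink code base: the class name ContainerReconciler, its methods, and the failure budget are assumptions made up for this example, and no YARN API is called.

```java
// Hypothetical sketch: none of these names come from Flink or YARN.
final class ContainerReconciler {
    private final int targetContainers;    // containers the session should hold
    private final int maxFailedContainers; // assumed failure budget for the session
    private int running;                   // containers currently running
    private int pendingRequests;           // requested from the RM, not yet allocated
    private int failed;                    // container failures seen so far

    ContainerReconciler(int targetContainers, int maxFailedContainers) {
        this.targetContainers = targetContainers;
        this.maxFailedContainers = maxFailedContainers;
    }

    /** Number of additional containers to request from the RM right now. */
    int containersToRequest() {
        return Math.max(targetContainers - running - pendingRequests, 0);
    }

    /** Record that {@code count} container requests were sent to the RM. */
    void onRequested(int count) {
        pendingRequests += count;
    }

    /** The RM allocated {@code count} containers; returns how many are surplus and should be given back. */
    int onAllocated(int count) {
        pendingRequests = Math.max(pendingRequests - count, 0);
        int accepted = Math.min(count, targetContainers - running);
        running += accepted;
        return count - accepted;
    }

    /** A running container failed; returns false once the failure budget is exhausted. */
    boolean onContainerFailed() {
        if (running > 0) {
            running--;
        }
        failed++;
        return failed <= maxFailedContainers;
    }
}
```

In this sketch, a failed container lowers the running count, so the next call to containersToRequest() reports the gap that has to be filled, while onAllocated() tells the caller how many containers exceed the target and can be handed back.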
A new test case tests the container restart. It also verifies that the web
frontend is properly started, that logfile access is possible, and that the
configuration values the user specifies when starting the YARN session are
visible in the web frontend.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/rmetzger/flink flink-1630-rebased-final
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/flink/pull/468.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #468
----
commit ee02f92f609b2fb300c3e5af9bf75ea0745dff3b
Author: Robert Metzger
Date: 2015-03-05T14:03:05Z
[FLINK-1629][FLINK-1630][FLINK-1547] Add option to start Flink on YARN in
a detached mode. YARN container reallocation.
This commit changes:
[FLINK-1629]: Users can now "fire and forget" jobs or YARN sessions to
YARN (detached mode).
[FLINK-1630]: Failed YARN containers are now reallocated during the
lifetime of a YARN session.
[FLINK-1547]: Users can now specify whether the ApplicationMaster
(= the JobManager = the entire YARN session) should restart on failure, and
how often. After the first restart, the session will behave like a detached
session. There is now backup of state between the old and the new AM.
The whole resource negotiation process between the RM and the AM has
been reworked.
Flink is now much more flexible when requesting new containers and when
giving back unneeded containers.
A new test case tests the container restart. It also verifies
that the web frontend is properly started,
that logfile access is possible, and
that the configuration values the user specifies when starting the YARN
session are visible in the web frontend.
----
> Add option to start Flink on YARN in a detached mode
> ----------------------------------------------------
>
> Key: FLINK-1629
> URL: https://issues.apache.org/jira/browse/FLINK-1629
> Project: Flink
> Issue Type: Improvement
> Components: YARN Client
> Reporter: Robert Metzger
> Assignee: Robert Metzger
>
> Right now, we expect the YARN command line interface to be connected with the
> Application Master all the time to control the "yarn session" or the job.
> For very long-running sessions or jobs, users want to just "fire and forget" a
> job/session to YARN.
> Stopping the session will still be possible using YARN's tools.
> Also, prior to "detaching" itself, the CLI frontend could print the required
> command to kill the session as a convenience.
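The convenience message suggested in the issue could be produced along these lines. This is only a sketch: the class DetachHint and its method are hypothetical names, not part of the Flink CLI frontend, and the one assumption about YARN is that a session is stopped with the standard `yarn application -kill <applicationId>` command.

```java
// Hypothetical helper: the class and method names are invented for illustration.
final class DetachHint {
    /** Builds the YARN CLI command a user can run later to stop a detached session. */
    static String killCommand(String applicationId) {
        return "yarn application -kill " + applicationId;
    }

    /** Message a client could print just before detaching from the session. */
    static String detachMessage(String applicationId) {
        return "The session keeps running on YARN. To stop it, run:\n  "
                + killCommand(applicationId);
    }
}
```

A client would call DetachHint.detachMessage(...) with the application ID reported by YARN at submission time and print the result before exiting.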