GitHub user rmetzger opened a pull request:

    https://github.com/apache/flink/pull/468

     [FLINK-1629][FLINK-1630][FLINK-1547] Rework Flink on YARN

    The main change here is reworked container scheduling logic in the YARN 
ApplicationMaster.
    
    This commit changes:
    [FLINK-1629]: Users can now "fire and forget" jobs or YARN sessions to YARN 
(detached mode).
    [FLINK-1630]: Failed YARN containers are now reallocated during the 
lifetime of a YARN session.
    [FLINK-1547]: Users can now specify whether they want the ApplicationMaster 
(= the JobManager = the entire YARN session) to restart on failure, and how often. 
After the first restart, the session will behave like a detached session. There 
is now backup of state between the old and the new AM.
    
    The whole resource negotiation process between the RM and the AM has been 
reworked.
    Flink is now much more flexible when requesting new containers and when 
giving back unneeded containers.
    
    A new test case tests the container restart. It also verifies that the web 
frontend is properly started, that logfile access is possible, and that the 
configuration values the user specifies when starting the YARN session are 
visible in the web frontend.
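
    To illustrate the kind of bookkeeping involved in requesting new containers 
and giving back unneeded ones, here is a minimal sketch in Python. It is not the 
code in this pull request: the names `ContainerPlan` and `plan_containers`, and 
the idea of comparing a target count against running and pending containers, are 
assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class ContainerPlan:
    """Hypothetical result type: how many containers to ask for or hand back."""
    to_request: int
    to_release: int


def plan_containers(target: int, running: int, pending: int) -> ContainerPlan:
    """Decide how to move from the current container count to the target.

    target  -- containers the session should have (assumed to be user-configured)
    running -- containers currently alive
    pending -- containers already requested but not yet allocated
    """
    if min(target, running, pending) < 0:
        raise ValueError("container counts must be non-negative")

    # A failed container lowers `running`, which creates a shortfall.
    shortfall = target - (running + pending)
    if shortfall > 0:
        return ContainerPlan(to_request=shortfall, to_release=0)

    # Only running containers can be given back; surplus pending
    # requests are left alone in this simplified sketch.
    surplus = running - target
    return ContainerPlan(to_request=0, to_release=max(surplus, 0))
```

    For example, with a target of 4 containers and one of them lost, 
`plan_containers(4, 3, 0)` asks for one replacement, while 
`plan_containers(4, 6, 0)` hands two containers back.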

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/rmetzger/flink flink-1630-rebased-final

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/468.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #468
    
----
commit ee02f92f609b2fb300c3e5af9bf75ea0745dff3b
Author: Robert Metzger <[email protected]>
Date:   2015-03-05T14:03:05Z

     [FLINK-1629][FLINK-1630][FLINK-1547] Add option to start Flink on YARN in 
a detached mode. YARN container reallocation.
    
        This commit changes:
        [FLINK-1629]: Users can now "fire and forget" jobs or YARN sessions to 
YARN (detached mode).
        [FLINK-1630]: Failed YARN containers are now reallocated during the 
lifetime of a YARN session.
        [FLINK-1547]: Users can now specify whether they want the ApplicationMaster 
(= the JobManager = the entire YARN session) to restart on failure, and how 
often. After the first restart, the session will behave like a detached 
session. There is now backup of state between the old and the new AM.
    
        The whole resource negotiation process between the RM and the AM has 
been reworked.
        Flink is now much more flexible when requesting new containers and when 
giving back unneeded containers.
    
        A new test case tests the container restart. It also verifies 
that the web frontend is properly started,
        that logfile access is possible, and
        that the configuration values the user specifies when starting the YARN 
session are visible in the web frontend.

----


