Repository: flink
Updated Branches:
  refs/heads/master 7f9580061 -> d08b18971
[FLINK-4142][docs] Add warning about YARN HA bug

This closes #2255

Project: http://git-wip-us.apache.org/repos/asf/flink/repo
Commit: http://git-wip-us.apache.org/repos/asf/flink/commit/d08b1897
Tree: http://git-wip-us.apache.org/repos/asf/flink/tree/d08b1897
Diff: http://git-wip-us.apache.org/repos/asf/flink/diff/d08b1897

Branch: refs/heads/master
Commit: d08b189715d98c8c29f1c9489184d530bcb2af41
Parents: 7f95800
Author: Robert Metzger <rmetz...@apache.org>
Authored: Fri Jul 15 11:20:17 2016 +0200
Committer: zentol <ches...@apache.org>
Committed: Fri Jul 15 12:00:48 2016 +0200

----------------------------------------------------------------------
 docs/setup/jobmanager_high_availability.md | 2 ++
 1 file changed, 2 insertions(+)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/flink/blob/d08b1897/docs/setup/jobmanager_high_availability.md
----------------------------------------------------------------------
diff --git a/docs/setup/jobmanager_high_availability.md b/docs/setup/jobmanager_high_availability.md
index 5903dcb..c26ce5e 100644
--- a/docs/setup/jobmanager_high_availability.md
+++ b/docs/setup/jobmanager_high_availability.md
@@ -174,6 +174,8 @@ This means that the application can be restarted 10 times before YARN fails the
 - **YARN 2.4.0 < version < 2.6.0**. TaskManager containers are kept alive across application master failures. This has the advantage that the startup time is faster and that the user does not have to wait for obtaining the container resources again.
 - **YARN 2.6.0 <= version**: Sets the attempt failure validity interval to the Flinks' Akka timeout value. The attempt failure validity interval says that an application is only killed after the system has seen the maximum number of application attempts during one interval. This avoids that a long lasting job will deplete it's application attempts.
 
+<p style="border-radius: 5px; padding: 5px" class="bg-danger"><b>Note</b>: Hadoop YARN 2.4.0 has a major bug (fixed in 2.5.0) preventing container restarts from a restarted Application Master/Job Manager container. See <a href="https://issues.apache.org/jira/browse/FLINK-4142">FLINK-4142</a> for details. We recommend using at least Hadoop 2.5.0 for high availability setups on YARN.</p>
+
 #### Example: Highly Available YARN Session
 
 1. **Configure recovery mode and ZooKeeper quorum** in `conf/flink-conf.yaml`: