[jira] [Commented] (YARN-1696) Document RM HA
[ https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13987695#comment-13987695 ] Hudson commented on YARN-1696: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1774 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1774/]) YARN-1696. Added documentation for ResourceManager fail-over. Contributed by Karthik Kambatla, Masatake Iwasaki, Tsuyoshi OZAWA. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1591416) * /hadoop/common/trunk/hadoop-project/src/site/site.xml * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/ResourceManagerHA.apt.vm * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/resources/images * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/resources/images/rm-ha-overview.png > Document RM HA > -- > > Key: YARN-1696 > URL: https://issues.apache.org/jira/browse/YARN-1696 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Affects Versions: 2.3.0 >Reporter: Karthik Kambatla >Assignee: Tsuyoshi OZAWA >Priority: Blocker > Fix For: 2.4.1 > > Attachments: YARN-1676.5.patch, YARN-1696-3.patch, YARN-1696.2.patch, > YARN-1696.4.patch, YARN-1696.6.patch, rm-ha-overview.png, rm-ha-overview.svg, > yarn-1696-1.patch > > > Add documentation for RM HA. Marking this a blocker for 2.4 as this is > required to call RM HA Stable and ready for public consumption. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1696) Document RM HA
[ https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13987690#comment-13987690 ] Hudson commented on YARN-1696: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1748 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1748/]) YARN-1696. Added documentation for ResourceManager fail-over. Contributed by Karthik Kambatla, Masatake Iwasaki, Tsuyoshi OZAWA. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1591416) * /hadoop/common/trunk/hadoop-project/src/site/site.xml * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/ResourceManagerHA.apt.vm * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/resources/images * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/resources/images/rm-ha-overview.png
[jira] [Commented] (YARN-1696) Document RM HA
[ https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13987597#comment-13987597 ] Hudson commented on YARN-1696: -- FAILURE: Integrated in Hadoop-Yarn-trunk #557 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/557/]) YARN-1696. Added documentation for ResourceManager fail-over. Contributed by Karthik Kambatla, Masatake Iwasaki, Tsuyoshi OZAWA. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1591416) * /hadoop/common/trunk/hadoop-project/src/site/site.xml * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/ResourceManagerHA.apt.vm * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/resources/images * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/resources/images/rm-ha-overview.png
[jira] [Commented] (YARN-1696) Document RM HA
[ https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13986351#comment-13986351 ] Hudson commented on YARN-1696: -- SUCCESS: Integrated in Hadoop-trunk-Commit #5589 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5589/]) YARN-1696. Added documentation for ResourceManager fail-over. Contributed by Karthik Kambatla, Masatake Iwasaki, Tsuyoshi OZAWA. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1591416) * /hadoop/common/trunk/hadoop-project/src/site/site.xml * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/ResourceManagerHA.apt.vm * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/resources/images * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/resources/images/rm-ha-overview.png
[jira] [Commented] (YARN-1696) Document RM HA
[ https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13985067#comment-13985067 ] Hadoop QA commented on YARN-1696: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12642567/YARN-1696.6.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3660//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3660//console This message is automatically generated. 
[jira] [Commented] (YARN-1696) Document RM HA
[ https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983814#comment-13983814 ] Hadoop QA commented on YARN-1696: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12642369/YARN-1676.5.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3648//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3648//console This message is automatically generated.
[jira] [Commented] (YARN-1696) Document RM HA
[ https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983788#comment-13983788 ] Tsuyoshi OZAWA commented on YARN-1696: -- Updated the document based on Masatake's and Karthik's patch:
* Clarified the reason why there is no need to run a separate ZKFC daemon.
* Clarified what Web Services means.
* Removed sentences not related to fail-over (restart-related topics).
* Mentioned ConfiguredRMFailoverProxyProvider as the default value of RMFailoverProxyProvider.
* Mentioned FileSystemRMStateStore and ZKRMStateStore and clarified why ZKRMStateStore is preferred.
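For readers following the thread, the state-store and proxy-provider choices listed in this update might map onto a yarn-site.xml fragment roughly like the one below. This is a sketch, not taken from the patch itself: the property names and fully qualified class names are assumptions based on yarn-default.xml and should be checked against the release in use.

{code:xml}
<!-- Sketch only: check property and class names against your Hadoop release. -->
<configuration>
  <!-- Persist RM state so that a newly Active RM can recover it -->
  <property>
    <name>yarn.resourcemanager.recovery.enabled</name>
    <value>true</value>
  </property>
  <!-- ZooKeeper-based store, the one the updated doc describes as preferred -->
  <property>
    <name>yarn.resourcemanager.store.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
  </property>
  <!-- Default failover proxy provider; shown explicitly for illustration -->
  <property>
    <name>yarn.client.failover-proxy-provider</name>
    <value>org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider</value>
  </property>
</configuration>
{code}

Since ConfiguredRMFailoverProxyProvider is described above as the default, the last property would normally be left unset.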
[jira] [Commented] (YARN-1696) Document RM HA
[ https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983596#comment-13983596 ] Tsuyoshi OZAWA commented on YARN-1696: -- I'll try to update the doc based on Vinod's comments. Please feel free to take it back.
[jira] [Commented] (YARN-1696) Document RM HA
[ https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13978901#comment-13978901 ] Arun C Murthy commented on YARN-1696: - [~kasha] do you think we can get this in for 2.4.1? Tx.
[jira] [Commented] (YARN-1696) Document RM HA
[ https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13955331#comment-13955331 ] Karthik Kambatla commented on YARN-1696: [~acmurthy] - sorry, I was not checking email over the weekend. I can get to this today. Was caught up with other things and given there were other blockers, didn't rush on this.
[jira] [Commented] (YARN-1696) Document RM HA
[ https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13955015#comment-13955015 ] Arun C Murthy commented on YARN-1696: - [~kasha] - I'm almost done with rc0, moving this to 2.4.1 - if we need to spin rc1 we can get this in. Else, we can manually put this doc on the site when ready for 2.4.0. Thanks.
[jira] [Commented] (YARN-1696) Document RM HA
[ https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13954411#comment-13954411 ] Arun C Murthy commented on YARN-1696: - [~kasha] - You think you can update the doc w/ the feedback quick-ish? Thanks.
[jira] [Commented] (YARN-1696) Document RM HA
[ https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13948769#comment-13948769 ] Karthik Kambatla commented on YARN-1696:
bq. It's late, but after seeing the document, I think we should rename "yarn.resourcemanager.ha." configs to be "yarn.resourcemanager.failover."
I would prefer leaving it as ha. I understand your point that this is primarily failover and not HA in the true sense of the word. I think these configs would also apply to Active/Active and true HA, whenever we end up implementing that. Before naming the configs, I did spend some time thinking about it. I also vaguely remember a conversation with Bikas about future true HA possibilities.
[jira] [Commented] (YARN-1696) Document RM HA
[ https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13948759#comment-13948759 ] Vinod Kumar Vavilapalli commented on YARN-1696: --- Oh, and I think you are missing an entry in the side-bar menu for this new page.
[jira] [Commented] (YARN-1696) Document RM HA
[ https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13948751#comment-13948751 ] Vinod Kumar Vavilapalli commented on YARN-1696: --- Tx for the doc, Karthik. Some comments:
- Like I mentioned, fail-over is a big enough topic in itself, so let's split this into two and call this one the ResourceManager fail-over guide. We can have a top-level high-availability doc if we want to and link the two there.
- Let's move the state-store and RM restart stuff out.
- "the applications can resume from their last check-pointed state; e.g. completed map tasks in a MapReduce job are not re-run on a subsequent attempt" -> This is not related to fail-over. Let's put it in the restart doc.
- "Clients, ApplicationMasters (AMs) and NodeManagers (NMs) try connecting to the RMs in a round-robin fashion" -> Or point out that we have ConfigFailOverProvider as the default implementation of an abstraction?
- I think we should mention that even though there are two state-store impls, the suggested store is the ZK-based store for the sake of fencing.
- We should also document the client retry related configs.
- Should we give a very basic example configuration of two RMs? The absolute minimum required to enable this?

Unrelated to the docs:
- It's late, but after seeing the document, I think we should rename "yarn.resourcemanager.ha." configs to be "yarn.resourcemanager.failover.". What do others think? Also, "rm-ids" seems weird too.
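The "very basic example configuration of two RMs" asked for in this review could look roughly like the following yarn-site.xml fragment. It is a minimal sketch under stated assumptions, not the text that was committed: the ids rm1 and rm2, the cluster id, the host names and the ZooKeeper quorum are hypothetical placeholders, and the property names should be checked against yarn-default.xml for the release in use.

{code:xml}
<!-- Minimal two-RM fail-over sketch; all values are placeholders. -->
<configuration>
  <property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
  </property>
  <!-- Hypothetical cluster name -->
  <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>cluster1</value>
  </property>
  <!-- Logical ids of the two RMs (the "rm-ids" discussed in this thread) -->
  <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
  </property>
  <!-- One hostname entry per logical id; host names are hypothetical -->
  <property>
    <name>yarn.resourcemanager.hostname.rm1</name>
    <value>master1.example.com</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm2</name>
    <value>master2.example.com</value>
  </property>
  <!-- Hypothetical ZooKeeper quorum used for leader election and the state store -->
  <property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181</value>
  </property>
</configuration>
{code}

The same fragment would be deployed on both RM hosts; clients and NodeManagers read the rm-ids list to find the RMs to try.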
[jira] [Commented] (YARN-1696) Document RM HA
[ https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13948727#comment-13948727 ] Fengdong Yu commented on YARN-1696: --- The document is really good. Two minor comments:
{code}
+another RM is automatically elected to be the Active and takes over. Note
+that, there is no need to run a separate ZKFC daemon as is the case for
+HDFS.
{code}
This is a little unclear; we cannot assume HDFS HA is enabled. It could be: "Note that RM automatic failover shares the ZKFC with HDFS if HDFS HA is enabled, so there is no need to run a separate ZKFC daemon here."
{code}
+** Web Services
+
+  The web services automatically redirect to the Active.
{code}
"Web services" is too general and could confuse a new YARN user, so change it to "RM web UI services" or something more meaningful.
[jira] [Commented] (YARN-1696) Document RM HA
[ https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13948533#comment-13948533 ] Vinod Kumar Vavilapalli commented on YARN-1696: --- I think the long term goal is to unify them but we aren't there yet. There are users who want to enable RM restart only without failover today. Given that, I think we should keep them separate till we have a complete story. When things come together long term, we can link these together by using a top level doc. Even then, I think after we go through the remaining phases in RM restart, the doc at YARN-1017 will grow more and it's better to do what HDFS did - split into well-defined pages. This doc should instead be renamed "ResourceManager fail-over" and only focus on that for now.
[jira] [Commented] (YARN-1696) Document RM HA
[ https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13948510#comment-13948510 ] Tsuyoshi OZAWA commented on YARN-1696: -- s/go through/went through/
[jira] [Commented] (YARN-1696) Document RM HA
[ https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13948508#comment-13948508 ] Tsuyoshi OZAWA commented on YARN-1696: -- [~jianhe], [~kkambatl], I also go through the docs. IMO, +1 for Karthik's opinion - we should merge your two documents. From a user's point of view, these documents look very similar, and that can confuse readers.
[jira] [Commented] (YARN-1696) Document RM HA
[ https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13948493#comment-13948493 ] Karthik Kambatla commented on YARN-1696: Just went through the document there; it should be fairly straightforward to merge the two documents. Do you want to take a stab at it? Or else, I'll likely be able to take a look tomorrow morning.
[jira] [Commented] (YARN-1696) Document RM HA
[ https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13948480#comment-13948480 ] Jian He commented on YARN-1696: ---
bq. Do you think we could merge the two documents?
Uploaded the doc on YARN-1017; let's see whether we can merge them or not.
[jira] [Commented] (YARN-1696) Document RM HA
[ https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13948355#comment-13948355 ] Hadoop QA commented on YARN-1696: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12636976/YARN-1696.2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3466//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3466//console This message is automatically generated.
[jira] [Commented] (YARN-1696) Document RM HA
[ https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13948348#comment-13948348 ] Karthik Kambatla commented on YARN-1696: [~jianhe] - sorry, I wasn't aware of YARN-1017. IMO, we should just have a single document for RM availability. The way I see it, RM restart and RM failover are two parts of our availability story. Do you think we could merge the two documents?
[jira] [Commented] (YARN-1696) Document RM HA
[ https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13948339#comment-13948339 ] Jian He commented on YARN-1696: --- Hi Karthik, I have already written a document (YARN-1017) for RM Restart, which covers the general restart mechanism. There may be some overlap with this doc; I'll upload mine very soon today. Can you modify yours to minimize the overlap? Thanks!
[jira] [Commented] (YARN-1696) Document RM HA
[ https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13948327#comment-13948327 ] Xuan Gong commented on YARN-1696: - Uploaded a new patch to address my previous comment about the web services. Also added "By default, it is enabled only when HA is enabled." for yarn.resourcemanager.ha.automatic-failover.embedded and yarn.resourcemanager.ha.automatic-failover.enabled. [~kkambatl] Please check this.
[jira] [Commented] (YARN-1696) Document RM HA
[ https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13948313#comment-13948313 ] Xuan Gong commented on YARN-1696: - [~kkambatl] Only one thing: {code} +** Web Services + + The web services don't redirect to the Active yet. {code} I think YARN-1658 has already fixed this. Other than that, the documentation looks good to me. > Document RM HA > -- > > Key: YARN-1696 > URL: https://issues.apache.org/jira/browse/YARN-1696 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Affects Versions: 2.3.0 >Reporter: Karthik Kambatla >Assignee: Karthik Kambatla >Priority: Blocker > Attachments: yarn-1696-1.patch > > > Add documentation for RM HA. Marking this a blocker for 2.4 as this is > required to call RM HA Stable and ready for public consumption. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1696) Document RM HA
[ https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13948282#comment-13948282 ] Hadoop QA commented on YARN-1696: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12636962/yarn-1696-1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3465//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3465//console This message is automatically generated. > Document RM HA > -- > > Key: YARN-1696 > URL: https://issues.apache.org/jira/browse/YARN-1696 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Affects Versions: 2.3.0 >Reporter: Karthik Kambatla >Assignee: Karthik Kambatla >Priority: Blocker > Attachments: yarn-1696-1.patch > > > Add documentation for RM HA. Marking this a blocker for 2.4 as this is > required to call RM HA Stable and ready for public consumption. 
[jira] [Commented] (YARN-1696) Document RM HA
[ https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13948230#comment-13948230 ] Xuan Gong commented on YARN-1696: - [~kkambatl] Sure. I will do that. > Document RM HA > -- > > Key: YARN-1696 > URL: https://issues.apache.org/jira/browse/YARN-1696 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Affects Versions: 2.3.0 >Reporter: Karthik Kambatla >Assignee: Karthik Kambatla >Priority: Blocker > Attachments: yarn-1696-1.patch > > > Add documentation for RM HA. Marking this a blocker for 2.4 as this is > required to call RM HA Stable and ready for public consumption. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1696) Document RM HA
[ https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13943822#comment-13943822 ] Arun C Murthy commented on YARN-1696: - Thanks [~kkambatl]. In the worst case we can put your existing docs on jira if we can't get it in early next week and this is the only one blocking 2.4. > Document RM HA > -- > > Key: YARN-1696 > URL: https://issues.apache.org/jira/browse/YARN-1696 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Affects Versions: 2.3.0 >Reporter: Karthik Kambatla >Assignee: Karthik Kambatla >Priority: Blocker > > Add documentation for RM HA. Marking this a blocker for 2.4 as this is > required to call RM HA Stable and ready for public consumption. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1696) Document RM HA
[ https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13942102#comment-13942102 ] Karthik Kambatla commented on YARN-1696: I have some documentation written up - need to port it to the apt format. Have been caught up the last few weeks. Should have this ready by next Wednesday. > Document RM HA > -- > > Key: YARN-1696 > URL: https://issues.apache.org/jira/browse/YARN-1696 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Affects Versions: 2.3.0 >Reporter: Karthik Kambatla >Assignee: Karthik Kambatla >Priority: Blocker > > Add documentation for RM HA. Marking this a blocker for 2.4 as this is > required to call RM HA Stable and ready for public consumption. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1696) Document RM HA
[ https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13942098#comment-13942098 ] Arun C Murthy commented on YARN-1696: - [~kkambatl] Any update on this? Thanks. > Document RM HA > -- > > Key: YARN-1696 > URL: https://issues.apache.org/jira/browse/YARN-1696 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Affects Versions: 2.3.0 >Reporter: Karthik Kambatla >Assignee: Karthik Kambatla >Priority: Blocker > > Add documentation for RM HA. Marking this a blocker for 2.4 as this is > required to call RM HA Stable and ready for public consumption. -- This message was sent by Atlassian JIRA (v6.2#6252)