[jira] [Commented] (YARN-1696) Document RM HA

2014-05-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13987695#comment-13987695
 ] 

Hudson commented on YARN-1696:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1774 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1774/])
YARN-1696. Added documentation for ResourceManager fail-over. Contributed by 
Karthik Kambatla, Masatake Iwasaki, Tsuyoshi OZAWA. (vinodkv: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1591416)
* /hadoop/common/trunk/hadoop-project/src/site/site.xml
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/ResourceManagerHA.apt.vm
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/resources/images
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/resources/images/rm-ha-overview.png


> Document RM HA
> --
>
> Key: YARN-1696
> URL: https://issues.apache.org/jira/browse/YARN-1696
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: resourcemanager
> Affects Versions: 2.3.0
> Reporter: Karthik Kambatla
> Assignee: Tsuyoshi OZAWA
> Priority: Blocker
> Fix For: 2.4.1
>
> Attachments: YARN-1676.5.patch, YARN-1696-3.patch, YARN-1696.2.patch, 
> YARN-1696.4.patch, YARN-1696.6.patch, rm-ha-overview.png, rm-ha-overview.svg, 
> yarn-1696-1.patch
>
>
> Add documentation for RM HA. Marking this a blocker for 2.4 as this is 
> required to call RM HA Stable and ready for public consumption. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1696) Document RM HA

2014-05-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13987690#comment-13987690
 ] 

Hudson commented on YARN-1696:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #1748 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1748/])
YARN-1696. Added documentation for ResourceManager fail-over. Contributed by 
Karthik Kambatla, Masatake Iwasaki, Tsuyoshi OZAWA. (vinodkv: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1591416)
* /hadoop/common/trunk/hadoop-project/src/site/site.xml
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/ResourceManagerHA.apt.vm
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/resources/images
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/resources/images/rm-ha-overview.png




[jira] [Commented] (YARN-1696) Document RM HA

2014-05-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13987597#comment-13987597
 ] 

Hudson commented on YARN-1696:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #557 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/557/])
YARN-1696. Added documentation for ResourceManager fail-over. Contributed by 
Karthik Kambatla, Masatake Iwasaki, Tsuyoshi OZAWA. (vinodkv: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1591416)
* /hadoop/common/trunk/hadoop-project/src/site/site.xml
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/ResourceManagerHA.apt.vm
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/resources/images
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/resources/images/rm-ha-overview.png




[jira] [Commented] (YARN-1696) Document RM HA

2014-04-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13986351#comment-13986351
 ] 

Hudson commented on YARN-1696:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #5589 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/5589/])
YARN-1696. Added documentation for ResourceManager fail-over. Contributed by 
Karthik Kambatla, Masatake Iwasaki, Tsuyoshi OZAWA. (vinodkv: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1591416)
* /hadoop/common/trunk/hadoop-project/src/site/site.xml
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/ResourceManagerHA.apt.vm
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/resources/images
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/resources/images/rm-ha-overview.png




[jira] [Commented] (YARN-1696) Document RM HA

2014-04-29 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13985067#comment-13985067
 ] 

Hadoop QA commented on YARN-1696:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12642567/YARN-1696.6.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/3660//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3660//console

This message is automatically generated.

> Document RM HA
> --
>
> Key: YARN-1696
> URL: https://issues.apache.org/jira/browse/YARN-1696
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: resourcemanager
> Affects Versions: 2.3.0
> Reporter: Karthik Kambatla
> Assignee: Tsuyoshi OZAWA
> Priority: Blocker
> Attachments: YARN-1676.5.patch, YARN-1696-3.patch, YARN-1696.2.patch, 
> YARN-1696.4.patch, YARN-1696.6.patch, rm-ha-overview.png, rm-ha-overview.svg, 
> yarn-1696-1.patch
>
>
> Add documentation for RM HA. Marking this a blocker for 2.4 as this is 
> required to call RM HA Stable and ready for public consumption. 





[jira] [Commented] (YARN-1696) Document RM HA

2014-04-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983814#comment-13983814
 ] 

Hadoop QA commented on YARN-1696:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12642369/YARN-1676.5.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/3648//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3648//console

This message is automatically generated.

> Document RM HA
> --
>
> Key: YARN-1696
> URL: https://issues.apache.org/jira/browse/YARN-1696
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: resourcemanager
> Affects Versions: 2.3.0
> Reporter: Karthik Kambatla
> Assignee: Tsuyoshi OZAWA
> Priority: Blocker
> Attachments: YARN-1676.5.patch, YARN-1696-3.patch, YARN-1696.2.patch, 
> YARN-1696.4.patch, rm-ha-overview.png, rm-ha-overview.svg, yarn-1696-1.patch
>
>
> Add documentation for RM HA. Marking this a blocker for 2.4 as this is 
> required to call RM HA Stable and ready for public consumption. 





[jira] [Commented] (YARN-1696) Document RM HA

2014-04-28 Thread Tsuyoshi OZAWA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983788#comment-13983788
 ] 

Tsuyoshi OZAWA commented on YARN-1696:
--

Updated the document based on Masatake's and Karthik's patches:

* Clarified why there is no need to run a separate ZKFC daemon.
* Clarified what "Web Services" means.
* Removed sentences not related to fail-over (restart-related topics).
* Mentioned ConfiguredRMFailoverProxyProvider as the default implementation of 
RMFailoverProxyProvider.
* Mentioned FileSystemRMStateStore and ZKRMStateStore and clarified why 
ZKRMStateStore is preferred.
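
The round-robin behaviour behind the default failover proxy provider can be sketched roughly as below. This is only an illustration in Python, not Hadoop's Java implementation: the class `RoundRobinFailover`, the helper `call_with_failover`, the fake request, and the host names are all hypothetical.

```python
class RoundRobinFailover:
    """Illustrative stand-in for a failover proxy provider; not Hadoop's class."""

    def __init__(self, rm_addresses):
        if not rm_addresses:
            raise ValueError("at least one RM address is required")
        self._addresses = list(rm_addresses)
        self._index = 0

    def current(self):
        """Return the RM address that requests are currently sent to."""
        return self._addresses[self._index]

    def perform_failover(self):
        """Advance to the next configured RM, wrapping around at the end."""
        self._index = (self._index + 1) % len(self._addresses)
        return self.current()


def call_with_failover(provider, request, max_attempts):
    """Send request to the current RM; on ConnectionError, fail over and retry."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error = None
    for _ in range(max_attempts):
        try:
            return request(provider.current())
        except ConnectionError as err:
            last_error = err
            provider.perform_failover()
    raise last_error


def fake_request(address):
    """Hypothetical request: pretend rm1 is not the Active RM."""
    if address == "rm1.example.com:8032":
        raise ConnectionError("rm1 is standby or down")
    return "handled by " + address


provider = RoundRobinFailover(["rm1.example.com:8032", "rm2.example.com:8032"])
result = call_with_failover(provider, fake_request, max_attempts=4)
print(result)
```

The provider keeps pointing at the RM that last answered, so later calls do not start again from the first address.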





[jira] [Commented] (YARN-1696) Document RM HA

2014-04-28 Thread Tsuyoshi OZAWA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983596#comment-13983596
 ] 

Tsuyoshi OZAWA commented on YARN-1696:
--

I'll try to update the doc based on Vinod's comment. Please feel free to 
take it back.

> Document RM HA
> --
>
> Key: YARN-1696
> URL: https://issues.apache.org/jira/browse/YARN-1696
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: resourcemanager
> Affects Versions: 2.3.0
> Reporter: Karthik Kambatla
> Priority: Blocker
> Attachments: YARN-1696-3.patch, YARN-1696.2.patch, YARN-1696.4.patch, 
> rm-ha-overview.png, rm-ha-overview.svg, yarn-1696-1.patch
>
>
> Add documentation for RM HA. Marking this a blocker for 2.4 as this is 
> required to call RM HA Stable and ready for public consumption. 





[jira] [Commented] (YARN-1696) Document RM HA

2014-04-23 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13978901#comment-13978901
 ] 

Arun C Murthy commented on YARN-1696:
-

[~kasha] do you think we can get this in for 2.4.1? Tx.

> Document RM HA
> --
>
> Key: YARN-1696
> URL: https://issues.apache.org/jira/browse/YARN-1696
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: resourcemanager
> Affects Versions: 2.3.0
> Reporter: Karthik Kambatla
> Assignee: Karthik Kambatla
> Priority: Blocker
> Attachments: YARN-1696.2.patch, yarn-1696-1.patch
>
>
> Add documentation for RM HA. Marking this a blocker for 2.4 as this is 
> required to call RM HA Stable and ready for public consumption. 





[jira] [Commented] (YARN-1696) Document RM HA

2014-03-31 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13955331#comment-13955331
 ] 

Karthik Kambatla commented on YARN-1696:


[~acmurthy] - sorry, I was not checking email over the weekend. I can get to 
this today. Was caught up with other things and given there were other 
blockers, didn't rush on this. 



[jira] [Commented] (YARN-1696) Document RM HA

2014-03-31 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13955015#comment-13955015
 ] 

Arun C Murthy commented on YARN-1696:
-

[~kasha] - I'm almost done with rc0, moving this to 2.4.1 - if we need to spin 
rc1 we can get this in. Else, we can manually put this doc on the site when 
ready for 2.4.0. Thanks.



[jira] [Commented] (YARN-1696) Document RM HA

2014-03-29 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13954411#comment-13954411
 ] 

Arun C Murthy commented on YARN-1696:
-

[~kasha] - You think you can update the doc w/ the feedback quick-ish? Thanks.



[jira] [Commented] (YARN-1696) Document RM HA

2014-03-26 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13948769#comment-13948769
 ] 

Karthik Kambatla commented on YARN-1696:


bq. It's late, but after seeing the document, I think we should rename 
"yarn.resourcemanager.ha." configs to be "yarn.resourcemanager.failover."
I would prefer leaving it as ha. I understand your point that this is 
primarily failover and not HA in the true sense of the word. I think these 
configs would also apply to Active/Active and true HA, whenever we end up 
implementing that. Before naming the configs, I did spend some time thinking 
about it. I also vaguely remember a conversation with Bikas about future true 
HA possibilities.




[jira] [Commented] (YARN-1696) Document RM HA

2014-03-26 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13948759#comment-13948759
 ] 

Vinod Kumar Vavilapalli commented on YARN-1696:
---

Oh, and I think you are missing an entry in the side-bar menu for this new 
page.



[jira] [Commented] (YARN-1696) Document RM HA

2014-03-26 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13948751#comment-13948751
 ] 

Vinod Kumar Vavilapalli commented on YARN-1696:
---

Tx for the doc, Karthik. Some comments:
 - Like I mentioned, fail-over is a big enough topic in itself, so let's 
split this into two and call this one the ResourceManager fail-over guide. 
We can have a top-level high-availability doc if we want to and link the two 
there.
 - Let's move the state-store and RM restart stuff out.
 - "the applications can resume from their last check-pointed state; e.g. 
completed map tasks in a MapReduce job are not re-run on a subsequent attempt" 
-> This is not related to fail-over. Let's put it in the restart doc.
 - "Clients, ApplicationMasters (AMs) and NodeManagers (NMs) try connecting to 
the RMs in a round-robin fashion" -> Or point out that we have 
ConfigFailOverProvider as the default implementation of an abstraction?
 - I think we should mention that even though there are two state-store impls, 
the suggested store is the ZK-based store for the sake of fencing.
 - We should also document the client retry related configs.
 - Should we give a very basic example configuration of two RMs? The absolute 
minimum required to enable this?
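
For the last point, a minimal two-RM configuration might look roughly like the output of the sketch below. It is a Python helper (hypothetical name `minimal_rm_ha_site`) that renders yarn-site.xml properties. The property names are the ones I understand the RM HA feature to use and should be checked against the release in question; the host names, ZooKeeper quorum, and cluster id are made-up placeholders.

```python
from xml.etree import ElementTree as ET


def minimal_rm_ha_site(rm_hosts, zk_quorum, cluster_id):
    """Build a yarn-site.xml tree with a small set of properties for RM HA.

    rm_hosts maps a logical RM id (e.g. "rm1") to its host name.
    All values passed in by the caller below are placeholders.
    """
    props = {
        "yarn.resourcemanager.ha.enabled": "true",
        "yarn.resourcemanager.cluster-id": cluster_id,
        "yarn.resourcemanager.ha.rm-ids": ",".join(rm_hosts),
        "yarn.resourcemanager.zk-address": zk_quorum,
    }
    # One hostname property per logical RM id.
    for rm_id, host in rm_hosts.items():
        props["yarn.resourcemanager.hostname." + rm_id] = host

    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return root


site = minimal_rm_ha_site(
    {"rm1": "master1.example.com", "rm2": "master2.example.com"},
    "zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181",
    "cluster1",
)
xml_text = ET.tostring(site, encoding="unicode")
print(xml_text)
```

Client retry settings and the state-store choice are deliberately left out here, since the comment asks only for the absolute minimum.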

Unrelated to the docs
 - It's late, but after seeing the document, I think we should rename 
"yarn.resourcemanager.ha." configs to be "yarn.resourcemanager.failover.". What 
do others think? Also, "rm-ids" seems weird too.



[jira] [Commented] (YARN-1696) Document RM HA

2014-03-26 Thread Fengdong Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13948727#comment-13948727
 ] 

Fengdong Yu commented on YARN-1696:
---

The document is really good.
Two minor comments:

{code}
+another RM is automatically elected to be the Active and takes over. Note
+that, there is no need to run a separate ZKFC daemon as is the case for
+HDFS.
{code}

A little bit unclear: we cannot assume HDFS HA is enabled. So it could be "Note 
that RM automatic failover shares ZKFC with HDFS if HDFS HA is enabled, so 
there is no need to run a separate ZKFC daemon here."

{code}
+** Web Services
+
+   The web services automatically redirect to the Active.
{code}

"Web services" is too general and could confuse a new YARN user, so just 
change it to "RM web UI services" or something more meaningful.




[jira] [Commented] (YARN-1696) Document RM HA

2014-03-26 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13948533#comment-13948533
 ] 

Vinod Kumar Vavilapalli commented on YARN-1696:
---

I think the long term goal is to unify them but we aren't there yet. There are 
users who want to enable RM restart only without failover today. Given that, I 
think we should keep them separate till we have a complete story. When things 
come together long term, we can link these together by using a top level doc. 
Even then, I think after we go through the remaining phases in RM restart, the 
doc at YARN-1017 will grow more and it's better to do what HDFS did - split 
into well-defined pages.

This doc should instead be renamed "ResourceManager fail-over" and only focus 
on that for now.



[jira] [Commented] (YARN-1696) Document RM HA

2014-03-26 Thread Tsuyoshi OZAWA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13948510#comment-13948510
 ] 

Tsuyoshi OZAWA commented on YARN-1696:
--

s/go through/went through/



[jira] [Commented] (YARN-1696) Document RM HA

2014-03-26 Thread Tsuyoshi OZAWA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13948508#comment-13948508
 ] 

Tsuyoshi OZAWA commented on YARN-1696:
--

[~jianhe], [~kkambatl], I also go through the docs. IMO, +1 for Karthik's 
opinion - we should merge your two documents. From a user's point of view, these 
documents look very similar, which can confuse readers.



[jira] [Commented] (YARN-1696) Document RM HA

2014-03-26 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13948493#comment-13948493
 ] 

Karthik Kambatla commented on YARN-1696:


Just went through the document there; it should be fairly straightforward to 
merge the two documents. Do you want to take a stab at it? Otherwise, I'll 
likely be able to take a look tomorrow morning. 



[jira] [Commented] (YARN-1696) Document RM HA

2014-03-26 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13948480#comment-13948480
 ] 

Jian He commented on YARN-1696:
---

bq. Do you think we could merge the two documents?
Uploaded the doc on YARN-1017; let's see whether we can merge or not.



[jira] [Commented] (YARN-1696) Document RM HA

2014-03-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13948355#comment-13948355
 ] 

Hadoop QA commented on YARN-1696:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12636976/YARN-1696.2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/3466//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3466//console

This message is automatically generated.



[jira] [Commented] (YARN-1696) Document RM HA

2014-03-26 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13948348#comment-13948348
 ] 

Karthik Kambatla commented on YARN-1696:


[~jianhe] - sorry, I wasn't aware of YARN-1017. 

IMO, we should just have a single document for RM availability. The way I see 
it, RM restart and RM failover are two parts of our availability story. Do you 
think we could merge the two documents?

> Document RM HA
> --
>
> Key: YARN-1696
> URL: https://issues.apache.org/jira/browse/YARN-1696
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Affects Versions: 2.3.0
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
>Priority: Blocker
> Attachments: YARN-1696.2.patch, yarn-1696-1.patch
>
>
> Add documentation for RM HA. Marking this a blocker for 2.4 as this is 
> required to call RM HA Stable and ready for public consumption. 





[jira] [Commented] (YARN-1696) Document RM HA

2014-03-26 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13948339#comment-13948339
 ] 

Jian He commented on YARN-1696:
---

Hi Karthik, I have already written a document (YARN-1017) for RM Restart that 
covers the general restart mechanism. There may be some overlap with this 
doc; I'll upload mine later today. Could you adjust yours to minimize the 
overlap? Thanks!

> Document RM HA
> --
>
> Key: YARN-1696
> URL: https://issues.apache.org/jira/browse/YARN-1696
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: resourcemanager
> Affects Versions: 2.3.0
> Reporter: Karthik Kambatla
> Assignee: Karthik Kambatla
> Priority: Blocker
> Attachments: YARN-1696.2.patch, yarn-1696-1.patch
>
>
> Add documentation for RM HA. Marking this a blocker for 2.4 as this is 
> required to call RM HA Stable and ready for public consumption. 





[jira] [Commented] (YARN-1696) Document RM HA

2014-03-26 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13948327#comment-13948327
 ] 

Xuan Gong commented on YARN-1696:
-

Uploaded a new patch to address my previous comment about the web services.
I also added "By default, it is enabled only when HA is enabled." for 
yarn.resourcemanager.ha.automatic-failover.embedded and 
yarn.resourcemanager.ha.automatic-failover.enabled. 

[~kkambatl] Please check this.
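
The sentence added for the two automatic-failover properties can be read as a 
small rule: each flag takes effect only when HA itself is turned on. The sketch 
below is an illustrative model of that wording, not Hadoop's implementation. The 
HA switch name (yarn.resourcemanager.ha.enabled), its "off" default, the "on" 
defaults of the two flags, and the helper names are assumptions that are not 
stated in this thread.

```python
# Illustrative model only: this is NOT Hadoop's implementation.
# Assumed name of the HA switch; it is not quoted in this thread.
HA_ENABLED = "yarn.resourcemanager.ha.enabled"
# The two properties named in the comment above.
AUTO_FAILOVER_ENABLED = "yarn.resourcemanager.ha.automatic-failover.enabled"
AUTO_FAILOVER_EMBEDDED = "yarn.resourcemanager.ha.automatic-failover.embedded"


def as_bool(value, default):
    """Parse a config string such as "true"/"false"; use default when unset."""
    if value is None:
        return default
    return str(value).strip().lower() == "true"


def effective_failover_settings(conf):
    """Return the effective value of both flags for a dict of raw settings.

    Hypothetical helper: models "By default, it is enabled only when HA is
    enabled" by gating each flag on the HA switch.
    """
    ha = as_bool(conf.get(HA_ENABLED), False)  # assumption: HA is off unless set
    auto = ha and as_bool(conf.get(AUTO_FAILOVER_ENABLED), True)
    embedded = ha and as_bool(conf.get(AUTO_FAILOVER_EMBEDDED), True)
    return {AUTO_FAILOVER_ENABLED: auto, AUTO_FAILOVER_EMBEDDED: embedded}


if __name__ == "__main__":
    # With nothing configured, HA is off, so neither flag takes effect.
    print(effective_failover_settings({}))
    # Turning HA on lets both flags fall back to their assumed "on" default.
    print(effective_failover_settings({HA_ENABLED: "true"}))
```

Under these assumptions, an operator who only sets the HA switch gets automatic 
failover without touching either flag, and an explicit "false" on a flag still 
wins once HA is on.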

> Document RM HA
> --
>
> Key: YARN-1696
> URL: https://issues.apache.org/jira/browse/YARN-1696
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: resourcemanager
> Affects Versions: 2.3.0
> Reporter: Karthik Kambatla
> Assignee: Karthik Kambatla
> Priority: Blocker
> Attachments: YARN-1696.2.patch, yarn-1696-1.patch
>
>
> Add documentation for RM HA. Marking this a blocker for 2.4 as this is 
> required to call RM HA Stable and ready for public consumption. 





[jira] [Commented] (YARN-1696) Document RM HA

2014-03-26 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13948313#comment-13948313
 ] 

Xuan Gong commented on YARN-1696:
-

[~kkambatl] Only one thing:
{code}
+** Web Services
+
+   The web services don't redirect to the Active yet.
{code}

I think YARN-1658 has already fixed this.

Other than that, the documentation looks good to me.



> Document RM HA
> --
>
> Key: YARN-1696
> URL: https://issues.apache.org/jira/browse/YARN-1696
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: resourcemanager
> Affects Versions: 2.3.0
> Reporter: Karthik Kambatla
> Assignee: Karthik Kambatla
> Priority: Blocker
> Attachments: yarn-1696-1.patch
>
>
> Add documentation for RM HA. Marking this a blocker for 2.4 as this is 
> required to call RM HA Stable and ready for public consumption. 





[jira] [Commented] (YARN-1696) Document RM HA

2014-03-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13948282#comment-13948282
 ] 

Hadoop QA commented on YARN-1696:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12636962/yarn-1696-1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/3465//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3465//console

This message is automatically generated.

> Document RM HA
> --
>
> Key: YARN-1696
> URL: https://issues.apache.org/jira/browse/YARN-1696
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: resourcemanager
> Affects Versions: 2.3.0
> Reporter: Karthik Kambatla
> Assignee: Karthik Kambatla
> Priority: Blocker
> Attachments: yarn-1696-1.patch
>
>
> Add documentation for RM HA. Marking this a blocker for 2.4 as this is 
> required to call RM HA Stable and ready for public consumption. 





[jira] [Commented] (YARN-1696) Document RM HA

2014-03-26 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13948230#comment-13948230
 ] 

Xuan Gong commented on YARN-1696:
-

[~kkambatl] Sure. I will do that.

> Document RM HA
> --
>
> Key: YARN-1696
> URL: https://issues.apache.org/jira/browse/YARN-1696
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: resourcemanager
> Affects Versions: 2.3.0
> Reporter: Karthik Kambatla
> Assignee: Karthik Kambatla
> Priority: Blocker
> Attachments: yarn-1696-1.patch
>
>
> Add documentation for RM HA. Marking this a blocker for 2.4 as this is 
> required to call RM HA Stable and ready for public consumption. 





[jira] [Commented] (YARN-1696) Document RM HA

2014-03-21 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13943822#comment-13943822
 ] 

Arun C Murthy commented on YARN-1696:
-

Thanks [~kkambatl]. In the worst case we can put your existing docs on jira if 
we can't get it in early next week and this is the only one blocking 2.4.

> Document RM HA
> --
>
> Key: YARN-1696
> URL: https://issues.apache.org/jira/browse/YARN-1696
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: resourcemanager
> Affects Versions: 2.3.0
> Reporter: Karthik Kambatla
> Assignee: Karthik Kambatla
> Priority: Blocker
>
> Add documentation for RM HA. Marking this a blocker for 2.4 as this is 
> required to call RM HA Stable and ready for public consumption. 





[jira] [Commented] (YARN-1696) Document RM HA

2014-03-20 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13942102#comment-13942102
 ] 

Karthik Kambatla commented on YARN-1696:


I have some documentation written up, but I still need to port it to the APT 
format. I have been tied up for the last few weeks; I should have this ready by 
next Wednesday. 

> Document RM HA
> --
>
> Key: YARN-1696
> URL: https://issues.apache.org/jira/browse/YARN-1696
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: resourcemanager
> Affects Versions: 2.3.0
> Reporter: Karthik Kambatla
> Assignee: Karthik Kambatla
> Priority: Blocker
>
> Add documentation for RM HA. Marking this a blocker for 2.4 as this is 
> required to call RM HA Stable and ready for public consumption. 





[jira] [Commented] (YARN-1696) Document RM HA

2014-03-20 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13942098#comment-13942098
 ] 

Arun C Murthy commented on YARN-1696:
-

[~kkambatl] Any update on this? Thanks.

> Document RM HA
> --
>
> Key: YARN-1696
> URL: https://issues.apache.org/jira/browse/YARN-1696
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: resourcemanager
> Affects Versions: 2.3.0
> Reporter: Karthik Kambatla
> Assignee: Karthik Kambatla
> Priority: Blocker
>
> Add documentation for RM HA. Marking this a blocker for 2.4 as this is 
> required to call RM HA Stable and ready for public consumption. 


