[jira] [Commented] (YARN-3266) RMContext inactiveNodes should have NodeId as map key

2015-04-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14496385#comment-14496385
 ] 

Hudson commented on YARN-3266:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2114 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2114/])
YARN-3266. RMContext#inactiveNodes should have NodeId as map key. Contributed 
by Chengbing Liu (jianhe: rev b46ee1e7a31007985b88072d9af3d97c33a261a7)
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMActiveServiceContext.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesNodes.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebApp.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMContextImpl.java
* hadoop-yarn-project/CHANGES.txt
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMContext.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMNodeTransitions.java


> RMContext inactiveNodes should have NodeId as map key
> -
>
> Key: YARN-3266
> URL: https://issues.apache.org/jira/browse/YARN-3266
> Project: Hadoop YARN
> Issue Type: Bug
> Components: resourcemanager
> Affects Versions: 2.6.0
> Reporter: Chengbing Liu
> Assignee: Chengbing Liu
> Fix For: 2.8.0
>
> Attachments: YARN-3266.01.patch, YARN-3266.02.patch, YARN-3266.03.patch
>
>
> Under the default NM port configuration, which is 0, we have observed that in
> the current version the "lost nodes" count can be greater than the length of
> the lost node list. This happens when the same NM is restarted twice in a row:
> * NM started at port 10001
> * NM restarted at port 10002
> * NM restarted at port 10003
> * NM:10001 times out: {{ClusterMetrics#incrNumLostNMs()}}, # lost node=1;
> {{rmNode.context.getInactiveRMNodes().put(rmNode.nodeId.getHost(), rmNode)}},
> {{inactiveNodes}} has 1 element
> * NM:10002 times out: {{ClusterMetrics#incrNumLostNMs()}}, # lost node=2;
> {{rmNode.context.getInactiveRMNodes().put(rmNode.nodeId.getHost(), rmNode)}},
> {{inactiveNodes}} still has 1 element
>
> Since we allow multiple NodeManagers on one host (as discussed in YARN-1888),
> {{inactiveNodes}} should be of type {{ConcurrentMap<NodeId, RMNode>}}. If
> this would break the current API, then the key string should include the NM's
> port as well.
> Thoughts?
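
The mismatch described in the issue can be reproduced outside of YARN with a
small sketch. It is not the committed patch: the NodeId class below is a
hypothetical stand-in for YARN's NodeId (just a host and a port), the map
values are placeholders rather than RMNode instances, and the helper names
sizeKeyedByHost and sizeKeyedByNodeId are made up for illustration. It only
shows why a host-only key collapses two lost NMs into one entry while a
host-plus-port key keeps both.

```java
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class InactiveNodesDemo {
    /** Hypothetical stand-in for YARN's NodeId: a host plus a port. */
    static final class NodeId {
        final String host;
        final int port;

        NodeId(String host, int port) {
            this.host = host;
            this.port = port;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof NodeId)) {
                return false;
            }
            NodeId other = (NodeId) o;
            return port == other.port && host.equals(other.host);
        }

        @Override
        public int hashCode() {
            return Objects.hash(host, port);
        }
    }

    /** Size of the lost-node map when it is keyed by host name only. */
    static int sizeKeyedByHost(NodeId... lostNodes) {
        ConcurrentMap<String, NodeId> inactive = new ConcurrentHashMap<>();
        for (NodeId id : lostNodes) {
            // A second NM on the same host overwrites the first entry.
            inactive.put(id.host, id);
        }
        return inactive.size();
    }

    /** Size of the lost-node map when it is keyed by host and port. */
    static int sizeKeyedByNodeId(NodeId... lostNodes) {
        ConcurrentMap<NodeId, NodeId> inactive = new ConcurrentHashMap<>();
        for (NodeId id : lostNodes) {
            // Different ports give different keys, so both entries survive.
            inactive.put(id, id);
        }
        return inactive.size();
    }

    public static void main(String[] args) {
        NodeId first = new NodeId("host1", 10001);
        NodeId second = new NodeId("host1", 10002);
        // Two NMs were lost, so a "lost nodes" counter would read 2.
        System.out.println("keyed by host:   " + sizeKeyedByHost(first, second));   // prints 1
        System.out.println("keyed by NodeId: " + sizeKeyedByNodeId(first, second)); // prints 2
    }
}
```

With the host-only key the list length (1) falls behind the lost-node counter
(2), which is the symptom reported above; keying by host and port keeps the
two in step.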



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3266) RMContext inactiveNodes should have NodeId as map key

2015-04-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14496306#comment-14496306
 ] 

Hudson commented on YARN-3266:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #165 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/165/])
YARN-3266. RMContext#inactiveNodes should have NodeId as map key. Contributed 
by Chengbing Liu (jianhe: rev b46ee1e7a31007985b88072d9af3d97c33a261a7)
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebApp.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMActiveServiceContext.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMContext.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMContextImpl.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesNodes.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMNodeTransitions.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java
* hadoop-yarn-project/CHANGES.txt




[jira] [Commented] (YARN-3266) RMContext inactiveNodes should have NodeId as map key

2015-04-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14496077#comment-14496077
 ] 

Hudson commented on YARN-3266:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #155 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/155/])
YARN-3266. RMContext#inactiveNodes should have NodeId as map key. Contributed 
by Chengbing Liu (jianhe: rev b46ee1e7a31007985b88072d9af3d97c33a261a7)
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMActiveServiceContext.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMContextImpl.java
* hadoop-yarn-project/CHANGES.txt
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesNodes.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMNodeTransitions.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMContext.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebApp.java




[jira] [Commented] (YARN-3266) RMContext inactiveNodes should have NodeId as map key

2015-04-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14496070#comment-14496070
 ] 

Hudson commented on YARN-3266:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2096 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2096/])
YARN-3266. RMContext#inactiveNodes should have NodeId as map key. Contributed 
by Chengbing Liu (jianhe: rev b46ee1e7a31007985b88072d9af3d97c33a261a7)
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMContextImpl.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebApp.java
* hadoop-yarn-project/CHANGES.txt
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMNodeTransitions.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMContext.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMActiveServiceContext.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesNodes.java




[jira] [Commented] (YARN-3266) RMContext inactiveNodes should have NodeId as map key

2015-04-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14496025#comment-14496025
 ] 

Hudson commented on YARN-3266:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #898 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/898/])
YARN-3266. RMContext#inactiveNodes should have NodeId as map key. Contributed 
by Chengbing Liu (jianhe: rev b46ee1e7a31007985b88072d9af3d97c33a261a7)
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebApp.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesNodes.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMContextImpl.java
* hadoop-yarn-project/CHANGES.txt
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMNodeTransitions.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMContext.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMActiveServiceContext.java




[jira] [Commented] (YARN-3266) RMContext inactiveNodes should have NodeId as map key

2015-04-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14496017#comment-14496017
 ] 

Hudson commented on YARN-3266:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #164 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/164/])
YARN-3266. RMContext#inactiveNodes should have NodeId as map key. Contributed 
by Chengbing Liu (jianhe: rev b46ee1e7a31007985b88072d9af3d97c33a261a7)
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesNodes.java
* hadoop-yarn-project/CHANGES.txt
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMContextImpl.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMContext.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMActiveServiceContext.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMNodeTransitions.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebApp.java




[jira] [Commented] (YARN-3266) RMContext inactiveNodes should have NodeId as map key

2015-04-14 Thread Chengbing Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14495543#comment-14495543
 ] 

Chengbing Liu commented on YARN-3266:
-

Thanks [~jianhe] for reviewing and committing!



[jira] [Commented] (YARN-3266) RMContext inactiveNodes should have NodeId as map key

2015-04-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14494569#comment-14494569
 ] 

Hudson commented on YARN-3266:
--

FAILURE: Integrated in Hadoop-trunk-Commit #7584 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/7584/])
YARN-3266. RMContext#inactiveNodes should have NodeId as map key. Contributed 
by Chengbing Liu (jianhe: rev b46ee1e7a31007985b88072d9af3d97c33a261a7)
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMActiveServiceContext.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebApp.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMContext.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMNodeTransitions.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMContextImpl.java
* hadoop-yarn-project/CHANGES.txt
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesNodes.java




[jira] [Commented] (YARN-3266) RMContext inactiveNodes should have NodeId as map key

2015-04-14 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14494483#comment-14494483
 ] 

Jian He commented on YARN-3266:
---

+1 






[jira] [Commented] (YARN-3266) RMContext inactiveNodes should have NodeId as map key

2015-04-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14494156#comment-14494156
 ] 

Hadoop QA commented on YARN-3266:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12725212/YARN-3266.03.patch
  against trunk revision b5a0b24.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/7329//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7329//console

This message is automatically generated.






[jira] [Commented] (YARN-3266) RMContext inactiveNodes should have NodeId as map key

2015-04-14 Thread Chengbing Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14493847#comment-14493847
 ] 

Chengbing Liu commented on YARN-3266:
-

I agree with you. Will upload a new version with NodeId as key.

> RMContext inactiveNodes should have NodeId as map key
> -
>
> Key: YARN-3266
> URL: https://issues.apache.org/jira/browse/YARN-3266
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.6.0
>Reporter: Chengbing Liu
>Assignee: Chengbing Liu
> Attachments: YARN-3266.01.patch, YARN-3266.02.patch
>
>
> Under the default NM port configuration, which is 0, we have observed in the 
> current version, "lost nodes" count is greater than the length of the lost 
> node list. This will happen when we consecutively restart the same NM twice:
> * NM started at port 10001
> * NM restarted at port 10002
> * NM restarted at port 10003
> * NM:10001 timeout, {{ClusterMetrics#incrNumLostNMs()}}, # lost node=1; 
> {{rmNode.context.getInactiveRMNodes().put(rmNode.nodeId.getHost(), rmNode)}}, 
> {{inactiveNodes}} has 1 element
> * NM:10002 timeout, {{ClusterMetrics#incrNumLostNMs()}}, # lost node=2; 
> {{rmNode.context.getInactiveRMNodes().put(rmNode.nodeId.getHost(), rmNode)}}, 
> {{inactiveNodes}} still has 1 element
> Since we allow multiple NodeManagers on one host (as discussed in YARN-1888), 
> {{inactiveNodes}} should be of type {{ConcurrentMap<NodeId, RMNode>}}. If 
> this would break the current API, then the key string should include the NM's 
> port as well.
> Thoughts?
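
The mismatch described above can be reproduced in isolation. The following is a minimal sketch, not the ResourceManager code: NodeId here is a hypothetical stand-in for YARN's NodeId (a host/port pair), the map values are the ids themselves rather than RMNode objects, and the class and method names are made up for illustration.

```java
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class InactiveNodesKeyDemo {

  /** Hypothetical stand-in for YARN's NodeId: a (host, port) pair with value equality. */
  static final class NodeId {
    final String host;
    final int port;

    NodeId(String host, int port) {
      this.host = host;
      this.port = port;
    }

    @Override
    public boolean equals(Object o) {
      if (!(o instanceof NodeId)) {
        return false;
      }
      NodeId other = (NodeId) o;
      return port == other.port && host.equals(other.host);
    }

    @Override
    public int hashCode() {
      return Objects.hash(host, port);
    }
  }

  /** Map size after each NM instance on the same host times out, keyed by host only. */
  static int sizeKeyedByHost(String host, int... ports) {
    ConcurrentMap<String, NodeId> inactive = new ConcurrentHashMap<>();
    for (int port : ports) {
      NodeId id = new NodeId(host, port);
      inactive.put(id.host, id); // same host: the previous entry is overwritten
    }
    return inactive.size();
  }

  /** Same sequence of timeouts, but keyed by the full (host, port) id. */
  static int sizeKeyedByNodeId(String host, int... ports) {
    ConcurrentMap<NodeId, NodeId> inactive = new ConcurrentHashMap<>();
    for (int port : ports) {
      NodeId id = new NodeId(host, port);
      inactive.put(id, id); // distinct ports: one entry per NM instance
    }
    return inactive.size();
  }

  public static void main(String[] args) {
    // NM:10001 and NM:10002 both time out, so the lost-node counter would be 2.
    System.out.println("keyed by host:   " + sizeKeyedByHost("host1", 10001, 10002));
    System.out.println("keyed by NodeId: " + sizeKeyedByNodeId("host1", 10001, 10002));
  }
}
```

With the host as key the map holds one entry after two timeouts, while a counter incremented once per timeout would read 2; with the (host, port) id as key the two stay in step.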





[jira] [Commented] (YARN-3266) RMContext inactiveNodes should have NodeId as map key

2015-04-13 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14493506#comment-14493506
 ] 

Jian He commented on YARN-3266:
---

Since it's internal to the RM only, I think it's OK. What is your opinion?



[jira] [Commented] (YARN-3266) RMContext inactiveNodes should have NodeId as map key

2015-04-13 Thread Chengbing Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14493414#comment-14493414
 ] 

Chengbing Liu commented on YARN-3266:
-

{quote}
Given that "RMContext#getRMNodes" uses NodeId, do you think 
"RMContext#getInactiveRMNodes" can use NodeId as the key too, for consistency?
{quote}
To me the only concern is that this will change the {{RMContext}} interface. Do 
you think that's OK?



[jira] [Commented] (YARN-3266) RMContext inactiveNodes should have NodeId as map key

2015-04-13 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14493063#comment-14493063
 ] 

Jian He commented on YARN-3266:
---

Given that "RMContext#getRMNodes" uses NodeId, do you think 
"RMContext#getInactiveRMNodes" can use NodeId as the key too, for consistency?



[jira] [Commented] (YARN-3266) RMContext inactiveNodes should have NodeId as map key

2015-04-13 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14493033#comment-14493033
 ] 

Jian He commented on YARN-3266:
---

lgtm,  +1



[jira] [Commented] (YARN-3266) RMContext inactiveNodes should have NodeId as map key

2015-04-09 Thread Chengbing Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14487265#comment-14487265
 ] 

Chengbing Liu commented on YARN-3266:
-

[~jianhe], would you like to take a look at this? Thanks!



[jira] [Commented] (YARN-3266) RMContext inactiveNodes should have NodeId as map key

2015-02-26 Thread Chengbing Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14339657#comment-14339657
 ] 

Chengbing Liu commented on YARN-3266:
-

The findbugs warnings are unrelated, caused by YARN-3181 and handled by 
YARN-3204.



[jira] [Commented] (YARN-3266) RMContext inactiveNodes should have NodeId as map key

2015-02-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14338318#comment-14338318
 ] 

Hadoop QA commented on YARN-3266:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12701022/YARN-3266.02.patch
  against trunk revision 0d4296f.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 5 new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/6755//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/6755//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6755//console

This message is automatically generated.



[jira] [Commented] (YARN-3266) RMContext inactiveNodes should have NodeId as map key

2015-02-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14338232#comment-14338232
 ] 

Hadoop QA commented on YARN-3266:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12701009/YARN-3266.01.patch
  against trunk revision 166eecf.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 5 new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/6754//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/6754//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6754//console

This message is automatically generated.



[jira] [Commented] (YARN-3266) RMContext inactiveNodes should have NodeId as map key

2015-02-26 Thread Chengbing Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14338161#comment-14338161
 ] 

Chengbing Liu commented on YARN-3266:
-

Uploaded a patch; taking over.

> RMContext inactiveNodes should have NodeId as map key
> -
>
> Key: YARN-3266
> URL: https://issues.apache.org/jira/browse/YARN-3266
> Project: Hadoop YARN
> Issue Type: Bug
> Components: resourcemanager
> Affects Versions: 2.6.0
> Reporter: Chengbing Liu
> Assignee: Chengbing Liu
> Attachments: YARN-3266.01.patch
>
>
> Under the default NM port configuration, which is 0, we have observed that in the 
> current version the "lost nodes" count can be greater than the length of the lost 
> node list. This happens when we restart the same NM twice in a row:
> * NM started at port 10001
> * NM restarted at port 10002
> * NM restarted at port 10003
> * NM:10001 timeout, {{ClusterMetrics#incrNumLostNMs()}}, # lost node=1; 
> {{rmNode.context.getInactiveRMNodes().put(rmNode.nodeId.getHost(), rmNode)}}, 
> {{inactiveNodes}} has 1 element
> * NM:10002 timeout, {{ClusterMetrics#incrNumLostNMs()}}, # lost node=2; 
> {{rmNode.context.getInactiveRMNodes().put(rmNode.nodeId.getHost(), rmNode)}}, 
> {{inactiveNodes}} still has 1 element
> Since we allow multiple NodeManagers on one host (as discussed in YARN-1888), 
> {{inactiveNodes}} should be of type {{ConcurrentMap<NodeId, RMNode>}}. If 
> this would break the current API, then the key string should include the NM's 
> port as well.
> Thoughts?





[jira] [Commented] (YARN-3266) RMContext inactiveNodes should have NodeId as map key

2015-02-26 Thread Rohith (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14338106#comment-14338106
 ] 

Rohith commented on YARN-3266:
--

bq. the key string should include the NM's port as well
This makes sense to me, rather than changing the API. Taking over now; feel free to 
assign it to yourself if you have already started working on this.
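
For comparison, the string-key alternative quoted above could look roughly like this. It is only a sketch under assumptions: inactiveKey is a hypothetical helper, not an existing YARN method, and the map values are placeholder strings rather than RMNode objects.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class HostPortKeyDemo {

  /** Hypothetical helper: builds a "host:port" key so NMs on one host stay distinct. */
  static String inactiveKey(String host, int port) {
    return host + ":" + port;
  }

  /** Map size after each NM instance on the same host times out, keyed by "host:port". */
  static int sizeAfterTimeouts(String host, int... ports) {
    ConcurrentMap<String, String> inactive = new ConcurrentHashMap<>();
    for (int port : ports) {
      // Placeholder value; the real map would hold the node object.
      inactive.put(inactiveKey(host, port), "node@" + port);
    }
    return inactive.size();
  }

  public static void main(String[] args) {
    System.out.println(inactiveKey("host1", 10001));
    System.out.println(sizeAfterTimeouts("host1", 10001, 10002));
  }
}
```

A string key leaves the map's declared type unchanged, at the cost of every caller having to build the key the same way.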

> RMContext inactiveNodes should have NodeId as map key
> -
>
> Key: YARN-3266
> URL: https://issues.apache.org/jira/browse/YARN-3266
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.6.0
>Reporter: Chengbing Liu
>
> Under the default NM port configuration, which is 0, we have observed that in the 
> current version the "lost nodes" count can be greater than the length of the lost 
> node list. This happens when we restart the same NM twice in a row:
> * NM started at port 10001
> * NM restarted at port 10002
> * NM restarted at port 10003
> * NM:10001 timeout, {{ClusterMetrics#incrNumLostNMs()}}, # lost node=1; 
> {{rmNode.context.getInactiveRMNodes().put(rmNode.nodeId.getHost(), rmNode)}}, 
> {{inactiveNodes}} has 1 element
> * NM:10002 timeout, {{ClusterMetrics#incrNumLostNMs()}}, # lost node=2; 
> {{rmNode.context.getInactiveRMNodes().put(rmNode.nodeId.getHost(), rmNode)}}, 
> {{inactiveNodes}} still has 1 element
> Since we allow multiple NodeManagers on one host (as discussed in YARN-1888), 
> {{inactiveNodes}} should be of type {{ConcurrentMap<NodeId, RMNode>}}. If 
> this would break the current API, then the key string should include the NM's 
> port as well.
> Thoughts?


