[jira] [Commented] (YARN-4454) NM to nodelabel mapping going wrong after RM restart

2015-12-21 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15066945#comment-15066945
 ] 

Wangda Tan commented on YARN-4454:
--

Looks good, +1. thanks [~bibinchundatt]! Committing.. 

> NM to nodelabel mapping going wrong after RM restart
> 
>
> Key: YARN-4454
> URL: https://issues.apache.org/jira/browse/YARN-4454
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>Priority: Critical
> Attachments: 0001-YARN-4454.patch, test.patch
>
>
> *Nodelabel mapping with NodeManager  is going wrong if combination of 
> hostname and then NodeId is used to update nodelabel mapping*
> *Steps to reproduce*
> 1.Create cluster with 2 NM
> 2.Add label X,Y to cluster
> 3.replace  Label of node  1 using ,x
> 4.replace label for node 1 by ,y
> 5.Again replace label of node 1 by ,x
> Check cluster label mapping HOSTNAME1 will be mapped with X 
> Now restart RM 2 times NODE LABEL mapping of HOSTNAME1:PORT changes to Y
> {noformat}
> 2015-12-14 17:17:54,901 INFO 
> org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager: Add labels: 
> 

[jira] [Commented] (YARN-4454) NM to nodelabel mapping going wrong after RM restart

2015-12-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15066987#comment-15066987
 ] 

Hudson commented on YARN-4454:
--

FAILURE: Integrated in Hadoop-trunk-Commit #9009 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/9009/])
YARN-4454. NM to nodelabel mapping going wrong after RM restart. (Bibin 
(wangda: rev bc038b382cb2ce561ce718405fbcee4382f3b204)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMRestart.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/nodelabels/CommonNodeLabelsManager.java


> NM to nodelabel mapping going wrong after RM restart
> 
>
> Key: YARN-4454
> URL: https://issues.apache.org/jira/browse/YARN-4454
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>Priority: Critical
> Fix For: 2.8.0
>
> Attachments: 0001-YARN-4454.patch, test.patch
>
>
> *Nodelabel mapping with NodeManager  is going wrong if combination of 
> hostname and then NodeId is used to update nodelabel mapping*
> *Steps to reproduce*
> 1.Create cluster with 2 NM
> 2.Add label X,Y to cluster
> 3.replace  Label of node  1 using ,x
> 4.replace label for node 1 by ,y
> 5.Again replace label of node 1 by ,x
> Check cluster label mapping HOSTNAME1 will be mapped with X 
> Now restart RM 2 times NODE LABEL mapping of HOSTNAME1:PORT changes to Y
> {noformat}
> 2015-12-14 17:17:54,901 INFO 
> org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager: Add labels: 
> 

[jira] [Commented] (YARN-4454) NM to nodelabel mapping going wrong after RM restart

2015-12-21 Thread Bibin A Chundatt (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15067347#comment-15067347
 ] 

Bibin A Chundatt commented on YARN-4454:


[~leftnoteasy] Thank you for review and commit.

> NM to nodelabel mapping going wrong after RM restart
> 
>
> Key: YARN-4454
> URL: https://issues.apache.org/jira/browse/YARN-4454
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>Priority: Critical
> Fix For: 2.8.0
>
> Attachments: 0001-YARN-4454.patch, test.patch
>
>
> *Nodelabel mapping with NodeManager  is going wrong if combination of 
> hostname and then NodeId is used to update nodelabel mapping*
> *Steps to reproduce*
> 1.Create cluster with 2 NM
> 2.Add label X,Y to cluster
> 3.replace  Label of node  1 using ,x
> 4.replace label for node 1 by ,y
> 5.Again replace label of node 1 by ,x
> Check cluster label mapping HOSTNAME1 will be mapped with X 
> Now restart RM 2 times NODE LABEL mapping of HOSTNAME1:PORT changes to Y
> {noformat}
> 2015-12-14 17:17:54,901 INFO 
> org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager: Add labels: 
> 

[jira] [Commented] (YARN-4454) NM to nodelabel mapping going wrong after RM restart

2015-12-20 Thread Bibin A Chundatt (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15066013#comment-15066013
 ] 

Bibin A Chundatt commented on YARN-4454:


Test failures are already tracked as part of umbrella JIRA YARN-4474

> NM to nodelabel mapping going wrong after RM restart
> 
>
> Key: YARN-4454
> URL: https://issues.apache.org/jira/browse/YARN-4454
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>Priority: Critical
> Attachments: 0001-YARN-4454.patch, test.patch
>
>
> *Nodelabel mapping with NodeManager  is going wrong if combination of 
> hostname and then NodeId is used to update nodelabel mapping*
> *Steps to reproduce*
> 1.Create cluster with 2 NM
> 2.Add label X,Y to cluster
> 3.replace  Label of node  1 using ,x
> 4.replace label for node 1 by ,y
> 5.Again replace label of node 1 by ,x
> Check cluster label mapping HOSTNAME1 will be mapped with X 
> Now restart RM 2 times NODE LABEL mapping of HOSTNAME1:PORT changes to Y
> {noformat}
> 2015-12-14 17:17:54,901 INFO 
> org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager: Add labels: 
> 

[jira] [Commented] (YARN-4454) NM to nodelabel mapping going wrong after RM restart

2015-12-20 Thread Bibin A Chundatt (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15066015#comment-15066015
 ] 

Bibin A Chundatt commented on YARN-4454:


Sorry i mentioned wrong jira ID its YARN-4478

> NM to nodelabel mapping going wrong after RM restart
> 
>
> Key: YARN-4454
> URL: https://issues.apache.org/jira/browse/YARN-4454
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>Priority: Critical
> Attachments: 0001-YARN-4454.patch, test.patch
>
>
> *Nodelabel mapping with NodeManager  is going wrong if combination of 
> hostname and then NodeId is used to update nodelabel mapping*
> *Steps to reproduce*
> 1.Create cluster with 2 NM
> 2.Add label X,Y to cluster
> 3.replace  Label of node  1 using ,x
> 4.replace label for node 1 by ,y
> 5.Again replace label of node 1 by ,x
> Check cluster label mapping HOSTNAME1 will be mapped with X 
> Now restart RM 2 times NODE LABEL mapping of HOSTNAME1:PORT changes to Y
> {noformat}
> 2015-12-14 17:17:54,901 INFO 
> org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager: Add labels: 
> 

[jira] [Commented] (YARN-4454) NM to nodelabel mapping going wrong after RM restart

2015-12-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15065846#comment-15065846
 ] 

Hadoop QA commented on YARN-4454:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 9m 
10s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 54s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 28s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
3s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 47s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
46s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 
40s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 18s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 25s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
46s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 10m 
13s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 10m 13s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 10m 7s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 10m 7s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
3s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 44s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
44s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 
49s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 9s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 23s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 59s 
{color} | {color:green} hadoop-yarn-common in the patch passed with JDK 
v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 66m 59s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 9m 29s {color} 
| {color:red} hadoop-mapreduce-client-app in the patch failed with JDK 
v1.8.0_66. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 16s 
{color} | {color:green} hadoop-yarn-common in the patch passed with JDK 
v1.7.0_91. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 66m 42s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.7.0_91. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 10m 9s {color} 
| {color:red} hadoop-mapreduce-client-app in the patch failed with JDK 
v1.7.0_91. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 24s 
{color} | {color:red} Patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 228m 58s {color} 
| 

[jira] [Commented] (YARN-4454) NM to nodelabel mapping going wrong after RM restart

2015-12-18 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15064513#comment-15064513
 ] 

Wangda Tan commented on YARN-4454:
--

[~bibinchundatt], thanks for reporting and looking at the issue. 

The root cause of this issue is, when the RM restart first time, it will 
generate a mirror file which has a complete node->label mappings:
{code}
node1:port=x 
node1=y
{code}

And when we restart the RM again, we will load the mapping, but node1:port 
loaded first, so node1=y will overwrite the previous one.

In: 
{{org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager#checkReplaceLabelsOnNode}}

Instead of directly iterate the map:
{code}
for (Entry entry : replaceLabelsToNode.entrySet()) {
  NodeId nodeId = entry.getKey();
{code}
We should sort the map so that the node without port should be handled first 
before node with port specified to avoid overwriting happens.

Is it make sense to you?

> NM to nodelabel mapping going wrong after RM restart
> 
>
> Key: YARN-4454
> URL: https://issues.apache.org/jira/browse/YARN-4454
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>Priority: Critical
> Attachments: test.patch
>
>
> *Nodelabel mapping with NodeManager  is going wrong if combination of 
> hostname and then NodeId is used to update nodelabel mapping*
> *Steps to reproduce*
> 1.Create cluster with 2 NM
> 2.Add label X,Y to cluster
> 3.replace  Label of node  1 using ,x
> 4.replace label for node 1 by ,y
> 5.Again replace label of node 1 by ,x
> Check cluster label mapping HOSTNAME1 will be mapped with X 
> Now restart RM 2 times NODE LABEL mapping of HOSTNAME1:PORT changes to Y
> {noformat}
> 2015-12-14 17:17:54,901 INFO 
> org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager: Add labels: 
> 

[jira] [Commented] (YARN-4454) NM to nodelabel mapping going wrong after RM restart

2015-12-16 Thread Bibin A Chundatt (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15061476#comment-15061476
 ] 

Bibin A Chundatt commented on YARN-4454:


On 2nd time recovery the ordering is going wrong 

{noformat}
2015-12-14 17:17:54,906 INFO 
org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager:   
NM=host-10-19-92-188:64318, labels=[ResourcePool_1]
2015-12-14 17:17:54,906 INFO 
org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager:   
NM=host-10-19-92-188:0, labels=[ResourcePool_null]{noformat}




> NM to nodelabel mapping going wrong after RM restart
> 
>
> Key: YARN-4454
> URL: https://issues.apache.org/jira/browse/YARN-4454
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>Priority: Critical
>
> *Steps to reproduce*
> 1.Create cluster with 2 NM
> 2.Add label X,Y to cluster
> 3.replace  Label of node  1 using ,x
> 4.replace label for node 1 by ,y
> 5.Again replace label of node 1 by ,x
> Check cluster label mapping HOSTNAME1 will be mapped with X 
> Now restart RM 2 times NODE LABEL mapping of HOSTNAME1:PORT changes to Y
> {noformat}
> 2015-12-14 17:17:54,901 INFO 
> org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager: Add labels: 
>