[jira] [Commented] (YARN-4454) NM to nodelabel mapping going wrong after RM restart
[ https://issues.apache.org/jira/browse/YARN-4454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15066945#comment-15066945 ]

Wangda Tan commented on YARN-4454:
----------------------------------

Looks good, +1. Thanks [~bibinchundatt]! Committing.

> NM to nodelabel mapping going wrong after RM restart
> ----------------------------------------------------
>
>                 Key: YARN-4454
>                 URL: https://issues.apache.org/jira/browse/YARN-4454
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Bibin A Chundatt
>            Assignee: Bibin A Chundatt
>            Priority: Critical
>         Attachments: 0001-YARN-4454.patch, test.patch
>
>
> *Nodelabel mapping with NodeManager is going wrong if combination of hostname and then NodeId is used to update nodelabel mapping*
> *Steps to reproduce*
> 1. Create cluster with 2 NM
> 2. Add label X,Y to cluster
> 3. Replace label of node 1 using ,x
> 4. Replace label for node 1 by ,y
> 5. Again replace label of node 1 by ,x
> Check cluster label mapping: HOSTNAME1 will be mapped with X
> Now restart RM 2 times: NODE LABEL mapping of HOSTNAME1:PORT changes to Y
> {noformat}
> 2015-12-14 17:17:54,901 INFO org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager: Add labels:
>
[jira] [Commented] (YARN-4454) NM to nodelabel mapping going wrong after RM restart
[ https://issues.apache.org/jira/browse/YARN-4454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15066987#comment-15066987 ]

Hudson commented on YARN-4454:
------------------------------

FAILURE: Integrated in Hadoop-trunk-Commit #9009 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/9009/])
YARN-4454. NM to nodelabel mapping going wrong after RM restart. (Bibin (wangda: rev bc038b382cb2ce561ce718405fbcee4382f3b204)
* hadoop-yarn-project/CHANGES.txt
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMRestart.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/nodelabels/CommonNodeLabelsManager.java

> NM to nodelabel mapping going wrong after RM restart
> ----------------------------------------------------
>
>                 Key: YARN-4454
>                 URL: https://issues.apache.org/jira/browse/YARN-4454
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Bibin A Chundatt
>            Assignee: Bibin A Chundatt
>            Priority: Critical
>             Fix For: 2.8.0
>         Attachments: 0001-YARN-4454.patch, test.patch
[jira] [Commented] (YARN-4454) NM to nodelabel mapping going wrong after RM restart
[ https://issues.apache.org/jira/browse/YARN-4454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15067347#comment-15067347 ]

Bibin A Chundatt commented on YARN-4454:
----------------------------------------

[~leftnoteasy] Thank you for the review and commit.
[jira] [Commented] (YARN-4454) NM to nodelabel mapping going wrong after RM restart
[ https://issues.apache.org/jira/browse/YARN-4454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15066013#comment-15066013 ]

Bibin A Chundatt commented on YARN-4454:
----------------------------------------

Test failures are already tracked as part of umbrella JIRA YARN-4474
[jira] [Commented] (YARN-4454) NM to nodelabel mapping going wrong after RM restart
[ https://issues.apache.org/jira/browse/YARN-4454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15066015#comment-15066015 ]

Bibin A Chundatt commented on YARN-4454:
----------------------------------------

Sorry, I mentioned the wrong JIRA ID; it's YARN-4478.
[jira] [Commented] (YARN-4454) NM to nodelabel mapping going wrong after RM restart
[ https://issues.apache.org/jira/browse/YARN-4454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15065846#comment-15065846 ]

Hadoop QA commented on YARN-4454:
---------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 0s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
| +1 | mvninstall | 9m 10s | trunk passed |
| +1 | compile | 8m 54s | trunk passed with JDK v1.8.0_66 |
| +1 | compile | 9m 28s | trunk passed with JDK v1.7.0_91 |
| +1 | checkstyle | 1m 3s | trunk passed |
| +1 | mvnsite | 1m 47s | trunk passed |
| +1 | mvneclipse | 0m 46s | trunk passed |
| +1 | findbugs | 3m 40s | trunk passed |
| +1 | javadoc | 1m 18s | trunk passed with JDK v1.8.0_66 |
| +1 | javadoc | 1m 25s | trunk passed with JDK v1.7.0_91 |
| +1 | mvninstall | 1m 46s | the patch passed |
| +1 | compile | 10m 13s | the patch passed with JDK v1.8.0_66 |
| +1 | javac | 10m 13s | the patch passed |
| +1 | compile | 10m 7s | the patch passed with JDK v1.7.0_91 |
| +1 | javac | 10m 7s | the patch passed |
| +1 | checkstyle | 1m 3s | the patch passed |
| +1 | mvnsite | 1m 44s | the patch passed |
| +1 | mvneclipse | 0m 44s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 3m 49s | the patch passed |
| +1 | javadoc | 1m 9s | the patch passed with JDK v1.8.0_66 |
| +1 | javadoc | 1m 23s | the patch passed with JDK v1.7.0_91 |
| +1 | unit | 1m 59s | hadoop-yarn-common in the patch passed with JDK v1.8.0_66. |
| -1 | unit | 66m 59s | hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_66. |
| -1 | unit | 9m 29s | hadoop-mapreduce-client-app in the patch failed with JDK v1.8.0_66. |
| +1 | unit | 2m 16s | hadoop-yarn-common in the patch passed with JDK v1.7.0_91. |
| -1 | unit | 66m 42s | hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_91. |
| -1 | unit | 10m 9s | hadoop-mapreduce-client-app in the patch failed with JDK v1.7.0_91. |
| -1 | asflicense | 0m 24s | Patch generated 1 ASF License warnings. |
| | | 228m 58s | |
[jira] [Commented] (YARN-4454) NM to nodelabel mapping going wrong after RM restart
[ https://issues.apache.org/jira/browse/YARN-4454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15064513#comment-15064513 ]

Wangda Tan commented on YARN-4454:
----------------------------------

[~bibinchundatt], thanks for reporting and looking into the issue.

The root cause is that when the RM restarts for the first time, it generates a mirror file containing the complete node->label mappings:
{code}
node1:port=x
node1=y
{code}
When the RM restarts again, it loads these mappings in file order. node1:port is loaded first, so the host-level entry node1=y is applied afterwards and overwrites the label already set on node1:port.

In {{org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager#checkReplaceLabelsOnNode}}, instead of iterating the map directly:
{code}
for (Entry<NodeId, Set<String>> entry : replaceLabelsToNode.entrySet()) {
  NodeId nodeId = entry.getKey();
{code}
we should sort the map so that nodes without a port are handled before nodes with a port specified, which avoids the overwrite.

Does that make sense to you?
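
The ordering problem described in this comment can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not the committed patch: NodeLabelReplaySketch, NodeKey, replay and hostEntriesFirst are hypothetical names rather than Hadoop classes, port 0 is assumed to mark a host-level entry (as the NM=host-10-19-92-188:0 log line in this thread suggests), the port 64318 is just an example value, and a host-level replace is assumed to overwrite every NM already recorded for that host.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative model only; none of these names are Hadoop classes.
class NodeLabelReplaySketch {

  // Assumption: port 0 marks a host-level entry that covers all NMs on the host.
  static final int WILDCARD_PORT = 0;

  static final class NodeKey {
    final String host;
    final int port;

    NodeKey(String host, int port) {
      this.host = host;
      this.port = port;
    }

    @Override
    public boolean equals(Object o) {
      if (!(o instanceof NodeKey)) {
        return false;
      }
      NodeKey other = (NodeKey) o;
      return host.equals(other.host) && port == other.port;
    }

    @Override
    public int hashCode() {
      return 31 * host.hashCode() + port;
    }
  }

  // Applies the mappings in list order and returns the resulting label table.
  static Map<NodeKey, String> replay(List<Map.Entry<NodeKey, String>> entries) {
    Map<NodeKey, String> state = new HashMap<>();
    for (Map.Entry<NodeKey, String> e : entries) {
      NodeKey node = e.getKey();
      if (node.port == WILDCARD_PORT) {
        // Host-level replace: overwrite every NM already recorded for this host.
        for (Map.Entry<NodeKey, String> s : state.entrySet()) {
          if (s.getKey().host.equals(node.host)) {
            s.setValue(e.getValue());
          }
        }
      }
      state.put(node, e.getValue());
    }
    return state;
  }

  // Stable sort: host-level entries move to the front, the rest keep their order.
  static List<Map.Entry<NodeKey, String>> hostEntriesFirst(Map<NodeKey, String> mirror) {
    List<Map.Entry<NodeKey, String>> sorted = new ArrayList<>(mirror.entrySet());
    sorted.sort((a, b) -> Boolean.compare(
        a.getKey().port != WILDCARD_PORT, b.getKey().port != WILDCARD_PORT));
    return sorted;
  }

  public static void main(String[] args) {
    // Mirror-file order from the comment: node1:port=x first, then node1=y.
    Map<NodeKey, String> mirror = new LinkedHashMap<>();
    NodeKey nm = new NodeKey("node1", 64318);
    mirror.put(nm, "x");
    mirror.put(new NodeKey("node1", WILDCARD_PORT), "y");

    String fileOrder = replay(new ArrayList<>(mirror.entrySet())).get(nm);
    String sortedOrder = replay(hostEntriesFirst(mirror)).get(nm);
    System.out.println("file order:   node1:64318 -> " + fileOrder);
    System.out.println("sorted order: node1:64318 -> " + sortedOrder);
  }
}
```

In this model, replaying in file order leaves node1:64318 with label y, while replaying host-level entries first leaves it with x, which matches the behaviour reported in the issue and the ordering proposed above.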
[jira] [Commented] (YARN-4454) NM to nodelabel mapping going wrong after RM restart
[ https://issues.apache.org/jira/browse/YARN-4454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15061476#comment-15061476 ]

Bibin A Chundatt commented on YARN-4454:
----------------------------------------

On the second recovery the ordering goes wrong:
{noformat}
2015-12-14 17:17:54,906 INFO org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager: NM=host-10-19-92-188:64318, labels=[ResourcePool_1]
2015-12-14 17:17:54,906 INFO org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager: NM=host-10-19-92-188:0, labels=[ResourcePool_null]
{noformat}