[jira] [Commented] (YARN-1350) Should not add Lost Node by NodeManager reboot
[ https://issues.apache.org/jira/browse/YARN-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13805825#comment-13805825 ]

Omkar Vinit Joshi commented on YARN-1350:
-----------------------------------------

[~sinchii] I have a basic question: why is your nodeId changing every time? Have you configured your NodeManager with an ephemeral port (0)? What is NM_ADDRESS? The RM will consider this the same node only when your newly restarted NodeManager reports with the same node id, i.e. host-name:port.

Should not add Lost Node by NodeManager reboot
----------------------------------------------

                Key: YARN-1350
                URL: https://issues.apache.org/jira/browse/YARN-1350
            Project: Hadoop YARN
         Issue Type: Bug
         Components: nodemanager
   Affects Versions: 3.0.0
           Reporter: Shinichi Yamashita
        Attachments: NodeState.txt

In current trunk, when a NodeManager reboots, the node information from before the reboot is treated as LOST. This happens because only the Inactive node information is checked at the time of reboot. As a result, a Lost Node exists even if the NodeManager is working on all nodes. We should change this so that a NodeManager reboot does not register a Lost Node.

--
This message was sent by Atlassian JIRA (v6.1#6144)
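The point about node identity can be illustrated with a small toy model. This is only a sketch, not the ResourceManager's actual code: the class ToyResourceManager, the host name worker1, and all port numbers are hypothetical. It assumes only what the comment states, namely that a node is identified by host-name:port.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class NodeId:
    """A node identity in the host-name:port form described in the comment."""
    host: str
    port: int


@dataclass
class ToyResourceManager:
    """Hypothetical, heavily simplified stand-in for the RM's node tracking."""
    active: set = field(default_factory=set)
    lost: set = field(default_factory=set)

    def register(self, node_id: NodeId) -> None:
        # A node that registers again under the same id is the same node.
        self.lost.discard(node_id)
        self.active.add(node_id)

    def expire(self, node_id: NodeId) -> None:
        # An id that stops heartbeating is moved to the lost set.
        if node_id in self.active:
            self.active.remove(node_id)
            self.lost.add(node_id)


def simulate_restart(old_port: int, new_port: int) -> ToyResourceManager:
    rm = ToyResourceManager()
    old = NodeId("worker1", old_port)
    new = NodeId("worker1", new_port)
    rm.register(old)   # first start
    rm.register(new)   # restart
    if old != new:
        # Nothing will ever heartbeat under the old id again.
        rm.expire(old)
    return rm


fixed = simulate_restart(45454, 45454)      # fixed port: same id after restart
ephemeral = simulate_restart(39011, 41207)  # port 0: OS picked a new port

print("lost nodes with fixed port:", len(fixed.lost))
print("lost nodes with ephemeral port:", len(ephemeral.lost))
```

In this model the fixed-port restart leaves no lost entry, while the ephemeral-port restart leaves the old host:port behind as lost even though the machine is healthy.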
[jira] [Commented] (YARN-1350) Should not add Lost Node by NodeManager reboot
[ https://issues.apache.org/jira/browse/YARN-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13805862#comment-13805862 ]

Shinichi Yamashita commented on YARN-1350:
------------------------------------------

In other words, you are saying that I will not have any problem if I set a fixed port number in the yarn.nodemanager.address property, and that this problem will then certainly not occur. But in that case, yarn-default.xml should set an appropriate fixed default port number, as it does for yarn.resourcemanager.address. Why is the default 0?
[jira] [Commented] (YARN-1350) Should not add Lost Node by NodeManager reboot
[ https://issues.apache.org/jira/browse/YARN-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13805867#comment-13805867 ]

Omkar Vinit Joshi commented on YARN-1350:
-----------------------------------------

That is mainly for single-node clusters, to avoid port clashes. For a real cluster you should define a port there. If you agree, I will close this as invalid.
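As a rough illustration of "define a port there", the sketch below checks a yarn-site.xml-style document for the port in yarn.nodemanager.address. The property name comes from this thread; the sample value 0.0.0.0:45454, the fallback string 0.0.0.0:0 (standing in for the shipped default, which the thread says uses port 0), and the helper names are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical site configuration with a fixed NodeManager port.
SAMPLE_SITE_XML = """<configuration>
  <property>
    <name>yarn.nodemanager.address</name>
    <value>0.0.0.0:45454</value>
  </property>
</configuration>"""

# Assumed stand-in for the default; the thread only confirms the port is 0.
ASSUMED_DEFAULT = "0.0.0.0:0"


def nodemanager_port(xml_text: str, default: str = ASSUMED_DEFAULT) -> int:
    """Return the port configured for yarn.nodemanager.address."""
    root = ET.fromstring(xml_text)
    value = default
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        if name == "yarn.nodemanager.address":
            value = (prop.findtext("value") or "").strip()
    # The address has the form host:port; take the part after the last colon.
    return int(value.rsplit(":", 1)[1])


def uses_ephemeral_port(xml_text: str) -> bool:
    """True if the NodeManager would get a new port on every restart."""
    return nodemanager_port(xml_text) == 0


print(uses_ephemeral_port(SAMPLE_SITE_XML))
print(uses_ephemeral_port("<configuration/>"))
```

A check like this could run before a real cluster is deployed, so that a site file leaving the port at 0 is caught early.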
[jira] [Commented] (YARN-1350) Should not add Lost Node by NodeManager reboot
[ https://issues.apache.org/jira/browse/YARN-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13805900#comment-13805900 ]

Shinichi Yamashita commented on YARN-1350:
------------------------------------------

Why would ports clash? For example, are multiple NodeManagers run on one server?

                Key: YARN-1350
           Assignee: Omkar Vinit Joshi
[jira] [Commented] (YARN-1350) Should not add Lost Node by NodeManager reboot
[ https://issues.apache.org/jira/browse/YARN-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13805908#comment-13805908 ]

Omkar Vinit Joshi commented on YARN-1350:
-----------------------------------------

You should check out MiniYARNCluster.
[jira] [Commented] (YARN-1350) Should not add Lost Node by NodeManager reboot
[ https://issues.apache.org/jira/browse/YARN-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13805921#comment-13805921 ]

Akira AJISAKA commented on YARN-1350:
-------------------------------------

IMO, there are two options for avoiding this problem:
* Document that the port number should be fixed in a real cluster.
* Change yarn-default.xml to use a fixed port number, and change MiniYARNCluster to use 0.
[jira] [Commented] (YARN-1350) Should not add Lost Node by NodeManager reboot
[ https://issues.apache.org/jira/browse/YARN-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13805932#comment-13805932 ]

Shinichi Yamashita commented on YARN-1350:
------------------------------------------

Thank you for the additional information. I understand that the port number is set to 0 so that multiple NodeManagers can be used for tests in MiniYARNCluster. It would be easy to understand if the yarn-default.xml description included a note about setting it in a real cluster.
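The port-clash concern behind the default of 0 can be reproduced with plain sockets. This is a generic sketch of operating-system behaviour, not MiniYARNCluster code: two listeners on one host that both ask for port 0 get distinct ports, while a second listener asking for an already used fixed port fails. The helper name bind_listener is made up for the example.

```python
import socket


def bind_listener(port: int) -> socket.socket:
    """Open a listening TCP socket on the loopback address."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        s.bind(("127.0.0.1", port))
        s.listen(1)
    except OSError:
        s.close()
        raise
    return s


# Two "NodeManagers" on one host, both asking for port 0.
first = bind_listener(0)
second = bind_listener(0)
port_first = first.getsockname()[1]
port_second = second.getsockname()[1]

# A third listener asking for a fixed port that is already in use.
try:
    third = bind_listener(port_first)
    third.close()
    clashed = False
except OSError:
    clashed = True

first.close()
second.close()

print("ephemeral ports differ:", port_first != port_second)
print("fixed port clashed:", clashed)
```

This is the trade-off discussed in the thread: port 0 lets several NodeManagers share one machine in a test, but the port, and with it the node id, changes on every restart.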