[jira] [Commented] (YARN-1350) Should not add Lost Node by NodeManager reboot

2013-10-25 Thread Omkar Vinit Joshi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13805825#comment-13805825
 ] 

Omkar Vinit Joshi commented on YARN-1350:
-----------------------------------------

[~sinchii] I have a basic question: why does your node ID change every time?
Have you configured your NodeManager with an ephemeral port (0)? What is
NM_ADDRESS set to? The RM will treat this as the same node only when your newly
restarted NodeManager reports the same node ID, i.e. host-name:port.
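
To illustrate the point about node IDs, here is a minimal Python sketch. It is
not YARN code: ToyResourceManager, register, expire and restart_scenario are
hypothetical names made up for this illustration, and the port numbers are
arbitrary. It assumes only what the comment above states, namely that a node is
identified by host-name:port, so a NodeManager that comes back on a different
port looks like a new node while the old ID simply stops heartbeating.

```python
class ToyResourceManager:
    """Hypothetical stand-in for the RM's node bookkeeping (not YARN code)."""

    def __init__(self):
        self.active = set()
        self.lost = set()

    def register(self, host, port):
        # The node ID is modelled as "host:port", as described in the comment.
        node_id = f"{host}:{port}"
        self.active.add(node_id)    # re-registering a known ID adds nothing new
        self.lost.discard(node_id)
        return node_id

    def expire(self, node_id):
        """Called when heartbeats from node_id stop arriving."""
        if node_id in self.active:
            self.active.remove(node_id)
            self.lost.add(node_id)


def restart_scenario(port_before, port_after):
    """Register a NodeManager, restart it, and return (active, lost) IDs."""
    rm = ToyResourceManager()
    old_id = rm.register("host1", port_before)
    new_id = rm.register("host1", port_after)
    if new_id != old_id:
        rm.expire(old_id)  # nothing heartbeats under the old ID any more
    return sorted(rm.active), sorted(rm.lost)
```

In this toy model, restart_scenario(45454, 45454) leaves one active node and no
lost nodes, while restart_scenario(40123, 51877) leaves host1:51877 active and
host1:40123 in the lost set, which mirrors the symptom reported in this issue.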

 Should not add Lost Node by NodeManager reboot
 ----------------------------------------------

 Key: YARN-1350
 URL: https://issues.apache.org/jira/browse/YARN-1350
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 3.0.0
Reporter: Shinichi Yamashita
 Attachments: NodeState.txt


 In current trunk, when a NodeManager reboots, the node information from before
 the reboot is treated as LOST.
 This occurs because only the inactive node information is checked at the time
 of reboot.
 As a result, a Lost Node is reported even when the NodeManagers on all nodes
 are working.
 We should change this so that a NodeManager reboot does not register a Lost
 Node.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (YARN-1350) Should not add Lost Node by NodeManager reboot

2013-10-25 Thread Shinichi Yamashita (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13805862#comment-13805862
 ] 

Shinichi Yamashita commented on YARN-1350:
------------------------------------------

In other words, you are saying that I will not have any problem if I fix the
port number in the yarn.nodemanager.address property, and that this problem
will then definitely not occur.
But in that case, yarn-default.xml should set an appropriate fixed default port
number, as it does for yarn.resourcemanager.address. Why is it 0?




[jira] [Commented] (YARN-1350) Should not add Lost Node by NodeManager reboot

2013-10-25 Thread Omkar Vinit Joshi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13805867#comment-13805867
 ] 

Omkar Vinit Joshi commented on YARN-1350:
-----------------------------------------

That is mainly for single-node clusters, to avoid port clashes. For a real
cluster you should define a port there. If you agree, I will close this as
invalid.
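
As a rough illustration of the port-clash argument, the following Python sketch
uses plain TCP sockets on the loopback interface. It is an assumption-laden
stand-in, not NodeManager code: try_bind and port_clash_demo are hypothetical
helper names, and the "fixed port" is just whatever free port the OS hands out
first. It shows that a second process asking for the same fixed port on one
host fails to bind, whereas asking for port 0 gets a different free port.

```python
import socket


def try_bind(port, host="127.0.0.1"):
    """Return a socket bound to (host, port), or None if the bind fails."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
    except OSError:
        sock.close()
        return None
    return sock


def port_clash_demo():
    """Simulate two daemons on one host: same fixed port, then port 0."""
    first = try_bind(0)                   # let the OS pick a free port
    fixed_port = first.getsockname()[1]   # pretend this is the configured port
    clash = try_bind(fixed_port)          # second daemon, same fixed port
    other = try_bind(0)                   # second daemon, ephemeral port
    result = {
        "fixed_port_clashes": clash is None,
        "ephemeral_ports_differ": other is not None
        and other.getsockname()[1] != fixed_port,
    }
    for sock in (first, clash, other):
        if sock is not None:
            sock.close()
    return result
```

On a typical system both values in the returned dictionary are expected to be
True. That is the trade-off discussed here: port 0 lets several daemons share
one host, at the cost of a port (and hence node ID) that is not stable.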



[jira] [Commented] (YARN-1350) Should not add Lost Node by NodeManager reboot

2013-10-25 Thread Shinichi Yamashita (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13805900#comment-13805900
 ] 

Shinichi Yamashita commented on YARN-1350:
------------------------------------------

Why would ports clash? For example, is it used to run multiple NodeManagers on
one server?

 Should not add Lost Node by NodeManager reboot
 ----------------------------------------------

 Key: YARN-1350
 URL: https://issues.apache.org/jira/browse/YARN-1350
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 3.0.0
Reporter: Shinichi Yamashita
Assignee: Omkar Vinit Joshi
 Attachments: NodeState.txt




[jira] [Commented] (YARN-1350) Should not add Lost Node by NodeManager reboot

2013-10-25 Thread Omkar Vinit Joshi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13805908#comment-13805908
 ] 

Omkar Vinit Joshi commented on YARN-1350:
-----------------------------------------

You should check out MiniYARNCluster.



[jira] [Commented] (YARN-1350) Should not add Lost Node by NodeManager reboot

2013-10-25 Thread Akira AJISAKA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13805921#comment-13805921
 ] 

Akira AJISAKA commented on YARN-1350:
-------------------------------------

IMO, there are two options for avoiding this problem.
 * Document that the port number should be fixed in a real cluster.
 * Change yarn-default.xml to use a fixed port number, and change
   MiniYARNCluster to use 0.
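
For the first option, a small check like the following Python sketch could
accompany the documentation. It is only an illustration: nodemanager_port,
has_fixed_port and SAMPLE_YARN_SITE are hypothetical names, and the address
0.0.0.0:45454 is an arbitrary example value, not a recommended setting. The
sketch assumes the usual Hadoop configuration layout of property/name/value
elements and a value of the form host:port, and it treats an unset property as
port 0, as discussed in this thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical yarn-site.xml content; the host and port are arbitrary examples.
SAMPLE_YARN_SITE = """
<configuration>
  <property>
    <name>yarn.nodemanager.address</name>
    <value>0.0.0.0:45454</value>
  </property>
</configuration>
"""


def nodemanager_port(yarn_site_xml, prop="yarn.nodemanager.address"):
    """Return the port configured for prop, or 0 if the property is unset."""
    root = ET.fromstring(yarn_site_xml.strip())
    for entry in root.findall("property"):
        if (entry.findtext("name") or "").strip() == prop:
            value = (entry.findtext("value") or "").strip()
            # Assumes a "host:port" value; the port is the part after the colon.
            return int(value.rsplit(":", 1)[1])
    return 0  # unset: per this thread, the shipped default port is 0


def has_fixed_port(yarn_site_xml):
    """True when the NodeManager address does not use the ephemeral port 0."""
    return nodemanager_port(yarn_site_xml) != 0
```

With the sample above, has_fixed_port(SAMPLE_YARN_SITE) is True, while an empty
configuration element yields False because the default applies.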



[jira] [Commented] (YARN-1350) Should not add Lost Node by NodeManager reboot

2013-10-25 Thread Shinichi Yamashita (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13805932#comment-13805932
 ] 

Shinichi Yamashita commented on YARN-1350:
------------------------------------------

Thank you for the additional information. I understand that the port number is
set to 0 so that multiple NodeManagers can be used for tests in
MiniYARNCluster.
It would be easy to understand if the yarn-default.xml description included a
note about setting the port in a real cluster.
