[jira] [Commented] (YARN-3194) RM should handle NMContainerStatuses sent by NM while registering if NM is Reconnected node
[ https://issues.apache.org/jira/browse/YARN-3194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14332196#comment-14332196 ]

Hudson commented on YARN-3194:
------------------------------

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #112 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/112/])
YARN-3194. RM should handle NMContainerStatuses sent by NM while registering if NM is Reconnected node. Contributed by Rohith (jlowe: rev a64dd3d24bfcb9af21eb63869924f6482b147fd3)
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeReconnectEvent.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestApplicationCleanup.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMNodeTransitions.java
* hadoop-yarn-project/CHANGES.txt

RM should handle NMContainerStatuses sent by NM while registering if NM is Reconnected node
-------------------------------------------------------------------------------------------

Key: YARN-3194
URL: https://issues.apache.org/jira/browse/YARN-3194
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 2.7.0
Environment: NM restart is enabled
Reporter: Rohith
Assignee: Rohith
Priority: Blocker
Fix For: 2.7.0
Attachments: 0001-YARN-3194.patch, 0001-yarn-3194-v1.patch

On NM restart, the NM sends all outstanding NMContainerStatus records to the RM during registration. The RM can treat the registration as coming from a new node or from a reconnecting node, and it triggers the corresponding event (node added or node reconnected).

# Node added event: two scenarios can occur here
## New node is registering with a different ip:port – NOT A PROBLEM
## Old node is re-registering because of a RESYNC command from the RM after an RM restart – NOT A PROBLEM
# Node reconnected event:
## Existing node is re-registering, i.e. the RM treats it as a reconnecting node when the RM has not been restarted
### NM RESTART NOT Enabled – NOT A PROBLEM
### NM RESTART is Enabled
#### Some applications are running on this node – *Problem is here*
#### Zero applications are running on this node – NOT A PROBLEM

Since the NMContainerStatus records are not handled, the RM never learns about completed containers and never releases the resources held by them. The RM will not allocate new containers for pending resource requests until the completedContainer event is triggered. As a result, applications wait indefinitely because their pending containers are never served by the RM.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
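The failure mode described in the issue above (a reconnecting NM with NM restart enabled and applications still running) can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the committed patch: ReconnectSketch, ToyNode, Status and the handleStatuses flag are hypothetical names invented for illustration and do not correspond to the real RMNodeImpl or ResourceTrackerService code. It shows that when the statuses reported at registration are dropped, the memory of a completed container stays accounted as used, whereas processing them frees it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model only; none of these types are the real YARN classes. */
final class ReconnectSketch {

  enum State { RUNNING, COMPLETE }

  /** Hypothetical stand-in for a container status reported at registration. */
  static final class Status {
    final String containerId;
    final State state;

    Status(String containerId, State state) {
      this.containerId = containerId;
      this.state = state;
    }
  }

  /** Hypothetical stand-in for the RM's bookkeeping of one node. */
  static final class ToyNode {
    private final int totalMb;
    // containerId -> memory the RM believes is still held on this node
    private final Map<String, Integer> launched = new HashMap<>();

    ToyNode(int totalMb) {
      this.totalMb = totalMb;
    }

    void launch(String containerId, int memoryMb) {
      launched.put(containerId, memoryMb);
    }

    int availableMb() {
      int used = 0;
      for (int mb : launched.values()) {
        used += mb;
      }
      return totalMb - used;
    }

    /**
     * Reconnect handling. With handleStatuses == false the reported statuses
     * are ignored (the behaviour the issue describes); with true, every
     * container reported COMPLETE is released. Returns the released ids.
     */
    List<String> reconnect(List<Status> reported, boolean handleStatuses) {
      List<String> released = new ArrayList<>();
      if (!handleStatuses) {
        return released;
      }
      for (Status s : reported) {
        if (s.state == State.COMPLETE && launched.remove(s.containerId) != null) {
          released.add(s.containerId);
        }
      }
      return released;
    }
  }

  public static void main(String[] args) {
    List<Status> reported = new ArrayList<>();
    reported.add(new Status("container_1", State.COMPLETE));
    reported.add(new Status("container_2", State.RUNNING));

    for (boolean handle : new boolean[] {false, true}) {
      ToyNode node = new ToyNode(4096);
      node.launch("container_1", 2048);
      node.launch("container_2", 2048);
      node.reconnect(reported, handle);
      System.out.println("handleStatuses=" + handle
          + " availableMb=" + node.availableMb());
    }
  }
}
```

In this model the node reports 0 MB available when the statuses are ignored and 2048 MB once the completed container is released, which is the difference between a pending request waiting forever and being served.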
[jira] [Commented] (YARN-3194) RM should handle NMContainerStatuses sent by NM while registering if NM is Reconnected node
[ https://issues.apache.org/jira/browse/YARN-3194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14332224#comment-14332224 ]

Hudson commented on YARN-3194:
------------------------------

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2062 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2062/])
YARN-3194. RM should handle NMContainerStatuses sent by NM while registering if NM is Reconnected node. Contributed by Rohith (jlowe: rev a64dd3d24bfcb9af21eb63869924f6482b147fd3)
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeReconnectEvent.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java
* hadoop-yarn-project/CHANGES.txt
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestApplicationCleanup.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMNodeTransitions.java
[jira] [Commented] (YARN-3194) RM should handle NMContainerStatuses sent by NM while registering if NM is Reconnected node
[ https://issues.apache.org/jira/browse/YARN-3194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14330310#comment-14330310 ]

Hudson commented on YARN-3194:
------------------------------

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #102 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/102/])
YARN-3194. RM should handle NMContainerStatuses sent by NM while registering if NM is Reconnected node. Contributed by Rohith (jlowe: rev a64dd3d24bfcb9af21eb63869924f6482b147fd3)
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestApplicationCleanup.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeReconnectEvent.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMNodeTransitions.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java
* hadoop-yarn-project/CHANGES.txt
[jira] [Commented] (YARN-3194) RM should handle NMContainerStatuses sent by NM while registering if NM is Reconnected node
[ https://issues.apache.org/jira/browse/YARN-3194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14330289#comment-14330289 ]

Hudson commented on YARN-3194:
------------------------------

FAILURE: Integrated in Hadoop-Hdfs-trunk #2043 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2043/])
YARN-3194. RM should handle NMContainerStatuses sent by NM while registering if NM is Reconnected node. Contributed by Rohith (jlowe: rev a64dd3d24bfcb9af21eb63869924f6482b147fd3)
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeReconnectEvent.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMNodeTransitions.java
* hadoop-yarn-project/CHANGES.txt
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestApplicationCleanup.java
[jira] [Commented] (YARN-3194) RM should handle NMContainerStatuses sent by NM while registering if NM is Reconnected node
[ https://issues.apache.org/jira/browse/YARN-3194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14330150#comment-14330150 ]

Hudson commented on YARN-3194:
------------------------------

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #111 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/111/])
YARN-3194. RM should handle NMContainerStatuses sent by NM while registering if NM is Reconnected node. Contributed by Rohith (jlowe: rev a64dd3d24bfcb9af21eb63869924f6482b147fd3)
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeReconnectEvent.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestApplicationCleanup.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMNodeTransitions.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java
* hadoop-yarn-project/CHANGES.txt
[jira] [Commented] (YARN-3194) RM should handle NMContainerStatuses sent by NM while registering if NM is Reconnected node
[ https://issues.apache.org/jira/browse/YARN-3194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14330180#comment-14330180 ]

Hudson commented on YARN-3194:
------------------------------

SUCCESS: Integrated in Hadoop-Yarn-trunk #845 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/845/])
YARN-3194. RM should handle NMContainerStatuses sent by NM while registering if NM is Reconnected node. Contributed by Rohith (jlowe: rev a64dd3d24bfcb9af21eb63869924f6482b147fd3)
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestApplicationCleanup.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeReconnectEvent.java
* hadoop-yarn-project/CHANGES.txt
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMNodeTransitions.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java
[jira] [Commented] (YARN-3194) RM should handle NMContainerStatuses sent by NM while registering if NM is Reconnected node
[ https://issues.apache.org/jira/browse/YARN-3194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329048#comment-14329048 ]

Hudson commented on YARN-3194:
------------------------------

FAILURE: Integrated in Hadoop-trunk-Commit #7162 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7162/])
YARN-3194. RM should handle NMContainerStatuses sent by NM while registering if NM is Reconnected node. Contributed by Rohith (jlowe: rev a64dd3d24bfcb9af21eb63869924f6482b147fd3)
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestApplicationCleanup.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeReconnectEvent.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMNodeTransitions.java
* hadoop-yarn-project/CHANGES.txt
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java