[jira] [Commented] (YARN-2825) Container leak on NM

2014-11-08 Thread Hudson (JIRA)

[ https://issues.apache.org/jira/browse/YARN-2825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14203391#comment-14203391 ]

Hudson commented on YARN-2825:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #737 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/737/])
YARN-2825. Container leak on NM. Contributed by Jian He (jlowe: rev c3d475070a1ec54c4b05923f4782cef204effd2c)
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeStatusUpdater.java
* hadoop-yarn-project/CHANGES.txt
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdaterImpl.java


 Container leak on NM
 

 Key: YARN-2825
 URL: https://issues.apache.org/jira/browse/YARN-2825
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Jian He
Assignee: Jian He
Priority: Critical
 Fix For: 2.6.0

 Attachments: YARN-2825.1.patch, YARN-2825.1.patch, YARN-2825.2.patch, 
 YARN-2825.3.patch, YARN-2825.4.patch


 Caused by YARN-1372. Thanks [~vinodkv] for pointing this out.
 The problem is that in YARN-1372 we changed the behavior to remove containers
 from NMContext only after the containers are acknowledged by the AM. But in the
 {{NodeStatusUpdaterImpl#removeCompletedContainersFromContext}} call, we
 didn't check whether the container is really completed or not. If the
 container is still running, we shouldn't remove it from the context.
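
The check described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the committed patch: the `ContainerState` enum, the `context` map, and the method signature are simplified, hypothetical stand-ins for the real NodeManager types, chosen only to show that an acknowledged container is dropped from the context only once it has actually finished.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class RemoveCompletedSketch {
    // Hypothetical, simplified stand-in for the NM-internal container state.
    enum ContainerState { RUNNING, DONE }

    // Hypothetical stand-in for the container map held by NMContext (id -> state).
    static final Map<String, ContainerState> context = new ConcurrentHashMap<>();

    // Remove acknowledged containers, but only those that have really finished.
    static void removeCompletedContainersFromContext(List<String> ackedIds) {
        for (String id : ackedIds) {
            ContainerState state = context.get(id);
            // Unknown ids (null) and still-running containers are left alone.
            if (state == ContainerState.DONE) {
                context.remove(id);
            }
        }
    }

    public static void main(String[] args) {
        context.put("container_1", ContainerState.DONE);
        context.put("container_2", ContainerState.RUNNING);
        removeCompletedContainersFromContext(
            Arrays.asList("container_1", "container_2", "container_3"));
        System.out.println(context.keySet()); // prints [container_2]
    }
}
```

Without the state check, `container_2` would be removed from the map while it is still running, which is the leak this issue describes.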



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2825) Container leak on NM

2014-11-08 Thread Hudson (JIRA)

[ https://issues.apache.org/jira/browse/YARN-2825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14203436#comment-14203436 ]

Hudson commented on YARN-2825:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #1927 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1927/])
YARN-2825. Container leak on NM. Contributed by Jian He (jlowe: rev c3d475070a1ec54c4b05923f4782cef204effd2c)
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeStatusUpdater.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdaterImpl.java
* hadoop-yarn-project/CHANGES.txt




[jira] [Commented] (YARN-2825) Container leak on NM

2014-11-07 Thread Jason Lowe (JIRA)

[ https://issues.apache.org/jira/browse/YARN-2825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14202287#comment-14202287 ]

Jason Lowe commented on YARN-2825:
--

Thanks for the patch, Jian!

Is there a reason we need to cast to ContainerImpl?  I think calling 
{{context.getContainers().get(containerId).getContainerState() == org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerState.DONE}}
would be equivalent and cleaner since we wouldn't assume the container 
implementation.  Or we could get the container status and check for COMPLETE, 
which is what other parts of the code are doing.
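
To illustrate the point about not assuming the container implementation, here is a small self-contained sketch. The `Container` interface, `ContainerImpl` class, and `ContainerState` enum below are simplified, hypothetical stand-ins for the NodeManager types named in the comment, not the real Hadoop classes; the `isDone` helper is likewise invented for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class InterfaceStateCheckSketch {
    // Hypothetical, simplified stand-in for the NM-internal ContainerState.
    enum ContainerState { RUNNING, DONE }

    // Hypothetical stand-in for the NM Container interface.
    interface Container {
        ContainerState getContainerState();
    }

    // Hypothetical stand-in for the concrete implementation.
    static final class ContainerImpl implements Container {
        private final ContainerState state;
        ContainerImpl(ContainerState state) { this.state = state; }
        @Override public ContainerState getContainerState() { return state; }
    }

    // Checks the state through the interface only; no cast to ContainerImpl.
    static boolean isDone(Map<String, Container> containers, String id) {
        Container c = containers.get(id);
        return c != null && c.getContainerState() == ContainerState.DONE;
    }

    public static void main(String[] args) {
        Map<String, Container> containers = new LinkedHashMap<>();
        containers.put("c1", new ContainerImpl(ContainerState.DONE));
        // A test double written as a lambda: casting it to ContainerImpl
        // would throw ClassCastException, but the interface call works.
        containers.put("c2", () -> ContainerState.RUNNING);
        System.out.println(isDone(containers, "c1")); // prints true
        System.out.println(isDone(containers, "c2")); // prints false
    }
}
```

Going through the interface keeps the check working for mocks and any alternative implementations, which is the cleanliness argument made above.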



[jira] [Commented] (YARN-2825) Container leak on NM

2014-11-07 Thread Jian He (JIRA)

[ https://issues.apache.org/jira/browse/YARN-2825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14202422#comment-14202422 ]

Jian He commented on YARN-2825:
---

Thanks, Jason, for reviewing the patch!
bq. Or we could get the container status and check for COMPLETE which is what other parts of the code are doing
The existing method does {{cloneAndGetContainerStatus}}; I just wanted to 
avoid the clone part. 
To conform with the rest of the code and also avoid the cast, I 
exposed {{getCurrentState}} in the interface, as I saw there's another caller 
that needs to do the same.



[jira] [Commented] (YARN-2825) Container leak on NM

2014-11-07 Thread Jason Lowe (JIRA)

[ https://issues.apache.org/jira/browse/YARN-2825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14202521#comment-14202521 ]

Jason Lowe commented on YARN-2825:
--

Curious, what's the reasoning for avoiding the check for the DONE state?  That's 
utilizing an existing interface rather than adding a new one.  It's confusing 
to have two get-state methods (current vs. container, but it _is_ a container 
so...), although I realize it derives from the undesirable situation of having 
two ContainerState types.  I suppose we could also just expose an isComplete() 
predicate method since callers are only checking for COMPLETE.

That being said, I'm not totally against adding the yarn.api.records state in a 
new method; I just think it's not really necessary.  If we continue along that 
route then ContainerImpl.getCurrentState needs the @Override annotation.  Also, 
ContainerImpl is now an unused import, and should continue to be unnecessary 
regardless of which route we pursue.
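
The isComplete() idea can be sketched as below. This is only an illustration under stated assumptions: `NmState` and `ApiState` are hypothetical, simplified names for the two ContainerState types mentioned above (NM-internal and yarn.api.records), the state lists are abbreviated, and the mapping in `toApiState` is an assumed one rather than a copy of the Hadoop code. The default method also assumes Java 8 or later.

```java
public class IsCompleteSketch {
    // Hypothetical, abbreviated stand-in for the NM-internal ContainerState.
    enum NmState { NEW, RUNNING, KILLING, DONE }

    // Hypothetical stand-in for the yarn.api.records ContainerState.
    enum ApiState { NEW, RUNNING, COMPLETE }

    // Hypothetical container interface exposing a single predicate,
    // so callers never have to pick between the two state types.
    interface Container {
        NmState getContainerState();

        default boolean isComplete() {
            return getContainerState() == NmState.DONE;
        }
    }

    // Assumed mapping between the two state types, for illustration only.
    static ApiState toApiState(NmState s) {
        switch (s) {
            case NEW:  return ApiState.NEW;
            case DONE: return ApiState.COMPLETE;
            default:   return ApiState.RUNNING;
        }
    }

    public static void main(String[] args) {
        Container running = () -> NmState.RUNNING;
        Container done = () -> NmState.DONE;
        System.out.println(running.isComplete());        // prints false
        System.out.println(done.isComplete());           // prints true
        System.out.println(toApiState(NmState.KILLING)); // prints RUNNING
    }
}
```

A predicate like this hides which of the two enums is consulted, at the cost of adding a method to the interface, which is the trade-off weighed in the comment.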




[jira] [Commented] (YARN-2825) Container leak on NM

2014-11-07 Thread Jian He (JIRA)

[ https://issues.apache.org/jira/browse/YARN-2825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14202530#comment-14202530 ]

Jian He commented on YARN-2825:
---

bq. what's the reasoning to avoid checking for the DONE state?
The only reason is to be consistent with the check in 
{{NodeStatusUpdaterImpl#getContainerStatuses}}, where it's checking 
ContainerState.COMPLETE rather than DONE. I can just change it to check DONE 
instead of adding a new method. Updating the patch.



[jira] [Commented] (YARN-2825) Container leak on NM

2014-11-07 Thread Hadoop QA (JIRA)

[ https://issues.apache.org/jira/browse/YARN-2825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14202578#comment-14202578 ]

Hadoop QA commented on YARN-2825:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12680218/YARN-2825.2.patch
  against trunk revision 2ac1be7.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

  {color:red}-1 javac{color}.  The applied patch generated 1219 javac 
compiler warnings (more than the trunk's current 153 warnings).

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 9 
warning messages.
See 
https://builds.apache.org/job/PreCommit-YARN-Build/5782//artifact/patchprocess/diffJavadocWarnings.txt
 for details.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  org.apache.hadoop.yarn.server.resourcemanager.resourcetracker.TestRMNMRPCResponseId

  The following test timeouts occurred in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacitySchedulerQueueACLs

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/5782//testReport/
Javac warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/5782//artifact/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5782//console

This message is automatically generated.



[jira] [Commented] (YARN-2825) Container leak on NM

2014-11-07 Thread Jian He (JIRA)

[ https://issues.apache.org/jira/browse/YARN-2825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14202684#comment-14202684 ]

Jian He commented on YARN-2825:
---

[~jlowe], mind taking a look at the new patch?



[jira] [Commented] (YARN-2825) Container leak on NM

2014-11-07 Thread Hadoop QA (JIRA)

[ https://issues.apache.org/jira/browse/YARN-2825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14202765#comment-14202765 ]

Hadoop QA commented on YARN-2825:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12680267/YARN-2825.3.patch
  against trunk revision 06b7979.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:

  org.apache.hadoop.yarn.server.nodemanager.TestNodeStatusUpdater

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/5792//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5792//console

This message is automatically generated.



[jira] [Commented] (YARN-2825) Container leak on NM

2014-11-07 Thread Jason Lowe (JIRA)

[ https://issues.apache.org/jira/browse/YARN-2825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14202905#comment-14202905 ]

Jason Lowe commented on YARN-2825:
--

+1 lgtm.  Committing this.




[jira] [Commented] (YARN-2825) Container leak on NM

2014-11-07 Thread Hudson (JIRA)

[ https://issues.apache.org/jira/browse/YARN-2825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14202919#comment-14202919 ]

Hudson commented on YARN-2825:
--

FAILURE: Integrated in Hadoop-trunk-Commit #6487 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/6487/])
YARN-2825. Container leak on NM. Contributed by Jian He (jlowe: rev c3d475070a1ec54c4b05923f4782cef204effd2c)
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdaterImpl.java
* hadoop-yarn-project/CHANGES.txt
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeStatusUpdater.java




[jira] [Commented] (YARN-2825) Container leak on NM

2014-11-07 Thread Jian He (JIRA)

[ https://issues.apache.org/jira/browse/YARN-2825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14202935#comment-14202935 ]

Jian He commented on YARN-2825:
---

Jason, thanks for reviewing and committing!



[jira] [Commented] (YARN-2825) Container leak on NM

2014-11-07 Thread Hadoop QA (JIRA)

[ https://issues.apache.org/jira/browse/YARN-2825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14202899#comment-14202899 ]

Hadoop QA commented on YARN-2825:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12680289/YARN-2825.4.patch
  against trunk revision 4cfd5bc.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/5795//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5795//console

This message is automatically generated.



[jira] [Commented] (YARN-2825) Container leak on NM

2014-11-06 Thread Hadoop QA (JIRA)

[ https://issues.apache.org/jira/browse/YARN-2825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14201530#comment-14201530 ]

Hadoop QA commented on YARN-2825:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12680068/YARN-2825.1.patch
  against trunk revision ba0a42c.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/5769//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5769//console

This message is automatically generated.
