subject:"\[jira\] \[Commented\] \(YARN\-2617\) NM does not need to send finished container whose APP is not running to RM"

[jira] [Commented] (YARN-2617) NM does not need to send finished container whose APP is not running to RM

2014-10-10 Thread Hudson (JIRA)

[
https://issues.apache.org/jira/browse/YARN-2617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14166681#comment-14166681
]

Hudson commented on YARN-2617:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #707 (See
[https://builds.apache.org/job/Hadoop-Yarn-trunk/707/])
YARN-2617. Fixed ApplicationSubmissionContext to still set resource for
backward compatibility. Contributed by Wangda Tan. (zjshen: rev
e532ed8faa8db4b008a5b8d3f82b48a1b314fa6c)
* hadoop-yarn-project/CHANGES.txt
*
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/ApplicationSubmissionContext.java
*
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/api/TestPBImplRecords.java

NM does not need to send finished container whose APP is not running to RM
--

Key: YARN-2617
URL: https://issues.apache.org/jira/browse/YARN-2617
Project: Hadoop YARN
Issue Type: Bug
Components: nodemanager
Affects Versions: 2.6.0
Reporter: Jun Gong
Assignee: Jun Gong
Fix For: 2.6.0

Attachments: YARN-2617.2.patch, YARN-2617.3.patch, YARN-2617.4.patch,
YARN-2617.5.patch, YARN-2617.5.patch, YARN-2617.5.patch, YARN-2617.6.patch,
YARN-2617.patch

We([~chenchun]) are testing RM work preserving restart and found the
following logs when we ran a simple MapReduce task PI. NM continuously
reported completed containers whose Application had already finished while AM
had finished.
{code}
2014-09-26 17:00:42,228 INFO
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler:
Null container completed...
2014-09-26 17:00:42,228 INFO
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler:
Null container completed...
2014-09-26 17:00:43,230 INFO
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler:
Null container completed...
2014-09-26 17:00:43,230 INFO
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler:
Null container completed...
2014-09-26 17:00:44,233 INFO
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler:
Null container completed...
2014-09-26 17:00:44,233 INFO
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler:
Null container completed...
{code}
In the patch for YARN-1372, ApplicationImpl on NM should guarantee to clean
up already completed applications. But it will only remove appId from
'app.context.getApplications()' when ApplicaitonImpl received evnet
'ApplicationEventType.APPLICATION_LOG_HANDLING_FINISHED' , however NM might
receive this event for a long time or could not receive.
* For NonAggregatingLogHandler, it wait for
YarnConfiguration.NM_LOG_RETAIN_SECONDS which is 3 * 60 * 60 sec by default,
then it will be scheduled to delete Application logs and send the event.
* For LogAggregationService, it might fail(e.g. if user does not have HDFS
write permission), and it will not send the event.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2617) NM does not need to send finished container whose APP is not running to RM

2014-10-10 Thread Hudson (JIRA)

[
https://issues.apache.org/jira/browse/YARN-2617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14166873#comment-14166873
]

Hudson commented on YARN-2617:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #1897 (See
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1897/])
YARN-2617. Fixed ApplicationSubmissionContext to still set resource for
backward compatibility. Contributed by Wangda Tan. (zjshen: rev
e532ed8faa8db4b008a5b8d3f82b48a1b314fa6c)
* hadoop-yarn-project/CHANGES.txt
*
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/api/TestPBImplRecords.java
*
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/ApplicationSubmissionContext.java

NM does not need to send finished container whose APP is not running to RM
--

Attachments: YARN-2617.2.patch, YARN-2617.3.patch, YARN-2617.4.patch,
YARN-2617.5.patch, YARN-2617.5.patch, YARN-2617.5.patch, YARN-2617.6.patch,
YARN-2617.patch

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2617) NM does not need to send finished container whose APP is not running to RM

2014-10-10 Thread Hudson (JIRA)

[
https://issues.apache.org/jira/browse/YARN-2617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14166962#comment-14166962
]

Hudson commented on YARN-2617:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1922 (See
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1922/])
YARN-2617. Fixed ApplicationSubmissionContext to still set resource for
backward compatibility. Contributed by Wangda Tan. (zjshen: rev
e532ed8faa8db4b008a5b8d3f82b48a1b314fa6c)
*
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/api/TestPBImplRecords.java
*
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/ApplicationSubmissionContext.java
* hadoop-yarn-project/CHANGES.txt

NM does not need to send finished container whose APP is not running to RM
--

Attachments: YARN-2617.2.patch, YARN-2617.3.patch, YARN-2617.4.patch,
YARN-2617.5.patch, YARN-2617.5.patch, YARN-2617.5.patch, YARN-2617.6.patch,
YARN-2617.patch

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2617) NM does not need to send finished container whose APP is not running to RM

2014-10-09 Thread Hudson (JIRA)

[
https://issues.apache.org/jira/browse/YARN-2617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14166104#comment-14166104
]

Hudson commented on YARN-2617:
--

FAILURE: Integrated in Hadoop-trunk-Commit #6230 (See
[https://builds.apache.org/job/Hadoop-trunk-Commit/6230/])
YARN-2617. Fixed ApplicationSubmissionContext to still set resource for
backward compatibility. Contributed by Wangda Tan. (zjshen: rev
e532ed8faa8db4b008a5b8d3f82b48a1b314fa6c)
*
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/api/TestPBImplRecords.java
* hadoop-yarn-project/CHANGES.txt
*
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/ApplicationSubmissionContext.java

NM does not need to send finished container whose APP is not running to RM
--

Attachments: YARN-2617.2.patch, YARN-2617.3.patch, YARN-2617.4.patch,
YARN-2617.5.patch, YARN-2617.5.patch, YARN-2617.5.patch, YARN-2617.6.patch,
YARN-2617.patch

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2617) NM does not need to send finished container whose APP is not running to RM

2014-10-03 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-2617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14157901#comment-14157901
 ] 

Hudson commented on YARN-2617:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #699 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/699/])
YARN-2617. Fixed NM to not send duplicate container status whose app is not 
running. Contributed by Jun Gong (jianhe: rev 
3ef1cf187faeb530e74606dd7113fd1ba08140d7)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdaterImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeStatusUpdater.java


 NM does not need to send finished container whose APP is not running to RM
 --

 Key: YARN-2617
 URL: https://issues.apache.org/jira/browse/YARN-2617
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.6.0
Reporter: Jun Gong
Assignee: Jun Gong
 Fix For: 2.6.0

 Attachments: YARN-2617.2.patch, YARN-2617.3.patch, YARN-2617.4.patch, 
 YARN-2617.5.patch, YARN-2617.5.patch, YARN-2617.5.patch, YARN-2617.6.patch, 
 YARN-2617.patch


 We([~chenchun]) are testing RM work preserving restart and found the 
 following logs when we ran a simple MapReduce task PI. NM continuously 
 reported completed containers whose Application had already finished while AM 
 had finished. 
 {code}
 2014-09-26 17:00:42,228 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
 Null container completed...
 2014-09-26 17:00:42,228 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
 Null container completed...
 2014-09-26 17:00:43,230 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
 Null container completed...
 2014-09-26 17:00:43,230 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
 Null container completed...
 2014-09-26 17:00:44,233 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
 Null container completed...
 2014-09-26 17:00:44,233 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
 Null container completed...
 {code}
 In the patch for YARN-1372, ApplicationImpl on NM should guarantee to  clean 
 up already completed applications. But it will only remove appId from  
 'app.context.getApplications()' when ApplicaitonImpl received evnet 
 'ApplicationEventType.APPLICATION_LOG_HANDLING_FINISHED' , however NM might 
 receive this event for a long time or could not receive. 
 * For NonAggregatingLogHandler, it wait for 
 YarnConfiguration.NM_LOG_RETAIN_SECONDS which is 3 * 60 * 60 sec by default, 
 then it will be scheduled to delete Application logs and send the event.
 * For LogAggregationService, it might fail(e.g. if user does not have HDFS 
 write permission), and it will not send the event.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2617) NM does not need to send finished container whose APP is not running to RM

2014-10-03 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-2617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14158014#comment-14158014
 ] 

Hudson commented on YARN-2617:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1890 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1890/])
YARN-2617. Fixed NM to not send duplicate container status whose app is not 
running. Contributed by Jun Gong (jianhe: rev 
3ef1cf187faeb530e74606dd7113fd1ba08140d7)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeStatusUpdater.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdaterImpl.java


 NM does not need to send finished container whose APP is not running to RM
 --

 Key: YARN-2617
 URL: https://issues.apache.org/jira/browse/YARN-2617
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.6.0
Reporter: Jun Gong
Assignee: Jun Gong
 Fix For: 2.6.0

 Attachments: YARN-2617.2.patch, YARN-2617.3.patch, YARN-2617.4.patch, 
 YARN-2617.5.patch, YARN-2617.5.patch, YARN-2617.5.patch, YARN-2617.6.patch, 
 YARN-2617.patch


 We([~chenchun]) are testing RM work preserving restart and found the 
 following logs when we ran a simple MapReduce task PI. NM continuously 
 reported completed containers whose Application had already finished while AM 
 had finished. 
 {code}
 2014-09-26 17:00:42,228 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
 Null container completed...
 2014-09-26 17:00:42,228 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
 Null container completed...
 2014-09-26 17:00:43,230 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
 Null container completed...
 2014-09-26 17:00:43,230 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
 Null container completed...
 2014-09-26 17:00:44,233 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
 Null container completed...
 2014-09-26 17:00:44,233 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
 Null container completed...
 {code}
 In the patch for YARN-1372, ApplicationImpl on NM should guarantee to  clean 
 up already completed applications. But it will only remove appId from  
 'app.context.getApplications()' when ApplicaitonImpl received evnet 
 'ApplicationEventType.APPLICATION_LOG_HANDLING_FINISHED' , however NM might 
 receive this event for a long time or could not receive. 
 * For NonAggregatingLogHandler, it wait for 
 YarnConfiguration.NM_LOG_RETAIN_SECONDS which is 3 * 60 * 60 sec by default, 
 then it will be scheduled to delete Application logs and send the event.
 * For LogAggregationService, it might fail(e.g. if user does not have HDFS 
 write permission), and it will not send the event.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2617) NM does not need to send finished container whose APP is not running to RM

2014-10-03 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-2617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14158063#comment-14158063
 ] 

Hudson commented on YARN-2617:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1915 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1915/])
YARN-2617. Fixed NM to not send duplicate container status whose app is not 
running. Contributed by Jun Gong (jianhe: rev 
3ef1cf187faeb530e74606dd7113fd1ba08140d7)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeStatusUpdater.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdaterImpl.java
* hadoop-yarn-project/CHANGES.txt


 NM does not need to send finished container whose APP is not running to RM
 --

 Key: YARN-2617
 URL: https://issues.apache.org/jira/browse/YARN-2617
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.6.0
Reporter: Jun Gong
Assignee: Jun Gong
 Fix For: 2.6.0

 Attachments: YARN-2617.2.patch, YARN-2617.3.patch, YARN-2617.4.patch, 
 YARN-2617.5.patch, YARN-2617.5.patch, YARN-2617.5.patch, YARN-2617.6.patch, 
 YARN-2617.patch


 We([~chenchun]) are testing RM work preserving restart and found the 
 following logs when we ran a simple MapReduce task PI. NM continuously 
 reported completed containers whose Application had already finished while AM 
 had finished. 
 {code}
 2014-09-26 17:00:42,228 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
 Null container completed...
 2014-09-26 17:00:42,228 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
 Null container completed...
 2014-09-26 17:00:43,230 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
 Null container completed...
 2014-09-26 17:00:43,230 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
 Null container completed...
 2014-09-26 17:00:44,233 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
 Null container completed...
 2014-09-26 17:00:44,233 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
 Null container completed...
 {code}
 In the patch for YARN-1372, ApplicationImpl on NM should guarantee to  clean 
 up already completed applications. But it will only remove appId from  
 'app.context.getApplications()' when ApplicaitonImpl received evnet 
 'ApplicationEventType.APPLICATION_LOG_HANDLING_FINISHED' , however NM might 
 receive this event for a long time or could not receive. 
 * For NonAggregatingLogHandler, it wait for 
 YarnConfiguration.NM_LOG_RETAIN_SECONDS which is 3 * 60 * 60 sec by default, 
 then it will be scheduled to delete Application logs and send the event.
 * For LogAggregationService, it might fail(e.g. if user does not have HDFS 
 write permission), and it will not send the event.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2617) NM does not need to send finished container whose APP is not running to RM

2014-10-02 Thread Jun Gong (JIRA)

[
https://issues.apache.org/jira/browse/YARN-2617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14156198#comment-14156198
]

Jun Gong commented on YARN-2617:

I investigated why TestDirectoryCollection failed. And it might be because of
YARN-2640. Could you help check and review it please? Thank you.

NM does not need to send finished container whose APP is not running to RM
--

Attachments: YARN-2617.2.patch, YARN-2617.3.patch, YARN-2617.4.patch,
YARN-2617.5.patch, YARN-2617.5.patch, YARN-2617.5.patch, YARN-2617.6.patch,
YARN-2617.patch

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2617) NM does not need to send finished container whose APP is not running to RM

2014-10-02 Thread Jian He (JIRA)

[
https://issues.apache.org/jira/browse/YARN-2617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14156761#comment-14156761
]

Jian He commented on YARN-2617:
---

YARN-2640 seems resolved in YARN-1979 already.

NM does not need to send finished container whose APP is not running to RM
--

Attachments: YARN-2617.2.patch, YARN-2617.3.patch, YARN-2617.4.patch,
YARN-2617.5.patch, YARN-2617.5.patch, YARN-2617.5.patch, YARN-2617.6.patch,
YARN-2617.patch

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2617) NM does not need to send finished container whose APP is not running to RM

2014-10-02 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-2617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14156772#comment-14156772
 ] 

Hudson commented on YARN-2617:
--

FAILURE: Integrated in Hadoop-trunk-Commit #6176 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/6176/])
YARN-2617. Fixed NM to not send duplicate container status whose app is not 
running. Contributed by Jun Gong (jianhe: rev 
3ef1cf187faeb530e74606dd7113fd1ba08140d7)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeStatusUpdater.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdaterImpl.java


 NM does not need to send finished container whose APP is not running to RM
 --

 Key: YARN-2617
 URL: https://issues.apache.org/jira/browse/YARN-2617
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.6.0
Reporter: Jun Gong
Assignee: Jun Gong
 Fix For: 2.6.0

 Attachments: YARN-2617.2.patch, YARN-2617.3.patch, YARN-2617.4.patch, 
 YARN-2617.5.patch, YARN-2617.5.patch, YARN-2617.5.patch, YARN-2617.6.patch, 
 YARN-2617.patch


 We([~chenchun]) are testing RM work preserving restart and found the 
 following logs when we ran a simple MapReduce task PI. NM continuously 
 reported completed containers whose Application had already finished while AM 
 had finished. 
 {code}
 2014-09-26 17:00:42,228 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
 Null container completed...
 2014-09-26 17:00:42,228 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
 Null container completed...
 2014-09-26 17:00:43,230 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
 Null container completed...
 2014-09-26 17:00:43,230 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
 Null container completed...
 2014-09-26 17:00:44,233 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
 Null container completed...
 2014-09-26 17:00:44,233 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
 Null container completed...
 {code}
 In the patch for YARN-1372, ApplicationImpl on NM should guarantee to  clean 
 up already completed applications. But it will only remove appId from  
 'app.context.getApplications()' when ApplicaitonImpl received evnet 
 'ApplicationEventType.APPLICATION_LOG_HANDLING_FINISHED' , however NM might 
 receive this event for a long time or could not receive. 
 * For NonAggregatingLogHandler, it wait for 
 YarnConfiguration.NM_LOG_RETAIN_SECONDS which is 3 * 60 * 60 sec by default, 
 then it will be scheduled to delete Application logs and send the event.
 * For LogAggregationService, it might fail(e.g. if user does not have HDFS 
 write permission), and it will not send the event.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2617) NM does not need to send finished container whose APP is not running to RM

2014-10-01 Thread Jian He (JIRA)

[
https://issues.apache.org/jira/browse/YARN-2617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14155220#comment-14155220
]

Jian He commented on YARN-2617:
---

looks good, +1

NM does not need to send finished container whose APP is not running to RM
--

Attachments: YARN-2617.2.patch, YARN-2617.3.patch, YARN-2617.4.patch,
YARN-2617.patch

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2617) NM does not need to send finished container whose APP is not running to RM

2014-10-01 Thread Hadoop QA (JIRA)

[
https://issues.apache.org/jira/browse/YARN-2617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14155477#comment-14155477
]

Hadoop QA commented on YARN-2617:
-

{color:red}-1 overall{color}. Here are the results of testing the latest
attachment
http://issues.apache.org/jira/secure/attachment/12672391/YARN-2617.5.patch
against trunk revision 1f5b42a.

{color:green}+1 @author{color}. The patch does not contain any @author
tags.

{color:green}+1 tests included{color}. The patch appears to include 1 new
or modified test files.

{color:green}+1 javac{color}. The applied patch does not increase the
total number of javac compiler warnings.

{color:green}+1 javadoc{color}. There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}. The patch built with
eclipse:eclipse.

{color:green}+1 findbugs{color}. The patch does not introduce any new
Findbugs (version 2.0.3) warnings.

{color:red}-1 release audit{color}. The applied patch generated 1
release audit warnings.

{color:red}-1 core tests{color}. The patch failed these unit tests in
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:

org.apache.hadoop.yarn.server.nodemanager.TestNodeManagerResync

org.apache.hadoop.yarn.server.nodemanager.containermanager.TestContainerManager
org.apache.hadoop.yarn.server.nodemanager.TestEventFlow

org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.TestContainersMonitor

org.apache.hadoop.yarn.server.nodemanager.TestNodeManagerReboot

{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results:
https://builds.apache.org/job/PreCommit-YARN-Build/5205//testReport/
Release audit warnings:
https://builds.apache.org/job/PreCommit-YARN-Build/5205//artifact/patchprocess/patchReleaseAuditProblems.txt
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5205//console

This message is automatically generated.

NM does not need to send finished container whose APP is not running to RM
--

Attachments: YARN-2617.2.patch, YARN-2617.3.patch, YARN-2617.4.patch,
YARN-2617.5.patch, YARN-2617.patch

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2617) NM does not need to send finished container whose APP is not running to RM

2014-10-01 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-2617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14155715#comment-14155715
 ] 

Hadoop QA commented on YARN-2617:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12672391/YARN-2617.5.patch
  against trunk revision dd1b8f2.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:red}-1 eclipse:eclipse{color}.  The patch failed to build with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/5212//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5212//console

This message is automatically generated.

 NM does not need to send finished container whose APP is not running to RM
 --

 Key: YARN-2617
 URL: https://issues.apache.org/jira/browse/YARN-2617
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.6.0
Reporter: Jun Gong
Assignee: Jun Gong
 Fix For: 2.6.0

 Attachments: YARN-2617.2.patch, YARN-2617.3.patch, YARN-2617.4.patch, 
 YARN-2617.5.patch, YARN-2617.patch


 We([~chenchun]) are testing RM work preserving restart and found the 
 following logs when we ran a simple MapReduce task PI. NM continuously 
 reported completed containers whose Application had already finished while AM 
 had finished. 
 {code}
 2014-09-26 17:00:42,228 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
 Null container completed...
 2014-09-26 17:00:42,228 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
 Null container completed...
 2014-09-26 17:00:43,230 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
 Null container completed...
 2014-09-26 17:00:43,230 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
 Null container completed...
 2014-09-26 17:00:44,233 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
 Null container completed...
 2014-09-26 17:00:44,233 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
 Null container completed...
 {code}
 In the patch for YARN-1372, ApplicationImpl on NM should guarantee to  clean 
 up already completed applications. But it will only remove appId from  
 'app.context.getApplications()' when ApplicaitonImpl received evnet 
 'ApplicationEventType.APPLICATION_LOG_HANDLING_FINISHED' , however NM might 
 receive this event for a long time or could not receive. 
 * For NonAggregatingLogHandler, it wait for 
 YarnConfiguration.NM_LOG_RETAIN_SECONDS which is 3 * 60 * 60 sec by default, 
 then it will be scheduled to delete Application logs and send the event.
 * For LogAggregationService, it might fail(e.g. if user does not have HDFS 
 write permission), and it will not send the event.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2617) NM does not need to send finished container whose APP is not running to RM

2014-10-01 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-2617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14155766#comment-14155766
 ] 

Hadoop QA commented on YARN-2617:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12672391/YARN-2617.5.patch
  against trunk revision 52bbe0f.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:red}-1 eclipse:eclipse{color}.  The patch failed to build with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/5218//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5218//console

This message is automatically generated.

 NM does not need to send finished container whose APP is not running to RM
 --

 Key: YARN-2617
 URL: https://issues.apache.org/jira/browse/YARN-2617
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.6.0
Reporter: Jun Gong
Assignee: Jun Gong
 Fix For: 2.6.0

 Attachments: YARN-2617.2.patch, YARN-2617.3.patch, YARN-2617.4.patch, 
 YARN-2617.5.patch, YARN-2617.patch


 We([~chenchun]) are testing RM work preserving restart and found the 
 following logs when we ran a simple MapReduce task PI. NM continuously 
 reported completed containers whose Application had already finished while AM 
 had finished. 
 {code}
 2014-09-26 17:00:42,228 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
 Null container completed...
 2014-09-26 17:00:42,228 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
 Null container completed...
 2014-09-26 17:00:43,230 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
 Null container completed...
 2014-09-26 17:00:43,230 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
 Null container completed...
 2014-09-26 17:00:44,233 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
 Null container completed...
 2014-09-26 17:00:44,233 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
 Null container completed...
 {code}
 In the patch for YARN-1372, ApplicationImpl on NM should guarantee to  clean 
 up already completed applications. But it will only remove appId from  
 'app.context.getApplications()' when ApplicaitonImpl received evnet 
 'ApplicationEventType.APPLICATION_LOG_HANDLING_FINISHED' , however NM might 
 receive this event for a long time or could not receive. 
 * For NonAggregatingLogHandler, it wait for 
 YarnConfiguration.NM_LOG_RETAIN_SECONDS which is 3 * 60 * 60 sec by default, 
 then it will be scheduled to delete Application logs and send the event.
 * For LogAggregationService, it might fail(e.g. if user does not have HDFS 
 write permission), and it will not send the event.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2617) NM does not need to send finished container whose APP is not running to RM

2014-10-01 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-2617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14155880#comment-14155880
 ] 

Hadoop QA commented on YARN-2617:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12672458/YARN-2617.5.patch
  against trunk revision 0708827.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:

  
org.apache.hadoop.yarn.server.nodemanager.TestDirectoryCollection

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/5222//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5222//console

This message is automatically generated.

 NM does not need to send finished container whose APP is not running to RM
 --

 Key: YARN-2617
 URL: https://issues.apache.org/jira/browse/YARN-2617
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.6.0
Reporter: Jun Gong
Assignee: Jun Gong
 Fix For: 2.6.0

 Attachments: YARN-2617.2.patch, YARN-2617.3.patch, YARN-2617.4.patch, 
 YARN-2617.5.patch, YARN-2617.5.patch, YARN-2617.patch


 We([~chenchun]) are testing RM work preserving restart and found the 
 following logs when we ran a simple MapReduce task PI. NM continuously 
 reported completed containers whose Application had already finished while AM 
 had finished. 
 {code}
 2014-09-26 17:00:42,228 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
 Null container completed...
 2014-09-26 17:00:42,228 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
 Null container completed...
 2014-09-26 17:00:43,230 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
 Null container completed...
 2014-09-26 17:00:43,230 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
 Null container completed...
 2014-09-26 17:00:44,233 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
 Null container completed...
 2014-09-26 17:00:44,233 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
 Null container completed...
 {code}
 In the patch for YARN-1372, ApplicationImpl on NM should guarantee to  clean 
 up already completed applications. But it will only remove appId from  
 'app.context.getApplications()' when ApplicaitonImpl received evnet 
 'ApplicationEventType.APPLICATION_LOG_HANDLING_FINISHED' , however NM might 
 receive this event for a long time or could not receive. 
 * For NonAggregatingLogHandler, it wait for 
 YarnConfiguration.NM_LOG_RETAIN_SECONDS which is 3 * 60 * 60 sec by default, 
 then it will be scheduled to delete Application logs and send the event.
 * For LogAggregationService, it might fail(e.g. if user does not have HDFS 
 write permission), and it will not send the event.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2617) NM does not need to send finished container whose APP is not running to RM

2014-10-01 Thread Hadoop QA (JIRA)

[
https://issues.apache.org/jira/browse/YARN-2617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14155912#comment-14155912
]

Hadoop QA commented on YARN-2617:
-

{color:red}-1 overall{color}. Here are the results of testing the latest
attachment
http://issues.apache.org/jira/secure/attachment/12672458/YARN-2617.5.patch
against trunk revision 0708827.

{color:red}-1 patch{color}. Trunk compilation may be broken.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5225//console

This message is automatically generated.

NM does not need to send finished container whose APP is not running to RM
--

Attachments: YARN-2617.2.patch, YARN-2617.3.patch, YARN-2617.4.patch,
YARN-2617.5.patch, YARN-2617.5.patch, YARN-2617.patch

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2617) NM does not need to send finished container whose APP is not running to RM

2014-10-01 Thread Hadoop QA (JIRA)

[
https://issues.apache.org/jira/browse/YARN-2617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14155941#comment-14155941
]

Hadoop QA commented on YARN-2617:
-

{color:red}-1 overall{color}. Here are the results of testing the latest
attachment
http://issues.apache.org/jira/secure/attachment/12672468/YARN-2617.5.patch
against trunk revision 9e40de6.

{color:green}+1 @author{color}. The patch does not contain any @author
tags.

{color:green}+1 tests included{color}. The patch appears to include 1 new
or modified test files.

{color:red}-1 javac{color:red}. The patch appears to cause the build to
fail.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5227//console

This message is automatically generated.

NM does not need to send finished container whose APP is not running to RM
--

Attachments: YARN-2617.2.patch, YARN-2617.3.patch, YARN-2617.4.patch,
YARN-2617.5.patch, YARN-2617.5.patch, YARN-2617.5.patch, YARN-2617.patch

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2617) NM does not need to send finished container whose APP is not running to RM

2014-10-01 Thread Jun Gong (JIRA)

[
https://issues.apache.org/jira/browse/YARN-2617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14155960#comment-14155960
]

Jun Gong commented on YARN-2617:

It seems that there is something wrong with Jenkins.

From the console output
https://builds.apache.org/job/PreCommit-YARN-Build/5227//console, it seems to
apply a wrong patch.
quote
Going to apply patch with: /usr/bin/patch -p0
patching file
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/security/TestClientToAMTokens.java
quote

NM does not need to send finished container whose APP is not running to RM
--

Attachments: YARN-2617.2.patch, YARN-2617.3.patch, YARN-2617.4.patch,
YARN-2617.5.patch, YARN-2617.5.patch, YARN-2617.5.patch, YARN-2617.6.patch,
YARN-2617.patch

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2617) NM does not need to send finished container whose APP is not running to RM

2014-10-01 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-2617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14155992#comment-14155992
 ] 

Hadoop QA commented on YARN-2617:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12672474/YARN-2617.6.patch
  against trunk revision 9e40de6.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:

  
org.apache.hadoop.yarn.server.nodemanager.TestDirectoryCollection

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/5229//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5229//console

This message is automatically generated.

 NM does not need to send finished container whose APP is not running to RM
 --

 Key: YARN-2617
 URL: https://issues.apache.org/jira/browse/YARN-2617
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.6.0
Reporter: Jun Gong
Assignee: Jun Gong
 Fix For: 2.6.0

 Attachments: YARN-2617.2.patch, YARN-2617.3.patch, YARN-2617.4.patch, 
 YARN-2617.5.patch, YARN-2617.5.patch, YARN-2617.5.patch, YARN-2617.6.patch, 
 YARN-2617.patch


 We([~chenchun]) are testing RM work preserving restart and found the 
 following logs when we ran a simple MapReduce task PI. NM continuously 
 reported completed containers whose Application had already finished while AM 
 had finished. 
 {code}
 2014-09-26 17:00:42,228 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
 Null container completed...
 2014-09-26 17:00:42,228 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
 Null container completed...
 2014-09-26 17:00:43,230 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
 Null container completed...
 2014-09-26 17:00:43,230 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
 Null container completed...
 2014-09-26 17:00:44,233 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
 Null container completed...
 2014-09-26 17:00:44,233 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
 Null container completed...
 {code}
 In the patch for YARN-1372, ApplicationImpl on NM should guarantee to  clean 
 up already completed applications. But it will only remove appId from  
 'app.context.getApplications()' when ApplicaitonImpl received evnet 
 'ApplicationEventType.APPLICATION_LOG_HANDLING_FINISHED' , however NM might 
 receive this event for a long time or could not receive. 
 * For NonAggregatingLogHandler, it wait for 
 YarnConfiguration.NM_LOG_RETAIN_SECONDS which is 3 * 60 * 60 sec by default, 
 then it will be scheduled to delete Application logs and send the event.
 * For LogAggregationService, it might fail(e.g. if user does not have HDFS 
 write permission), and it will not send the event.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2617) NM does not need to send finished container whose APP is not running to RM

2014-09-30 Thread Jian He (JIRA)

[
https://issues.apache.org/jira/browse/YARN-2617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14154128#comment-14154128
]

Jian He commented on YARN-2617:
---

Thanks for updating !
- {{containerStatuses.add(status);}} is moved after this check
{{status.getContainerState() == ContainerState.COMPLETE}}. In some cases(e.g.
NM decommission), I think we still need to send the completeContainers across
so that RM knows this container completes.
- we may not need to change {{getNMContainerStatuses}}, as this method will be
invoked only once on re-register. I’m afraid not sending the whole containers
for recovery will hit some other race conditions.

NM does not need to send finished container whose APP is not running to RM
--

Attachments: YARN-2617.2.patch, YARN-2617.3.patch, YARN-2617.patch

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2617) NM does not need to send finished container whose APP is not running to RM

2014-09-30 Thread Jian He (JIRA)

[
https://issues.apache.org/jira/browse/YARN-2617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14154334#comment-14154334
]

Jian He commented on YARN-2617:
---

bq. send completed container one time even its corresponding Application is
stopped then delete it from context.getContainers()
yep, because if we gracefully decommission a node, we also need to notify RM
that the containers running on this node completes.
btw. once you uploaded a patch, you can click the submit patch, which will
trigger jenkins to run the corresponding unit tests.

NM does not need to send finished container whose APP is not running to RM
--

Attachments: YARN-2617.2.patch, YARN-2617.3.patch, YARN-2617.4.patch,
YARN-2617.patch

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2617) NM does not need to send finished container whose APP is not running to RM

2014-09-30 Thread Jun Gong (JIRA)

[
https://issues.apache.org/jira/browse/YARN-2617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14154337#comment-14154337
]

Jun Gong commented on YARN-2617:

Get it. Thank you!

NM does not need to send finished container whose APP is not running to RM
--

Attachments: YARN-2617.2.patch, YARN-2617.3.patch, YARN-2617.4.patch,
YARN-2617.patch

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2617) NM does not need to send finished container whose APP is not running to RM

2014-09-30 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-2617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14154369#comment-14154369
 ] 

Hadoop QA commented on YARN-2617:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12672239/YARN-2617.4.patch
  against trunk revision 17d1202.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/5193//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5193//console

This message is automatically generated.

 NM does not need to send finished container whose APP is not running to RM
 --

 Key: YARN-2617
 URL: https://issues.apache.org/jira/browse/YARN-2617
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.6.0
Reporter: Jun Gong
Assignee: Jun Gong
 Fix For: 2.6.0

 Attachments: YARN-2617.2.patch, YARN-2617.3.patch, YARN-2617.4.patch, 
 YARN-2617.patch


 We([~chenchun]) are testing RM work preserving restart and found the 
 following logs when we ran a simple MapReduce task PI. NM continuously 
 reported completed containers whose Application had already finished while AM 
 had finished. 
 {code}
 2014-09-26 17:00:42,228 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
 Null container completed...
 2014-09-26 17:00:42,228 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
 Null container completed...
 2014-09-26 17:00:43,230 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
 Null container completed...
 2014-09-26 17:00:43,230 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
 Null container completed...
 2014-09-26 17:00:44,233 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
 Null container completed...
 2014-09-26 17:00:44,233 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
 Null container completed...
 {code}
 In the patch for YARN-1372, ApplicationImpl on NM should guarantee to  clean 
 up already completed applications. But it will only remove appId from  
 'app.context.getApplications()' when ApplicaitonImpl received evnet 
 'ApplicationEventType.APPLICATION_LOG_HANDLING_FINISHED' , however NM might 
 receive this event for a long time or could not receive. 
 * For NonAggregatingLogHandler, it wait for 
 YarnConfiguration.NM_LOG_RETAIN_SECONDS which is 3 * 60 * 60 sec by default, 
 then it will be scheduled to delete Application logs and send the event.
 * For LogAggregationService, it might fail(e.g. if user does not have HDFS 
 write permission), and it will not send the event.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2617) NM does not need to send finished container whose APP is not running to RM

2014-09-29 Thread Jian He (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-2617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14151393#comment-14151393
 ] 

Jian He commented on YARN-2617:
---

[~hex108], thanks for reporting this issue !
- Took a look at the patch. For Apps at New/Init state, we still should not 
prematurely remove the containers from context. I think we should explicitly 
check if apps  are at 
FINISHING_CONTAINERS_WAIT/APPLICATION_RESOURCES_CLEANINGUP/FINISHED state. If 
they are, we are safe to remove the containers from context.
- Also, the following code
{code}
  if (!this.context.getApplications().containsKey(applicationId)) {
context.getContainers().remove(containerId);
continue;
  }
{code}
 needs to be moved inside the following check {{ if 
(containerStatus.getState().equals(ContainerState.COMPLETE))}}, so that if app 
is at one of the states mentioned above, we don't remove containers from the 
context until container reaches COMPLETE state. Does this make sense to you ?
- Could you add unit test for your change too ?

 NM does not need to send finished container whose APP is not running to RM
 --

 Key: YARN-2617
 URL: https://issues.apache.org/jira/browse/YARN-2617
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.6.0
Reporter: Jun Gong
 Fix For: 2.6.0

 Attachments: YARN-2617.patch


 We([~chenchun]) are testing RM work preserving restart and found the 
 following logs when we ran a simple MapReduce task PI. NM continuously 
 reported completed containers whose Application had already finished while AM 
 had finished. 
 {code}
 2014-09-26 17:00:42,228 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
 Null container completed...
 2014-09-26 17:00:42,228 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
 Null container completed...
 2014-09-26 17:00:43,230 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
 Null container completed...
 2014-09-26 17:00:43,230 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
 Null container completed...
 2014-09-26 17:00:44,233 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
 Null container completed...
 2014-09-26 17:00:44,233 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
 Null container completed...
 {code}
 In the patch for YARN-1372, ApplicationImpl on NM should guarantee to  clean 
 up already completed applications. But it will only remove appId from  
 'app.context.getApplications()' when ApplicaitonImpl received evnet 
 'ApplicationEventType.APPLICATION_LOG_HANDLING_FINISHED' , however NM might 
 receive this event for a long time or could not receive. 
 * For NonAggregatingLogHandler, it wait for 
 YarnConfiguration.NM_LOG_RETAIN_SECONDS which is 3 * 60 * 60 sec by default, 
 then it will be scheduled to delete Application logs and send the event.
 * For LogAggregationService, it might fail(e.g. if user does not have HDFS 
 write permission), and it will not send the event.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2617) NM does not need to send finished container whose APP is not running to RM

2014-09-29 Thread Jian He (JIRA)

[
https://issues.apache.org/jira/browse/YARN-2617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14151396#comment-14151396
]

Jian He commented on YARN-2617:
---

I just added you to the contributor list, thanks again for your contribution !

NM does not need to send finished container whose APP is not running to RM
--

Attachments: YARN-2617.patch

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2617) NM does not need to send finished container whose APP is not running to RM

2014-09-29 Thread Jun Gong (JIRA)

[
https://issues.apache.org/jira/browse/YARN-2617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14151466#comment-14151466
]

Jun Gong commented on YARN-2617:

[~jianhe], thank you for the review!

{quote}
I think we should explicitly check if apps are at
FINISHING_CONTAINERS_WAIT/APPLICATION_RESOURCES_CLEANINGUP/FINISHED state.
{quote}
My concern is that we will need to modify the code when we add a new state for
ApplicationImpl. It will be OK if it is not a problem. BTW: is there any case
that APP has containers but APP is not in RUNNING state?

{quote}
The code needs to be moved inside the following check {{ if
(containerStatus.getState().equals(ContainerState.COMPLETE))}} ...
{quote}
OK. I will change it.

And I will add an unit test.

NM does not need to send finished container whose APP is not running to RM
--

Attachments: YARN-2617.patch

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2617) NM does not need to send finished container whose APP is not running to RM

[jira] [Commented] (YARN-2617) NM does not need to send finished container whose APP is not running to RM

[jira] [Commented] (YARN-2617) NM does not need to send finished container whose APP is not running to RM

[jira] [Commented] (YARN-2617) NM does not need to send finished container whose APP is not running to RM

[jira] [Commented] (YARN-2617) NM does not need to send finished container whose APP is not running to RM

[jira] [Commented] (YARN-2617) NM does not need to send finished container whose APP is not running to RM

[jira] [Commented] (YARN-2617) NM does not need to send finished container whose APP is not running to RM

[jira] [Commented] (YARN-2617) NM does not need to send finished container whose APP is not running to RM

[jira] [Commented] (YARN-2617) NM does not need to send finished container whose APP is not running to RM

[jira] [Commented] (YARN-2617) NM does not need to send finished container whose APP is not running to RM

[jira] [Commented] (YARN-2617) NM does not need to send finished container whose APP is not running to RM

[jira] [Commented] (YARN-2617) NM does not need to send finished container whose APP is not running to RM

[jira] [Commented] (YARN-2617) NM does not need to send finished container whose APP is not running to RM

[jira] [Commented] (YARN-2617) NM does not need to send finished container whose APP is not running to RM

[jira] [Commented] (YARN-2617) NM does not need to send finished container whose APP is not running to RM

[jira] [Commented] (YARN-2617) NM does not need to send finished container whose APP is not running to RM

[jira] [Commented] (YARN-2617) NM does not need to send finished container whose APP is not running to RM

[jira] [Commented] (YARN-2617) NM does not need to send finished container whose APP is not running to RM

[jira] [Commented] (YARN-2617) NM does not need to send finished container whose APP is not running to RM

[jira] [Commented] (YARN-2617) NM does not need to send finished container whose APP is not running to RM

[jira] [Commented] (YARN-2617) NM does not need to send finished container whose APP is not running to RM

[jira] [Commented] (YARN-2617) NM does not need to send finished container whose APP is not running to RM

[jira] [Commented] (YARN-2617) NM does not need to send finished container whose APP is not running to RM

[jira] [Commented] (YARN-2617) NM does not need to send finished container whose APP is not running to RM

[jira] [Commented] (YARN-2617) NM does not need to send finished container whose APP is not running to RM

[jira] [Commented] (YARN-2617) NM does not need to send finished container whose APP is not running to RM

26 matches

Site Navigation

Mail list logo

Footer information