[jira] [Commented] (YARN-4051) ContainerKillEvent is lost when container is In New State and is recovering

2017-03-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15907618#comment-15907618
 ] 

Hadoop QA commented on YARN-4051:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 29s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 21s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 1 new + 153 unchanged - 1 fixed = 154 total (was 154) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 59s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 17s{color} | {color:red} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager generated 1 new + 230 unchanged - 0 fixed = 231 total (was 230) {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 13m 31s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 20s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 34m 57s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.nodemanager.containermanager.TestContainerManagerRecovery |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | YARN-4051 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12857685/YARN-4051.07.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux 0694ad23f830 3.13.0-105-generic #152-Ubuntu SMP Fri Dec 2 15:37:11 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 7992426 |
| Default Java | 1.8.0_121 |
| findbugs | v3.0.0 |
| checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/15243/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt |
| javadoc | https://builds.apache.org/job/PreCommit-YARN-Build/15243/artifact/patchprocess/diff-javadoc-javadoc-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt |
| unit | https://builds.apache.org/job/PreCommit-YARN-Build/15243/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt |
|  Test Results | 

[jira] [Commented] (YARN-4051) ContainerKillEvent is lost when container is In New State and is recovering

2017-03-13 Thread sandflee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15907544#comment-15907544
 ] 

sandflee commented on YARN-4051:


Thanks [~jlowe],  
bq. I'm also wondering about the scenario where the kill event is coming in 
from an AM and not the RM. 
We could simply throw a YarnException when an AM stops a recovering container, 
but it seems NMClientAsyncImpl can't retry stopContainer. Could we fix that in 
a new issue? 
{code}
.addTransition(ContainerState.RUNNING,
    EnumSet.of(ContainerState.DONE, ContainerState.FAILED),
    ContainerEventType.STOP_CONTAINER,
    new StopContainerTransition())
{code}
The patch also makes two other changes:
1. Use app.handle(new ApplicationContainerInitEvent(container)) when recovering 
containers, because there is a race condition: if a finish event arrives before 
the ApplicationContainerInitEvent has been processed, the containers have not 
yet been added to the app.
2. Use a ConcurrentHashMap to store the containers in the app, because I 
encountered a ConcurrentModificationException when iterating 
app.getContainers(), and I also see the web UI and AppLogAggregator using 
app.getContainers() without protection.
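
To illustrate the second change, here is a minimal, self-contained sketch (not the patch itself) of why iterating a plain HashMap while it is being modified can fail, and why a ConcurrentHashMap does not. The class name, method name, container IDs and state strings are all hypothetical, and a single thread is used so that the behaviour is reproducible.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class ContainerMapSketch {
  /**
   * Fills the given (initially empty) map with two entries, then adds a third
   * entry while iterating over the values. Returns true if the iteration
   * failed with a ConcurrentModificationException.
   */
  static boolean failsWhenModifiedDuringIteration(Map<String, String> containers) {
    containers.put("container_1", "RUNNING");
    containers.put("container_2", "RUNNING");
    boolean added = false;
    try {
      for (String state : containers.values()) {
        if (!added) {
          // Structural modification in the middle of the iteration.
          containers.put("container_3", "NEW");
          added = true;
        }
      }
      return false;
    } catch (ConcurrentModificationException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    // HashMap iterators are fail-fast.
    System.out.println(failsWhenModifiedDuringIteration(new HashMap<>()));
    // ConcurrentHashMap iterators are weakly consistent and never throw here.
    System.out.println(failsWhenModifiedDuringIteration(new ConcurrentHashMap<>()));
  }
}
```

The ConcurrentHashMap iteration completes without an exception, although it may or may not observe the entry added during the iteration.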

> ContainerKillEvent is lost when container is In New State and is recovering
>
> Key: YARN-4051
> URL: https://issues.apache.org/jira/browse/YARN-4051
> Project: Hadoop YARN
> Issue Type: Bug
> Components: nodemanager
> Reporter: sandflee
> Assignee: sandflee
> Priority: Critical
> Attachments: YARN-4051.01.patch, YARN-4051.02.patch, 
> YARN-4051.03.patch, YARN-4051.04.patch, YARN-4051.05.patch, 
> YARN-4051.06.patch, YARN-4051.07.patch
>
>
> As in YARN-4050: when the NM event dispatcher is blocked and a container is 
> in the New state, finishing the application leaves the container alive even 
> after the NM event dispatcher is unblocked.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4051) ContainerKillEvent is lost when container is In New State and is recovering

2017-03-08 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15901553#comment-15901553
 ] 

Jason Lowe commented on YARN-4051:
--

Thanks for updating the patch!  In the future, please don't delete patches and 
re-upload them with the same name.  It can lead to very confusing cases where 
Jenkins comments on a patch that happens to have the same name as one of the 
current attachments but isn't actually the patch that was tested.

The following code won't actually cause it to ignore the FINISH_APPS event.  
The {{continue}} in the for loop is degenerate, so all this does is log 
warnings but otherwise is semantically the same logic:
{code}
for (Container container : app.getContainers().values()) {
  if (container.isRecovering()) {
    LOG.warn("drop FINISH_APPS event to " + appID + "because container "
        + container.getContainerId() + "is recovering");
    continue;
  }
}
{code}

Also this shouldn't be a warning since it's not actually wrong when this 
happens, correct?  Similarly the warn log when ignoring the FINISH_CONTAINERS 
event seems like that should just be an info log at best.

I'm also wondering about the scenario where the kill event is coming in from an 
AM and not the RM.  If a container is still in the recovering state when we 
open up the client service for new requests it seems a client (e.g.: AM) could 
come in and ask for a still-recovering container to be killed.  I think the 
container process will be orphaned if that occurs, since the NM will mistakenly 
believe the container has not been launched yet.
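
To make the point about the degenerate {{continue}} concrete, here is a minimal sketch under stated assumptions: FakeContainer and both method names are hypothetical stand-ins, not NodeManager classes. The first method mirrors the shape of the quoted loop; the second shows one way the event could actually be skipped.

```java
import java.util.Arrays;
import java.util.List;

class FinishAppsGuardSketch {
  /** Hypothetical stand-in for the NM container type; only exposes isRecovering(). */
  static final class FakeContainer {
    private final boolean recovering;

    FakeContainer(boolean recovering) {
      this.recovering = recovering;
    }

    boolean isRecovering() {
      return recovering;
    }
  }

  /** Mirrors the quoted loop; returns true if the event would still be processed. */
  static boolean processedWithDegenerateContinue(List<FakeContainer> containers) {
    for (FakeContainer c : containers) {
      if (c.isRecovering()) {
        continue; // last statement in the loop body, so it changes nothing
      }
    }
    return true; // control always reaches the event handling after the loop
  }

  /** Returns false (event dropped) as soon as a recovering container is found. */
  static boolean processedWithEarlyExit(List<FakeContainer> containers) {
    for (FakeContainer c : containers) {
      if (c.isRecovering()) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    List<FakeContainer> containers =
        Arrays.asList(new FakeContainer(false), new FakeContainer(true));
    System.out.println(processedWithDegenerateContinue(containers)); // true
    System.out.println(processedWithEarlyExit(containers));          // false
  }
}
```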


[jira] [Commented] (YARN-4051) ContainerKillEvent is lost when container is In New State and is recovering

2017-03-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900624#comment-15900624
 ] 

Hadoop QA commented on YARN-4051:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 12m 20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 24s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 17s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 2 new + 140 unchanged - 1 fixed = 142 total (was 141) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 13m 2s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 16s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 32m 19s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | YARN-4051 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12856722/YARN-4051.06.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux fde589693e14 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 28daaf0 |
| Default Java | 1.8.0_121 |
| findbugs | v3.0.0 |
| checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/15201/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/15201/testReport/ |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/15201/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.




[jira] [Commented] (YARN-4051) ContainerKillEvent is lost when container is In New State and is recovering

2017-03-07 Thread sandflee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900345#comment-15900345
 ] 

sandflee commented on YARN-4051:


Since the RM will resend FINISH_APPS/FINISH_CONTAINER if the NM reports the 
app/container as running, it seems safe to drop the event if the container is 
recovering, [~jlowe].


[jira] [Commented] (YARN-4051) ContainerKillEvent is lost when container is In New State and is recovering

2015-11-11 Thread sandflee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15001725#comment-15001725
 ] 

sandflee commented on YARN-4051:


Thanks [~jlowe].

bq. Should the value be infinite by default? The concern is that if one container 
has issues recovering (due to log aggregation woes or whatever) then we risk 
expiring all of the containers on this node if we don't re-register with the RM 
within the node expiry interval. I think it makes sense if we have also fixed 
the recovery paths so there aren't potentially long-running procedures (like 
contacting HDFS) during the recovery process. If we haven't then we could 
create as many problems as we're solving by waiting forever.
Agreed! I share this concern.

bq. Why does the patch change the check interval? If it's to reduce the logging 
then we can better fix that by only logging when the status changes rather than 
every iteration.
Yes, it was to reduce the logging. Since recovery is almost always very fast, 
I have changed it back.

bq. Nit: A value of zero should also be treated as a disabled max time.
So zero means the NM registers with the RM at once, whether or not it has 
completed recovery, yes?

bq. Nit: "Max time to wait NM to complete container recover before register to RM " 
should be "Max time NM will wait to complete container recovery before 
registering with the RM".
Corrected.




[jira] [Commented] (YARN-4051) ContainerKillEvent is lost when container is In New State and is recovering

2015-11-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15001783#comment-15001783
 ] 

Hadoop QA commented on YARN-4051:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 6s {color} | {color:blue} docker + precommit patch detected. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 11s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 54s {color} | {color:green} trunk passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 51s {color} | {color:green} trunk passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 28s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 21s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 39s {color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 19s {color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common in trunk has 3 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 29s {color} | {color:green} trunk passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 59s {color} | {color:green} trunk passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 17s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 53s {color} | {color:green} the patch passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 53s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 50s {color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 50s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 27s {color} | {color:red} Patch generated 1 new checkstyle issues in hadoop-yarn-project/hadoop-yarn (total was 265, now 265). {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 22s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 39s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 0s {color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 10s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 40s {color} | {color:green} the patch passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 11s {color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 24s {color} | {color:green} hadoop-yarn-api in the patch passed with JDK v1.8.0_60. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 4s {color} | {color:green} hadoop-yarn-common in the patch passed with JDK v1.8.0_60. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 9m 0s {color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed with JDK v1.8.0_60. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 24s {color} | {color:green} hadoop-yarn-api in the patch passed with JDK v1.7.0_79. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 8s {color} | {color:green} hadoop-yarn-common in the patch passed with JDK v1.7.0_79. 

[jira] [Commented] (YARN-4051) ContainerKillEvent is lost when container is In New State and is recovering

2015-11-11 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15001309#comment-15001309
 ] 

Jason Lowe commented on YARN-4051:
--

Thanks for updating the patch!

Should the value be infinite by default?  The concern is that if one container 
has issues recovering (due to log aggregation woes or whatever) then we risk 
expiring all of the containers on this node if we don't re-register with the RM 
within the node expiry interval.  I think it makes sense if we have also fixed 
the recovery paths so there aren't potentially long-running  procedures (like 
contacting HDFS) during the recovery process.  If we haven't then we could 
create as many problems as we're solving by waiting forever.

Why does the patch change the check interval?  If it's to reduce the logging 
then we can better fix that by only logging when the status changes rather than 
every iteration.

Nit: A value of zero should also be treated as a disabled max time.

Nit: "Max time to wait NM to complete container recover before register to RM " 
should be "Max time NM will wait to complete container recovery before 
registering with the RM".
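
Two of the suggestions above can be sketched as follows. This is only an illustration under stated assumptions: the class, the method names and the log text are hypothetical, and "disabled" is taken here to mean "no upper limit on the wait", which is one reading of the nit about zero.

```java
import java.util.ArrayList;
import java.util.List;

class RecoveryWaitSketch {
  /**
   * A non-positive max wait is treated as "disabled", i.e. no time limit
   * (an assumed interpretation, not confirmed by the patch).
   */
  static boolean hasTimedOut(long elapsedMs, long maxWaitMs) {
    return maxWaitMs > 0 && elapsedMs >= maxWaitMs;
  }

  /**
   * Given the number of still-recovering containers seen at each poll,
   * returns the messages that would be logged when logging only on change.
   */
  static List<String> logOnChange(int[] recoveringPerPoll) {
    List<String> logged = new ArrayList<>();
    int last = -1; // sentinel: nothing logged yet
    for (int recovering : recoveringPerPoll) {
      if (recovering != last) {
        logged.add("Waiting for " + recovering + " container(s) to recover");
        last = recovering;
      }
    }
    return logged;
  }

  public static void main(String[] args) {
    // Six polls, but only three distinct status values in sequence.
    System.out.println(logOnChange(new int[] {3, 3, 3, 2, 2, 0}).size()); // 3
    System.out.println(hasTimedOut(5000, 0));    // false: limit disabled
    System.out.println(hasTimedOut(5000, 1000)); // true
  }
}
```

Logging on change keeps the original check interval while still removing the repeated messages.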


[jira] [Commented] (YARN-4051) ContainerKillEvent is lost when container is In New State and is recovering

2015-11-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14997974#comment-14997974
 ] 

Hadoop QA commented on YARN-4051:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 6s {color} | {color:blue} docker + precommit patch detected. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 12s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 50s {color} | {color:green} trunk passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 47s {color} | {color:green} trunk passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 27s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 37s {color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 17s {color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common in trunk has 3 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 23s {color} | {color:green} trunk passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 45s {color} | {color:green} trunk passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 13s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 46s {color} | {color:green} the patch passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 46s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 46s {color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 46s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 25s {color} | {color:red} Patch generated 1 new checkstyle issues in hadoop-yarn-project/hadoop-yarn (total was 265, now 265). {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 37s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 0s {color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 54s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 18s {color} | {color:green} the patch passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 48s {color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 19s {color} | {color:green} hadoop-yarn-api in the patch passed with JDK v1.8.0_60. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 46s {color} | {color:green} hadoop-yarn-common in the patch passed with JDK v1.8.0_60. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 22m 50s {color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed with JDK v1.8.0_60. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 21s {color} | {color:green} hadoop-yarn-api in the patch passed with JDK v1.7.0_79. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 2s {color} | {color:green} hadoop-yarn-common in the patch passed with JDK v1.7.0_79. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 23m 23s {color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed with JDK v1.7.0_79. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 

[jira] [Commented] (YARN-4051) ContainerKillEvent is lost when container is In New State and is recovering

2015-11-09 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14996687#comment-14996687
 ] 

Jason Lowe commented on YARN-4051:
--

If I understand this correctly, we're saying that the problem described in 
YARN-4050 is holding up the main event dispatcher and the NM is semi-hung, yet 
we want to hurry and register with the ResourceManager before containers have 
recovered?  Seems to me we need to address the problem described in YARN-4050 
if possible (e.g.: skip HDFS operations if we recovered at least one container 
in the running or completed states since we know it must have done HDFS init in 
the previous NM instance).  Otherwise we are hacking around the fact that we 
registered too soon and aren't able to properly handle the out-of-order events. 
 I'd much rather deal with the root cause if possible than patch all the 
separate symptoms.
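
A rough sketch of the check suggested above, assuming a hypothetical RecoveredState enum and helper name (neither is taken from the NodeManager code): if any recovered container is already running or completed, the previous NM instance must have finished its HDFS initialization, so the recovery path could skip it.

```java
import java.util.Arrays;
import java.util.List;

class HdfsInitSkipSketch {
  /** Hypothetical recovered-container states, for illustration only. */
  enum RecoveredState { NEW, RUNNING, COMPLETED }

  /**
   * Returns true if at least one recovered container was running or
   * completed, which implies HDFS init already happened before the restart.
   */
  static boolean canSkipHdfsInit(List<RecoveredState> recovered) {
    for (RecoveredState state : recovered) {
      if (state == RecoveredState.RUNNING || state == RecoveredState.COMPLETED) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    System.out.println(canSkipHdfsInit(
        Arrays.asList(RecoveredState.NEW, RecoveredState.RUNNING))); // true
    System.out.println(canSkipHdfsInit(
        Arrays.asList(RecoveredState.NEW)));                         // false
  }
}
```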


[jira] [Commented] (YARN-4051) ContainerKillEvent is lost when container is In New State and is recovering

2015-11-09 Thread sandflee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14996345#comment-14996345
 ] 

sandflee commented on YARN-4051:


bq. Is it possible for the finish application or complete container requests to 
arrive at this point?
Yes, we see this in YARN-4050. If we register with the RM only after container 
recovery completes, we risk having the containers running on this node killed 
when recovery takes a long time (as in YARN-4050). For long-running services, 
that is not ideal.



[jira] [Commented] (YARN-4051) ContainerKillEvent is lost when container is In New State and is recovering

2015-11-02 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14985299#comment-14985299
 ] 

Jason Lowe commented on YARN-4051:
--

bq. For RM finish application or complete container request, let RM retry, 
seems a little complicated, should we do that?
Is it possible for the finish application or complete container requests to 
arrive at this point?  We should not be registering with the RM until we've 
completed the container recovery process.  As such, it should be impossible to 
be told by the RM these things as we should not even be talking to it at that 
point.  Similarly, I believe the cleanest fix for the stop container request 
race is to avoid opening the client port until all the containers have 
recovered.  I know there's some issue there where we need to know the bind 
address of the client port during recovery but don't want to start listening on 
the port yet.  If the RPC layer supported that, it'd be a lot cleaner to simply 
not "open the front doors" while we're still coming up and recovering -- then 
all these races simply aren't possible.
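
For illustration only, a minimal sketch of the ordering described above: recover every container first, and only then open the client port and register with the RM. The class `StartupOrderSketch` and its step names are hypothetical, and the sketch ignores the bind-address issue mentioned above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical startup sequence; step names are illustrative only.
class StartupOrderSketch {
  private final List<String> steps = new ArrayList<>();

  // Recovers all containers before opening the client port and registering
  // with the RM, so no external request can arrive in the middle of recovery.
  List<String> start(List<String> containerIds) {
    for (String id : containerIds) {
      steps.add("recover:" + id);
    }
    steps.add("open-client-port");
    steps.add("register-with-rm");
    return Collections.unmodifiableList(steps);
  }
}
```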



[jira] [Commented] (YARN-4051) ContainerKillEvent is lost when container is In New State and is recovering

2015-11-01 Thread sandflee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14984580#comment-14984580
 ] 

sandflee commented on YARN-4051:


Thanks Jason, and sorry for only just noticing your reply.

It's more reasonable to let others retry until the NM has recovered its 
containers.
1. For an AM stopContainer request, we could simply handle it like 
startContainers.
2. For RM finish application or complete container request, let RM retry, 
seems a little complicated, should we do that?



[jira] [Commented] (YARN-4051) ContainerKillEvent is lost when container is In New State and is recovering

2015-10-02 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14941233#comment-14941233
 ] 

Jason Lowe commented on YARN-4051:
--

Thanks for the patch!  Sorry for the delay, as I missed this when it was 
originally filed.

I'm lukewarm on an event buffering approach since we have to track the buffered 
event and remember to propagate it at all the appropriate times, which is a 
maintenance burden.  Would it be simpler to prevent kill requests from arriving 
before we're done recovering containers?  We could send a similar "try again" 
response as we do for container start requests while still recovering, and we 
can postpone finish application processing until after containers are 
recovered.

However we decide to fix this, there should be a unit test to cover the 
scenario.
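
For illustration only, a minimal sketch of the "try again" idea, not the actual NodeManager code: stop requests are rejected with a retriable error until recovery is marked complete. The names `RetryLaterException`, `RecoveryGate`, and `stopContainer` are hypothetical here and only show the control flow.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical exception telling the caller to retry later.
class RetryLaterException extends RuntimeException {
  RetryLaterException(String message) {
    super(message);
  }
}

// Hypothetical gate that rejects stop requests until recovery is done.
class RecoveryGate {
  private final AtomicBoolean recovering = new AtomicBoolean(true);

  // Called once all containers have been recovered.
  void recoveryCompleted() {
    recovering.set(false);
  }

  // Returns the container id when the request is accepted; throws
  // RetryLaterException while containers are still being recovered.
  String stopContainer(String containerId) {
    if (recovering.get()) {
      throw new RetryLaterException(
          "Still recovering containers, retry stop of " + containerId);
    }
    return containerId;
  }
}
```

A unit test for the scenario could issue a stop before `recoveryCompleted()`, expect the retriable error, and then verify the same request succeeds afterwards.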



[jira] [Commented] (YARN-4051) ContainerKillEvent is lost when container is In New State and is recovering

2015-08-25 Thread sandflee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14710716#comment-14710716
 ] 

sandflee commented on YARN-4051:


Could anyone help review it?



[jira] [Commented] (YARN-4051) ContainerKillEvent is lost when container is In New State and is recovering

2015-08-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14699814#comment-14699814
 ] 

Hadoop QA commented on YARN-4051:
-

\\
\\
| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  16m 26s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 57s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc |   9m 59s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 37s | There were no new checkstyle issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install |   1m 21s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 32s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 13s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |   6m 20s | Tests passed in hadoop-yarn-server-nodemanager. |
| | |  44m 54s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12750841/YARN-4051.03.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 13604bd |
| hadoop-yarn-server-nodemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8865/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8865/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8865/console |


This message was automatically generated.

 ContainerKillEvent is lost when container is  In New State and is recovering
 

 Key: YARN-4051
 URL: https://issues.apache.org/jira/browse/YARN-4051
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: sandflee
Assignee: sandflee
Priority: Critical
 Attachments: YARN-4051.01.patch, YARN-4051.02.patch, 
 YARN-4051.03.patch


 As in YARN-4050, NM event dispatcher is blocked, and container is in New 
 state, when we finish application, the container still alive even after NM 
 event dispatcher is unblocked.





[jira] [Commented] (YARN-4051) ContainerKillEvent is lost when container is In New State and is recovering

2015-08-17 Thread sandflee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14699745#comment-14699745
 ] 

sandflee commented on YARN-4051:


If the container is recovered as REQUESTED, try to clean up the container's 
resources and go to the Done state.
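
For illustration only, a minimal sketch of that handling, assuming the kill arrived while the container was still in the New state and was remembered until recovery. The types `KillSketchState`, `KillSketchStatus`, and `RecoveringContainerSketch` are hypothetical simplifications, not the real NodeManager container state machine or the attached patch.

```java
// Hypothetical, simplified container states for this sketch.
enum KillSketchState { NEW, RUNNING, DONE }

// Hypothetical statuses a container can be recovered with.
enum KillSketchStatus { REQUESTED, LAUNCHED, COMPLETED }

class RecoveringContainerSketch {
  private KillSketchState state = KillSketchState.NEW;
  private boolean killRequested = false;
  private boolean resourcesCleaned = false;

  // A kill that arrives in NEW is remembered instead of being dropped.
  void kill() {
    if (state == KillSketchState.NEW) {
      killRequested = true;
    } else if (state == KillSketchState.RUNNING) {
      cleanup();
      state = KillSketchState.DONE;
    }
  }

  // Applies the recovered status, honoring a kill seen before recovery.
  void recover(KillSketchStatus status) {
    if (killRequested || status == KillSketchStatus.COMPLETED) {
      cleanup();
      state = KillSketchState.DONE;
    } else if (status == KillSketchStatus.LAUNCHED) {
      state = KillSketchState.RUNNING;
    }
    // REQUESTED without a pending kill stays NEW in this sketch.
  }

  private void cleanup() {
    resourcesCleaned = true;
  }

  KillSketchState getState() {
    return state;
  }

  boolean isResourcesCleaned() {
    return resourcesCleaned;
  }
}
```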




[jira] [Commented] (YARN-4051) ContainerKillEvent is lost when container is In New State and is recovering

2015-08-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14698213#comment-14698213
 ] 

Hadoop QA commented on YARN-4051:
-

\\
\\
| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  15m 56s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 40s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc |   9m 36s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 35s | There were no new checkstyle issues. |
| {color:green}+1{color} | whitespace |   0m  1s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install |   1m 22s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 32s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 14s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |   6m 16s | Tests passed in hadoop-yarn-server-nodemanager. |
| | |  43m 38s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12750647/YARN-4051.02.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 8dfec7a |
| hadoop-yarn-server-nodemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8849/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8849/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8849/console |



 ContainerKillEvent is lost when container is  In New State and is recovering
 

 Key: YARN-4051
 URL: https://issues.apache.org/jira/browse/YARN-4051
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: sandflee
Assignee: sandflee
Priority: Critical
 Attachments: YARN-4051.01.patch, YARN-4051.02.patch


 As in YARN-4050, NM event dispatcher is blocked, and container is in New 
 state, when we finish application, the container still alive even after NM 
 event dispatcher is unblocked.





[jira] [Commented] (YARN-4051) ContainerKillEvent is lost when container is In New State and is recovering

2015-08-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14695564#comment-14695564
 ] 

Hadoop QA commented on YARN-4051:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  16m 11s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 44s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc |   9m 48s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   0m 38s | The applied patch generated  5 new checkstyle issues (total was 96, now 101). |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install |   1m 18s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 14s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |   6m  7s | Tests passed in hadoop-yarn-server-nodemanager. |
| | |  43m 58s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12750297/YARN-4051.01.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 53bef9c |
| checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/8842/artifact/patchprocess/diffcheckstylehadoop-yarn-server-nodemanager.txt |
| hadoop-yarn-server-nodemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8842/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8842/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf907.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8842/console |



 ContainerKillEvent is lost when container is  In New State and is recovering
 

 Key: YARN-4051
 URL: https://issues.apache.org/jira/browse/YARN-4051
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: sandflee
Assignee: sandflee
 Attachments: YARN-4051.01.patch


 As in YARN-4050, NM event dispatcher is blocked, and container is in New 
 state, when we finish application, the container still alive even after NM 
 event dispatcher is unblocked.


