[jira] [Commented] (YARN-7914) Fix exit code handling for short lived Docker containers

2018-02-12 Thread Shane Kumpf (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361567#comment-16361567
 ] 

Shane Kumpf commented on YARN-7914:
---

Thanks [~jlowe] for the commit and review. Thanks [~billie.rinaldi] for the 
additional confirmation.

> Fix exit code handling for short lived Docker containers
> 
>
> Key: YARN-7914
> URL: https://issues.apache.org/jira/browse/YARN-7914
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Shane Kumpf
>Assignee: Shane Kumpf
>Priority: Critical
> Fix For: 3.1.0
>
> Attachments: YARN-7914.001.patch
>
>
> Currently, if container-executor (c-e) is unable to obtain the PID for a 
> short lived Docker container, the exit code will not be properly obtained 
> via {{docker inspect}}. This results in containers completing successfully 
> when they should fail.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7914) Fix exit code handling for short lived Docker containers

2018-02-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361528#comment-16361528
 ] 

Hudson commented on YARN-7914:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13648 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/13648/])
YARN-7914. Fix exit code handling for short lived Docker containers. (jlowe: 
rev 5a1db60ab1e8b28cd73367c69970513de88cf4dd)
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.c





[jira] [Commented] (YARN-7914) Fix exit code handling for short lived Docker containers

2018-02-12 Thread Billie Rinaldi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16360985#comment-16360985
 ] 

Billie Rinaldi commented on YARN-7914:
--

+1, this patch solves the issue.

> Fix exit code handling for short lived Docker containers
> 
>
> Key: YARN-7914
> URL: https://issues.apache.org/jira/browse/YARN-7914
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Shane Kumpf
>Assignee: Shane Kumpf
>Priority: Critical
> Attachments: YARN-7914.001.patch
>
>
> Currently, if container-executor (c-e) is unable to obtain the PID for a 
> short lived Docker container, the exit code will not be properly obtained 
> via {{docker inspect}}. This results in containers completing successfully 
> when they should fail.






[jira] [Commented] (YARN-7914) Fix exit code handling for short lived Docker containers

2018-02-12 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16360938#comment-16360938
 ] 

Jason Lowe commented on YARN-7914:
--

Thanks for the report and patch, Shane!

+1 lgtm.  I'll commit this later today if there are no objections.




[jira] [Commented] (YARN-7914) Fix exit code handling for short lived Docker containers

2018-02-12 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16360515#comment-16360515
 ] 

Wangda Tan commented on YARN-7914:
--

This looks like a critical issue to me, so I am boosting its priority to Critical.




[jira] [Commented] (YARN-7914) Fix exit code handling for short lived Docker containers

2018-02-09 Thread Shane Kumpf (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16359102#comment-16359102
 ] 

Shane Kumpf commented on YARN-7914:
---

The unit test failure is unrelated; I opened YARN-7917 to address it.

> Fix exit code handling for short lived Docker containers
> 
>
> Key: YARN-7914
> URL: https://issues.apache.org/jira/browse/YARN-7914
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Shane Kumpf
>Assignee: Shane Kumpf
>Priority: Major
> Attachments: YARN-7914.001.patch
>
>
> Currently, if container-executor (c-e) is unable to obtain the PID for a 
> short lived Docker container, the exit code will not be properly obtained 
> via {{docker inspect}}. This results in containers completing successfully 
> when they should fail.






[jira] [Commented] (YARN-7914) Fix exit code handling for short lived Docker containers

2018-02-09 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16359061#comment-16359061
 ] 

genericqa commented on YARN-7914:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 14m 23s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 29m  6s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  0m 47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 17s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 19m 10s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 25s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 76m 33s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.nodemanager.containermanager.linux.runtime.TestDockerContainerRuntime |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | YARN-7914 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12909974/YARN-7914.001.patch |
| Optional Tests |  asflicense  compile  cc  mvnsite  javac  unit  |
| uname | Linux 4040e207e1e0 3.13.0-135-generic #184-Ubuntu SMP Wed Oct 18 11:55:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 543f3ab |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| unit | https://builds.apache.org/job/PreCommit-YARN-Build/19652/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt |
|  Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/19652/testReport/ |
| Max. process+thread count | 340 (vs. ulimit of 5500) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/19652/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Fix exit code handling for short lived Docker containers
> 
>
> Key: YARN-7914
> URL: https://issues.apache.org/jira/browse/YARN-7914
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Shane Kumpf
>Assignee: Shane Kumpf
>   

[jira] [Commented] (YARN-7914) Fix exit code handling for short lived Docker containers

2018-02-09 Thread Shane Kumpf (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16358862#comment-16358862
 ] 

Shane Kumpf commented on YARN-7914:
---

Attaching a patch to fix the issue. The exit code handling logic incorrectly 
depended on obtaining a valid PID for the container. Even when a PID isn't 
available, we still need to get the exit code for the container to properly 
fail cases where the launch_command is invalid or simply calls exit with a 
non-zero value.
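
To illustrate the control-flow problem described in this comment, here is a minimal Python sketch. It is not the patch itself: the actual change is in the C source container-executor.c and is not reproduced here. The function names, the EXIT_CODE_UNKNOWN sentinel, and the idea of passing the inspect output in as a string are all assumptions made for illustration. The only external detail relied on is that a command such as `docker inspect --format '{{.State.ExitCode}}' <container>` prints the container's exit code.

```python
from typing import Optional

# Hypothetical sentinel: the exit code could not be determined.
EXIT_CODE_UNKNOWN = -1


def parse_exit_code(inspect_output: Optional[str]) -> int:
    """Parse text as printed by
    docker inspect --format '{{.State.ExitCode}}' <container>.

    Returns the exit code (0-255), or EXIT_CODE_UNKNOWN for bad input.
    """
    if inspect_output is None:
        return EXIT_CODE_UNKNOWN
    text = inspect_output.strip()
    if not (text.isascii() and text.isdigit()):
        return EXIT_CODE_UNKNOWN
    value = int(text)
    return value if value <= 255 else EXIT_CODE_UNKNOWN


def result_pid_gated(pid: int, inspect_output: Optional[str]) -> int:
    """Problematic shape: the exit code is consulted only if a PID was found.

    A container that exits before its PID can be read falls through
    with the default of 0 and is reported as successful.
    """
    result = 0
    if pid > 0:
        result = parse_exit_code(inspect_output)
    return result


def result_always_inspect(pid: int, inspect_output: Optional[str]) -> int:
    """Corrected shape: the exit code is consulted whether or not a PID exists."""
    return parse_exit_code(inspect_output)
```

With no PID (represented here as 0) and an inspect output of "1", the PID-gated version returns 0 and hides the failure, while the second version returns 1. An unparseable output yields the non-zero sentinel, so the container is treated as failed rather than silently succeeding.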




[jira] [Commented] (YARN-7914) Fix exit code handling for short lived Docker containers

2018-02-09 Thread Shane Kumpf (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16358774#comment-16358774
 ] 

Shane Kumpf commented on YARN-7914:
---

I know the issue here and will put up a patch soon.

> Fix exit code handling for short lived Docker containers
> 
>
> Key: YARN-7914
> URL: https://issues.apache.org/jira/browse/YARN-7914
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Shane Kumpf
>Priority: Major
>
> Currently, if container-executor (c-e) is unable to obtain the PID for a 
> short lived Docker container, the exit code will not be properly obtained 
> via {{docker inspect}}. This results in containers completing successfully 
> when they should fail.


