[jira] [Commented] (YARN-7848) Force removal of docker containers that do not get removed on first try

2019-04-15 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16818377#comment-16818377
 ] 

Hudson commented on YARN-7848:
--

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #16412 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/16412/])
YARN-7848 Force removal of docker containers that do not get removed on 
(ebadger: rev 5583e1b6fcfd5651857ada7ed851f09fc19969bc)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/test/utils/test_docker_util.cc
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/utils/docker-util.c
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.c


> Force removal of docker containers that do not get removed on first try
> ---
>
> Key: YARN-7848
> URL: https://issues.apache.org/jira/browse/YARN-7848
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Badger
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-7848.001.patch, YARN-7848.002.patch, 
> YARN-7848.003.patch, YARN-7848.004.patch
>
>
> After the addition of YARN-5366, containers will get removed after a certain 
> debug delay. However, this is a one-time effort. If the removal fails for 
> whatever reason, the container will persist. We need to add a mechanism for a 
> forced removal of those containers.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7848) Force removal of docker containers that do not get removed on first try

2019-04-11 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16815709#comment-16815709
 ] 

Eric Yang commented on YARN-7848:
-

[~Jim_Brennan] Thank you for tips.  [~ebadger] can you commit the patch, if it 
looks good to you?  Thanks

> Force removal of docker containers that do not get removed on first try
> ---
>
> Key: YARN-7848
> URL: https://issues.apache.org/jira/browse/YARN-7848
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Badger
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-7848.001.patch, YARN-7848.002.patch, 
> YARN-7848.003.patch, YARN-7848.004.patch
>
>
> After the addition of YARN-5366, containers will get removed after a certain 
> debug delay. However, this is a one-time effort. If the removal fails for 
> whatever reason, the container will persist. We need to add a mechanism for a 
> forced removal of those containers.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7848) Force removal of docker containers that do not get removed on first try

2019-04-11 Thread Jim Brennan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16815434#comment-16815434
 ] 

Jim Brennan commented on YARN-7848:
---

Thanks for updating [~eyang]!  lgtm.   I am +1 on patch 004 (non-binding). 

> Force removal of docker containers that do not get removed on first try
> ---
>
> Key: YARN-7848
> URL: https://issues.apache.org/jira/browse/YARN-7848
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Badger
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-7848.001.patch, YARN-7848.002.patch, 
> YARN-7848.003.patch, YARN-7848.004.patch
>
>
> After the addition of YARN-5366, containers will get removed after a certain 
> debug delay. However, this is a one-time effort. If the removal fails for 
> whatever reason, the container will persist. We need to add a mechanism for a 
> forced removal of those containers.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7848) Force removal of docker containers that do not get removed on first try

2019-04-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16814928#comment-16814928
 ] 

Hadoop QA commented on YARN-7848:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
30m 17s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 1s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 53s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 20m 
35s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
28s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 66m 13s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-7848 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12965521/YARN-7848.004.patch |
| Optional Tests |  dupname  asflicense  compile  cc  mvnsite  javac  unit  |
| uname | Linux f2c5d1dd9dc6 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 
17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 8740755 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/23933/testReport/ |
| Max. process+thread count | 412 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/23933/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Force removal of docker containers that do not get removed on first try
> ---
>
> Key: YARN-7848
> URL: https://issues.apache.org/jira/browse/YARN-7848
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Badger
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-7848.001.patch, YARN-7848.002.patch, 
> YARN-7848.003.patch, YARN-7848.004.patch
>
>
> After the addition of YARN-5366, containers will get removed after a certain 
> debug delay. However, this is a one-time effort. If the removal fails for 
> whatever reason, the container will persist. We need to add a mechanism for a 
> 

[jira] [Commented] (YARN-7848) Force removal of docker containers that do not get removed on first try

2019-04-10 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16814900#comment-16814900
 ] 

Eric Yang commented on YARN-7848:
-

[~Jim_Brennan] Thanks for the feedback.  Patch 004 clean up the code according 
to your comments.

> Force removal of docker containers that do not get removed on first try
> ---
>
> Key: YARN-7848
> URL: https://issues.apache.org/jira/browse/YARN-7848
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Badger
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-7848.001.patch, YARN-7848.002.patch, 
> YARN-7848.003.patch, YARN-7848.004.patch
>
>
> After the addition of YARN-5366, containers will get removed after a certain 
> debug delay. However, this is a one-time effort. If the removal fails for 
> whatever reason, the container will persist. We need to add a mechanism for a 
> forced removal of those containers.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7848) Force removal of docker containers that do not get removed on first try

2019-04-10 Thread Jim Brennan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16814627#comment-16814627
 ] 

Jim Brennan commented on YARN-7848:
---

Thanks [~eyang] for the update!

Some comments:
 In get_docker_rm_command(), we are not setting the return code if the 
add_to_args() for {{"-f"}} fails., and it seems like it would be cleaner to 
structure it as:
{noformat}
ret = add_to_args(args, DOCKER_RM_COMMAND);
if (ret != 0) {
  ret = BUFFER_TOO_SMALL;
  goto free_and_exit;
}
ret = add_to_args(args, "-f");
if (ret != 0) {
  ret = BUFFER_TOO_SMALL;
  goto free_and_exit;
}
ret = add_to_args(args, container_name);
if (ret != 0) {
  ret = BUFFER_TOO_SMALL;
  goto free_and_exit;
}
{noformat}

 (nit) In remove_docker_container(), if you wanted to minimize the changes, you 
could have kept {{start_index}} and just set {{args[1] = argv[start_index];}} 
after the if.

In remove_docker_container(), I don't think it is appropriate to use 
free_values(args).  The values are on the stack, not the heap.  You do want to 
do a free(args) to free the array of pointers you allocated.
I think you need to do this free in both the child and the parent.



> Force removal of docker containers that do not get removed on first try
> ---
>
> Key: YARN-7848
> URL: https://issues.apache.org/jira/browse/YARN-7848
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Badger
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-7848.001.patch, YARN-7848.002.patch, 
> YARN-7848.003.patch
>
>
> After the addition of YARN-5366, containers will get removed after a certain 
> debug delay. However, this is a one-time effort. If the removal fails for 
> whatever reason, the container will persist. We need to add a mechanism for a 
> forced removal of those containers.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7848) Force removal of docker containers that do not get removed on first try

2019-04-09 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16813627#comment-16813627
 ] 

Eric Yang commented on YARN-7848:
-

[~ebadger] [~Jim_Brennan] The failed unit test is not related to patch 003.  
Please review.  thanks

> Force removal of docker containers that do not get removed on first try
> ---
>
> Key: YARN-7848
> URL: https://issues.apache.org/jira/browse/YARN-7848
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Badger
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-7848.001.patch, YARN-7848.002.patch, 
> YARN-7848.003.patch
>
>
> After the addition of YARN-5366, containers will get removed after a certain 
> debug delay. However, this is a one-time effort. If the removal fails for 
> whatever reason, the container will persist. We need to add a mechanism for a 
> forced removal of those containers.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7848) Force removal of docker containers that do not get removed on first try

2019-04-08 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16812972#comment-16812972
 ] 

Hadoop QA commented on YARN-7848:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
 9s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
30m  4s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  1m  
4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m  6s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 21m 19s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
24s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 67m  1s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.nodemanager.amrmproxy.TestFederationInterceptor |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-7848 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12965253/YARN-7848.003.patch |
| Optional Tests |  dupname  asflicense  compile  cc  mvnsite  javac  unit  |
| uname | Linux c22e58a01bfe 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 
17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 69e3745 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/23917/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/23917/testReport/ |
| Max. process+thread count | 447 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/23917/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Force removal of docker containers that do not get removed on first try
> ---
>
> Key: YARN-7848
> URL: https://issues.apache.org/jira/browse/YARN-7848
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Badger
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-7848.001.patch, YARN-7848.002.patch, 
> 

[jira] [Commented] (YARN-7848) Force removal of docker containers that do not get removed on first try

2019-04-08 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16812834#comment-16812834
 ] 

Eric Yang commented on YARN-7848:
-

[~ebadger] [~Jim_Brennan] Thanks for the extra review to make sure that we 
didn't code this into a no-op.  Patch 003 adds "-f" to both 
remove_docker_container and get_docker_rm_command.

> Force removal of docker containers that do not get removed on first try
> ---
>
> Key: YARN-7848
> URL: https://issues.apache.org/jira/browse/YARN-7848
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Badger
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-7848.001.patch, YARN-7848.002.patch, 
> YARN-7848.003.patch
>
>
> After the addition of YARN-5366, containers will get removed after a certain 
> debug delay. However, this is a one-time effort. If the removal fails for 
> whatever reason, the container will persist. We need to add a mechanism for a 
> forced removal of those containers.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7848) Force removal of docker containers that do not get removed on first try

2019-04-08 Thread Eric Badger (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16812755#comment-16812755
 ] 

Eric Badger commented on YARN-7848:
---

Looked at the code with [~Jim_Brennan] and he's right. {{DockerRmCommand}} 
calls the container-executor with {{--remove-docker-container}}. That codepath 
goes through {{remove_docker_container}} in container-executor.c. To be 
consistent with the use of {{-f}} for docker rm, I believe we should change it 
in both places in the C code (i.e. in {{remove_docker_container}} and 
{{get_docker_rm_command}}).

> Force removal of docker containers that do not get removed on first try
> ---
>
> Key: YARN-7848
> URL: https://issues.apache.org/jira/browse/YARN-7848
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Badger
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-7848.001.patch, YARN-7848.002.patch
>
>
> After the addition of YARN-5366, containers will get removed after a certain 
> debug delay. However, this is a one-time effort. If the removal fails for 
> whatever reason, the container will persist. We need to add a mechanism for a 
> forced removal of those containers.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7848) Force removal of docker containers that do not get removed on first try

2019-04-08 Thread Jim Brennan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16812744#comment-16812744
 ] 

Jim Brennan commented on YARN-7848:
---

[~eyang], [~ebadger], it looks to me like the code in patch 002 will not do 
what we want.  We are adding the "force", "true" arguments to the 
DockerRmCommand() and handling them in the --run-docker c-code.   But 
DockerRmCommand() is actually passing the --remove_docker_container command, 
which is not modified.  "force", "true" will be passed in to the docker rm 
command, and it won't know what to do with them.

 

> Force removal of docker containers that do not get removed on first try
> ---
>
> Key: YARN-7848
> URL: https://issues.apache.org/jira/browse/YARN-7848
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Badger
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-7848.001.patch, YARN-7848.002.patch
>
>
> After the addition of YARN-5366, containers will get removed after a certain 
> debug delay. However, this is a one-time effort. If the removal fails for 
> whatever reason, the container will persist. We need to add a mechanism for a 
> forced removal of those containers.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7848) Force removal of docker containers that do not get removed on first try

2019-04-08 Thread Eric Badger (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16812730#comment-16812730
 ] 

Eric Badger commented on YARN-7848:
---

bq. Changing this can have dicey outcome and not backward compatible. I am 
likely to have to use cmd file serialization structure to pass the force option 
from Java to container-executor for safety reasons.

Yea that makes sense. I thought we were already using the .cmd file with 
remove-docker-container. Ok, given that reasoning I'm alright keeping the Java 
code as it is and making the code changes all in the container-executor. 

> Force removal of docker containers that do not get removed on first try
> ---
>
> Key: YARN-7848
> URL: https://issues.apache.org/jira/browse/YARN-7848
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Badger
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-7848.001.patch, YARN-7848.002.patch
>
>
> After the addition of YARN-5366, containers will get removed after a certain 
> debug delay. However, this is a one-time effort. If the removal fails for 
> whatever reason, the container will persist. We need to add a mechanism for a 
> forced removal of those containers.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7848) Force removal of docker containers that do not get removed on first try

2019-04-08 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16812722#comment-16812722
 ] 

Eric Yang commented on YARN-7848:
-

{quote}Thanks for the update, Eric Yang! On the java side, it looks like 
DockerRmCommand is invoked in 2 different places. Could we maintain the 
original constructor with no force and then add an additional constructor with 
a force parameter? We could then change the 2 invocations to use the new 
constructor with setting force to true.{quote}

I can preserve existing constructor without force.  Passing optional parameter 
to container-executor for removing container is slightly dicey at this point.  
The accepting parameters are:

```container-executor --remove-docker-container [hierarchy] ```

What we want is:

```container-executor --remove-docker-container [hierarchy] [force] 
```

Changing this can have dicey outcome and not backward compatible.  I am likely 
to have to use cmd file serialization structure to pass the force option from 
Java to container-executor for safety reasons.

```container-executor --run-docker ```

This will leave a lot of debris and dead code that others may pick up and use 
it incorrectly.  I am not sure passing this flag from Java is good for long 
term maintenance.  Keeping it simple is probably better.  Thoughts [~ebadger] 
[~Jim_Brennan] [~shaneku...@gmail.com] [~billie.rinaldi]?

> Force removal of docker containers that do not get removed on first try
> ---
>
> Key: YARN-7848
> URL: https://issues.apache.org/jira/browse/YARN-7848
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Badger
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-7848.001.patch, YARN-7848.002.patch
>
>
> After the addition of YARN-5366, containers will get removed after a certain 
> debug delay. However, this is a one-time effort. If the removal fails for 
> whatever reason, the container will persist. We need to add a mechanism for a 
> forced removal of those containers.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7848) Force removal of docker containers that do not get removed on first try

2019-04-08 Thread Eric Badger (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16812528#comment-16812528
 ] 

Eric Badger commented on YARN-7848:
---

Thanks for the update, [~eyang]! On the java side, it looks like 
{{DockerRmCommand}} is invoked in 2 different places. Could we maintain the 
original constructor with no force and then add an additional constructor with 
a {{force}} parameter? We could then change the 2 invocations to use the new 
constructor with setting {{force}} to {{true}}. 

> Force removal of docker containers that do not get removed on first try
> ---
>
> Key: YARN-7848
> URL: https://issues.apache.org/jira/browse/YARN-7848
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Badger
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-7848.001.patch, YARN-7848.002.patch
>
>
> After the addition of YARN-5366, containers will get removed after a certain 
> debug delay. However, this is a one-time effort. If the removal fails for 
> whatever reason, the container will persist. We need to add a mechanism for a 
> forced removal of those containers.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7848) Force removal of docker containers that do not get removed on first try

2019-04-05 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16811375#comment-16811375
 ] 

Hadoop QA commented on YARN-7848:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
14s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 24s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
56s{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 in trunk has 2 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
23s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 33s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 20m 34s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
20s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 68m  4s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.nodemanager.containermanager.linux.runtime.docker.TestDockerRmCommand
 |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-7848 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12965038/YARN-7848.002.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  cc  |
| uname | Linux 317b10b0565e 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 
17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / e9b859f |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| findbugs | v3.1.0-RC1 |
| findbugs | 
https://builds.apache.org/job/PreCommit-YARN-Build/23902/artifact/out/branch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager-warnings.html
 |
| unit | 

[jira] [Commented] (YARN-7848) Force removal of docker containers that do not get removed on first try

2019-04-05 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16811188#comment-16811188
 ] 

Eric Yang commented on YARN-7848:
-

Patch 002 added the force option to cmd file, and invoke from Java side.

> Force removal of docker containers that do not get removed on first try
> ---
>
> Key: YARN-7848
> URL: https://issues.apache.org/jira/browse/YARN-7848
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Badger
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-7848.001.patch, YARN-7848.002.patch
>
>
> After the addition of YARN-5366, containers will get removed after a certain 
> debug delay. However, this is a one-time effort. If the removal fails for 
> whatever reason, the container will persist. We need to add a mechanism for a 
> forced removal of those containers.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7848) Force removal of docker containers that do not get removed on first try

2019-04-05 Thread Eric Badger (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16810951#comment-16810951
 ] 

Eric Badger commented on YARN-7848:
---

We talked about the use of {{-f}} in our most recent docker meeting and all 
came to agreement that this was appropriate for all removals. [~eyang]'s 
explanation above does a good job summarizing why. 

I do have one ask for the patch though. The current patch changes 
{{get_docker_rm_command}} to always give it the {{-f}} flag. I think it would 
be beneficial to have this as an option passed into the function so that we 
don't accidentally reuse this function in the future and unknowingly use {{rm 
-f}} when we aren't expecting to



> Force removal of docker containers that do not get removed on first try
> ---
>
> Key: YARN-7848
> URL: https://issues.apache.org/jira/browse/YARN-7848
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Badger
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-7848.001.patch
>
>
> After the addition of YARN-5366, containers will get removed after a certain 
> debug delay. However, this is a one-time effort. If the removal fails for 
> whatever reason, the container will persist. We need to add a mechanism for a 
> forced removal of those containers.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7848) Force removal of docker containers that do not get removed on first try

2019-03-27 Thread Jim Brennan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16802833#comment-16802833
 ] 

Jim Brennan commented on YARN-7848:
---

I think I agree with [~eyang] on the use of {{-f}}.  By the time we are trying 
to remove the container, we have already tried to kill the process and stop the 
container, so I don't think there is any danger in using the -f option, and it 
may succeed in cases where it otherwise doesn't now.  I can't think of anything 
bad that would happen by using the force option every time in our use cases.

 

> Force removal of docker containers that do not get removed on first try
> ---
>
> Key: YARN-7848
> URL: https://issues.apache.org/jira/browse/YARN-7848
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Badger
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-7848.001.patch
>
>
> After the addition of YARN-5366, containers will get removed after a certain 
> debug delay. However, this is a one-time effort. If the removal fails for 
> whatever reason, the container will persist. We need to add a mechanism for a 
> forced removal of those containers.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7848) Force removal of docker containers that do not get removed on first try

2019-03-26 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16802178#comment-16802178
 ] 

Eric Yang commented on YARN-7848:
-

[~ebadger] Docker rm will use stop signal then remove the container workspace 
and metadata.  Docker rm -f will use SIGKILL then remove the container 
workspace and metadata.  In YARN ContainerCleanup, it will send stop signal if 
pid exists.  If not, it will run docker kill, then follow by docker rm.  
Therefore, the steps leading to docker rm is identical to docker stop, docker 
rm -f.  There will be no process running after docker kill is ran.  The 
remaining task is to remove metadata from docker database.  Therefore, making 
distinction between first docker rm and docker rm -f become unnecessary because 
YARN already performed stop, and docker kill operations.

> Force removal of docker containers that do not get removed on first try
> ---
>
> Key: YARN-7848
> URL: https://issues.apache.org/jira/browse/YARN-7848
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Badger
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-7848.001.patch
>
>
> After the addition of YARN-5366, containers will get removed after a certain 
> debug delay. However, this is a one-time effort. If the removal fails for 
> whatever reason, the container will persist. We need to add a mechanism for a 
> forced removal of those containers.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7848) Force removal of docker containers that do not get removed on first try

2019-03-26 Thread Eric Badger (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16801943#comment-16801943
 ] 

Eric Badger commented on YARN-7848:
---

Thanks for the patch, [~eyang]!

Correct me if I'm wrong, but the patch looks like it does a force removal on 
every removal. The idea of this JIRA is to only do the force removal if the 
first removal fails. 

> Force removal of docker containers that do not get removed on first try
> ---
>
> Key: YARN-7848
> URL: https://issues.apache.org/jira/browse/YARN-7848
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Badger
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-7848.001.patch
>
>
> After the addition of YARN-5366, containers will get removed after a certain 
> debug delay. However, this is a one-time effort. If the removal fails for 
> whatever reason, the container will persist. We need to add a mechanism for a 
> forced removal of those containers.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7848) Force removal of docker containers that do not get removed on first try

2019-03-26 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16801892#comment-16801892
 ] 

Eric Yang commented on YARN-7848:
-

The failed unit test is not related to this patch.

> Force removal of docker containers that do not get removed on first try
> ---
>
> Key: YARN-7848
> URL: https://issues.apache.org/jira/browse/YARN-7848
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Badger
>Assignee: Zhaohui Xin
>Priority: Major
>  Labels: Docker
> Attachments: YARN-7848.001.patch
>
>
> After the addition of YARN-5366, containers will get removed after a certain 
> debug delay. However, this is a one-time effort. If the removal fails for 
> whatever reason, the container will persist. We need to add a mechanism for a 
> forced removal of those containers.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7848) Force removal of docker containers that do not get removed on first try

2019-03-25 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16801270#comment-16801270
 ] 

Hadoop QA commented on YARN-7848:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
31m  6s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 59s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 20m 49s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
26s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 69m 41s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.nodemanager.amrmproxy.TestFederationInterceptor |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-7848 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12963667/YARN-7848.001.patch |
| Optional Tests |  dupname  asflicense  compile  cc  mvnsite  javac  unit  |
| uname | Linux c48ac68a88d7 4.4.0-138-generic #164~14.04.1-Ubuntu SMP Fri Oct 
5 08:56:16 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 3f6d6d2 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/23807/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/23807/testReport/ |
| Max. process+thread count | 311 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/23807/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Force removal of docker containers that do not get removed on first try
> ---
>
> Key: YARN-7848
> URL: https://issues.apache.org/jira/browse/YARN-7848
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Badger
>Assignee: Zhaohui Xin
>Priority: Major
>  Labels: Docker
> Attachments: YARN-7848.001.patch
>
>
> After 

[jira] [Commented] (YARN-7848) Force removal of docker containers that do not get removed on first try

2019-03-25 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16801229#comment-16801229
 ] 

Eric Yang commented on YARN-7848:
-

Back track to original discussion that if container stuck in Created state, 
using -f flag is good enough to remove faulty container.  The background thread 
may keep system cleaner, but not everyone is comfortable with that idea at this 
time.  Therefore, focus on improvement to the current needs.

> Force removal of docker containers that do not get removed on first try
> ---
>
> Key: YARN-7848
> URL: https://issues.apache.org/jira/browse/YARN-7848
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Badger
>Assignee: Zhaohui Xin
>Priority: Major
>  Labels: Docker
> Attachments: YARN-7848.001.patch
>
>
> After the addition of YARN-5366, containers will get removed after a certain 
> debug delay. However, this is a one-time effort. If the removal fails for 
> whatever reason, the container will persist. We need to add a mechanism for a 
> forced removal of those containers.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7848) Force removal of docker containers that do not get removed on first try

2019-03-21 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16798427#comment-16798427
 ] 

Eric Yang commented on YARN-7848:
-

[~ebadger] {quote}Additionally, how would this work with debug-delay? If I want 
my image to stick around for awhile (or indefinitely) so that I can debug them, 
how will that co-exist with this periodic pruning?{quote}

I don't have good answers for system admin allowing image dump to stick around 
forever.  The clean up thread is optional and configurable.  The scheduling can 
be based on debug-delay to ensure image is being kept for debug delay window.  
This will only delete containers stuck in Created/Exited states after passing 
debug-delay window.





> Force removal of docker containers that do not get removed on first try
> ---
>
> Key: YARN-7848
> URL: https://issues.apache.org/jira/browse/YARN-7848
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Badger
>Assignee: Zhaohui Xin
>Priority: Major
>  Labels: Docker
>
> After the addition of YARN-5366, containers will get removed after a certain 
> debug delay. However, this is a one-time effort. If the removal fails for 
> whatever reason, the container will persist. We need to add a mechanism for a 
> forced removal of those containers.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7848) Force removal of docker containers that do not get removed on first try

2019-03-21 Thread Eric Badger (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16798387#comment-16798387
 ] 

Eric Badger commented on YARN-7848:
---

I don't like this idea. This is the Nodemanager completely taking over docker. 
The NM should use docker, but I don't think it should be assumed that it is the 
only thing that can use docker. To me, this seems like something that should be 
handled at an ops level in a cron job, if they want to make sure all images are 
gone. I'm ok with having a more generalized case of pruning images 
periodically, but I'm not comfortable with the NM pruning containers that it 
didn't start.

Additionally, how would this work with debug-delay? If I want my image to stick 
around for awhile (or indefinitely) so that I can debug them, how will that 
co-exist with this periodic pruning? 

cc [~shaneku...@gmail.com]

> Force removal of docker containers that do not get removed on first try
> ---
>
> Key: YARN-7848
> URL: https://issues.apache.org/jira/browse/YARN-7848
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Badger
>Assignee: Zhaohui Xin
>Priority: Major
>  Labels: Docker
>
> After the addition of YARN-5366, containers will get removed after a certain 
> debug delay. However, this is a one-time effort. If the removal fails for 
> whatever reason, the container will persist. We need to add a mechanism for a 
> forced removal of those containers.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7848) Force removal of docker containers that do not get removed on first try

2019-03-20 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16797677#comment-16797677
 ] 

Eric Yang commented on YARN-7848:
-

There are many scenarios that containers may not create properly or shutdown 
properly due to problems with the images themselves.  Application-lifecycles 
doesn't cover all abnormal cases.  It looks like we need a background thread to 
run:

{code}
docker container prune --filter="until=15m" -a
{code}

Every 10 minutes.  This will ensure the dangling containers have max life time 
no older than 15 minutes.  

> Force removal of docker containers that do not get removed on first try
> ---
>
> Key: YARN-7848
> URL: https://issues.apache.org/jira/browse/YARN-7848
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Badger
>Assignee: Zhaohui Xin
>Priority: Major
>  Labels: Docker
>
> After the addition of YARN-5366, containers will get removed after a certain 
> debug delay. However, this is a one-time effort. If the removal fails for 
> whatever reason, the container will persist. We need to add a mechanism for a 
> forced removal of those containers.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7848) Force removal of docker containers that do not get removed on first try

2019-01-24 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751820#comment-16751820
 ] 

Eric Yang commented on YARN-7848:
-

[~ebadger] YARN-9074 does not force removal of container.  Hence, this is still 
valid issue to be addressed.

> Force removal of docker containers that do not get removed on first try
> ---
>
> Key: YARN-7848
> URL: https://issues.apache.org/jira/browse/YARN-7848
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Badger
>Assignee: Zhaohui Xin
>Priority: Major
>  Labels: Docker
>
> After the addition of YARN-5366, containers will get removed after a certain 
> debug delay. However, this is a one-time effort. If the removal fails for 
> whatever reason, the container will persist. We need to add a mechanism for a 
> forced removal of those containers.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7848) Force removal of docker containers that do not get removed on first try

2019-01-23 Thread Eric Badger (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16750499#comment-16750499
 ] 

Eric Badger commented on YARN-7848:
---

Does YARN-9074 force the removal of the containers? The problem initially was 
that docker would sometimes get into a weird state where containers couldn't be 
deleted unless the {{-f}} flag was given to {{docker rm}}. So the original 
intention of this JIRA was to allow for a few retries without forcing removal 
and then doing the force if necessary

> Force removal of docker containers that do not get removed on first try
> ---
>
> Key: YARN-7848
> URL: https://issues.apache.org/jira/browse/YARN-7848
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Badger
>Assignee: Zhaohui Xin
>Priority: Major
>  Labels: Docker
>
> After the addition of YARN-5366, containers will get removed after a certain 
> debug delay. However, this is a one-time effort. If the removal fails for 
> whatever reason, the container will persist. We need to add a mechanism for a 
> forced removal of those containers.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7848) Force removal of docker containers that do not get removed on first try

2019-01-23 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16750385#comment-16750385
 ] 

Eric Yang commented on YARN-7848:
-

Do we still need this JIRA if YARN-9074 moved the docker container deletion 
into ContainerCleanup?

> Force removal of docker containers that do not get removed on first try
> ---
>
> Key: YARN-7848
> URL: https://issues.apache.org/jira/browse/YARN-7848
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Badger
>Assignee: Zhaohui Xin
>Priority: Major
>  Labels: Docker
>
> After the addition of YARN-5366, containers will get removed after a certain 
> debug delay. However, this is a one-time effort. If the removal fails for 
> whatever reason, the container will persist. We need to add a mechanism for a 
> forced removal of those containers.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org