[jira] [Commented] (YARN-8706) DelayedProcessKiller is executed for Docker containers even though docker stop sends a KILL signal after the specified grace period

2018-09-14 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16615042#comment-16615042
 ] 

Hudson commented on YARN-8706:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14958 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/14958/])
YARN-8706.  Allow additional flag in docker inspect call. (eyang: 
rev 99237607bf73e97b06eeb3455aa1327bfab4d5d2)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/utils/docker-util.c


> DelayedProcessKiller is executed for Docker containers even though docker 
> stop sends a KILL signal after the specified grace period
> ---
>
> Key: YARN-8706
> URL: https://issues.apache.org/jira/browse/YARN-8706
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Chandni Singh
> Assignee: Chandni Singh
> Priority: Major
> Labels: docker
> Fix For: 3.2.0
>
> Attachments: YARN-8706.001.patch, YARN-8706.002.patch, 
> YARN-8706.003.patch, YARN-8706.004.patch, YARN-8706.addendum.001.patch
>
>
> {{DockerStopCommand}} adds a grace period of 10 seconds.
> 10 seconds is also the default grace period used by {{docker stop}}:
>  [https://docs.docker.com/engine/reference/commandline/stop/]
> From the {{docker stop}} documentation:
> {quote}the main process inside the container will receive {{SIGTERM}}, and 
> after a grace period, {{SIGKILL}}.
> {quote}
> There is a {{DelayedProcessKiller}} in {{ContainerExecutor}} which executes 
> for all containers after a delay when {{sleepDelayBeforeSigKill>0}}. By 
> default this is set to {{250 milliseconds}}, so it always gets executed 
> irrespective of the container type.
>  
> For a docker container, {{docker stop}} takes care of sending a {{SIGKILL}} 
> after the grace period:
> - when sleepDelayBeforeSigKill > 10 seconds, there is no point in 
> executing DelayedProcessKiller
> - when sleepDelayBeforeSigKill < 1 second, the grace period should be 
> the smallest value, which is 1 second, because we force a kill after 
> 250 ms anyway
>  
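The proposal in the description above can be sketched roughly as follows. This is only an illustration of the reasoning, not the code committed for this issue: the class name, both method names, and the choice to round the delay down to whole seconds are assumptions made for the example.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper illustrating the description; not the committed patch. */
class DockerStopGracePeriod {
  /** Smallest grace period mentioned in the description: 1 second. */
  static final int MIN_GRACE_SECONDS = 1;

  /**
   * Maps sleepDelayBeforeSigKill (milliseconds) to a docker stop grace
   * period in whole seconds, never going below the 1 second minimum.
   */
  static int gracePeriodSeconds(long sleepDelayBeforeSigKillMs) {
    long seconds = TimeUnit.MILLISECONDS.toSeconds(sleepDelayBeforeSigKillMs);
    long clamped = Math.max(MIN_GRACE_SECONDS, seconds);
    // Guard against overflow when narrowing to int.
    return (int) Math.min(Integer.MAX_VALUE, clamped);
  }

  /**
   * docker stop already sends SIGKILL after the grace period, so in this
   * sketch the delayed killer is only scheduled for non-docker containers.
   */
  static boolean needsDelayedProcessKiller(boolean isDockerContainer,
      long sleepDelayBeforeSigKillMs) {
    return !isDockerContainer && sleepDelayBeforeSigKillMs > 0;
  }

  public static void main(String[] args) {
    // Default delay of 250 ms is raised to the 1 second minimum.
    System.out.println(gracePeriodSeconds(250));
    // A 10 second delay maps directly to a 10 second grace period.
    System.out.println(gracePeriodSeconds(10_000));
    System.out.println(needsDelayedProcessKiller(true, 250));
  }
}
```

With the default delay of 250 ms the helper yields a 1 second grace period, and a docker container never schedules the second kill.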



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8706) DelayedProcessKiller is executed for Docker containers even though docker stop sends a KILL signal after the specified grace period

2018-09-13 Thread Chandni Singh (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614226#comment-16614226
 ] 

Chandni Singh commented on YARN-8706:
-

[~eyang] Thanks for looking at it. Yes, I should not have replaced it. It was 
just a small change, so I overwrote the patch file. 




[jira] [Commented] (YARN-8706) DelayedProcessKiller is executed for Docker containers even though docker stop sends a KILL signal after the specified grace period

2018-09-13 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614215#comment-16614215
 ] 

Eric Yang commented on YARN-8706:
-

[~csingh] +1 for addendum patch 001.  It would be nice to not replace the patch 
after it has been posted.  It helps reviewers know which version of the patch 
has been reviewed.  I will commit tomorrow if no other issues are found.




[jira] [Commented] (YARN-8706) DelayedProcessKiller is executed for Docker containers even though docker stop sends a KILL signal after the specified grace period

2018-09-07 Thread Chandni Singh (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16607866#comment-16607866
 ] 

Chandni Singh commented on YARN-8706:
-

Thanks [~eyang], [~ebadger], and [~shaneku...@gmail.com] 




[jira] [Commented] (YARN-8706) DelayedProcessKiller is executed for Docker containers even though docker stop sends a KILL signal after the specified grace period

2018-09-07 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16607841#comment-16607841
 ] 

Hudson commented on YARN-8706:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14907 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/14907/])
YARN-8706. Updated docker container stop logic to avoid double kill. 
(eyang: rev bf8a1750e99cfbfa76021ce51b6514c74c06f498)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/runtime/DockerLinuxContainerRuntime.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/runtime/TestDockerContainerRuntime.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/runtime/docker/DockerCommandExecutor.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/runtime/docker/DockerInspectCommand.java





[jira] [Commented] (YARN-8706) DelayedProcessKiller is executed for Docker containers even though docker stop sends a KILL signal after the specified grace period

2018-09-07 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16607809#comment-16607809
 ] 

Eric Yang commented on YARN-8706:
-

+1 for patch 004.  I will commit shortly.




[jira] [Commented] (YARN-8706) DelayedProcessKiller is executed for Docker containers even though docker stop sends a KILL signal after the specified grace period

2018-09-05 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16604930#comment-16604930
 ] 

Hadoop QA commented on YARN-8706:
-

| (/) *+1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 23s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 1m 19s | Maven dependency ordering for branch |
| +1 | mvninstall | 19m 39s | trunk passed |
| +1 | compile | 8m 28s | trunk passed |
| +1 | checkstyle | 1m 30s | trunk passed |
| +1 | mvnsite | 1m 30s | trunk passed |
| +1 | shadedclient | 14m 23s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 28s | trunk passed |
| +1 | javadoc | 1m 3s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 13s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 12s | the patch passed |
| +1 | compile | 8m 7s | the patch passed |
| +1 | javac | 8m 7s | the patch passed |
| +1 | checkstyle | 1m 25s | the patch passed |
| +1 | mvnsite | 1m 21s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 12m 28s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 38s | the patch passed |
| +1 | javadoc | 1m 4s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 0m 49s | hadoop-yarn-api in the patch passed. |
| +1 | unit | 19m 33s | hadoop-yarn-server-nodemanager in the patch passed. |
| +1 | asflicense | 0m 45s | The patch does not generate ASF License warnings. |
| | | 99m 27s | |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | YARN-8706 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12938524/YARN-8706.004.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 6cae489c93dc 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 9af96d4 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC1 |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/21772/testReport/ |
| Max. process+thread count | 302 (vs. ulimit of 1) |
| 

[jira] [Commented] (YARN-8706) DelayedProcessKiller is executed for Docker containers even though docker stop sends a KILL signal after the specified grace period

2018-09-05 Thread Chandni Singh (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16604800#comment-16604800
 ] 

Chandni Singh commented on YARN-8706:
-

I had deprecated {{DockerStopCommand}}, which is why there are more 
deprecation warnings.




[jira] [Commented] (YARN-8706) DelayedProcessKiller is executed for Docker containers even though docker stop sends a KILL signal after the specified grace period

2018-09-05 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16604796#comment-16604796
 ] 

Hadoop QA commented on YARN-8706:
-

| (x) *-1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 23s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 1m 10s | Maven dependency ordering for branch |
| +1 | mvninstall | 29m 47s | trunk passed |
| +1 | compile | 11m 32s | trunk passed |
| +1 | checkstyle | 1m 25s | trunk passed |
| +1 | mvnsite | 1m 31s | trunk passed |
| +1 | shadedclient | 13m 52s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 10s | trunk passed |
| +1 | javadoc | 0m 53s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 12s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 6s | the patch passed |
| +1 | compile | 7m 8s | the patch passed |
| -1 | javac | 7m 8s | hadoop-yarn-project_hadoop-yarn generated 6 new + 108 unchanged - 0 fixed = 114 total (was 108) |
| +1 | checkstyle | 1m 11s | the patch passed |
| +1 | mvnsite | 1m 16s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 10m 26s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 23s | the patch passed |
| +1 | javadoc | 0m 47s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 0m 43s | hadoop-yarn-api in the patch passed. |
| +1 | unit | 18m 32s | hadoop-yarn-server-nodemanager in the patch passed. |
| +1 | asflicense | 0m 31s | The patch does not generate ASF License warnings. |
| | | 105m 51s | |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | YARN-8706 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12938505/YARN-8706.003.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux b6f01d6270ba 4.4.0-133-generic #159-Ubuntu SMP Fri Aug 10 07:31:43 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / e780556 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC1 |
| javac | 

[jira] [Commented] (YARN-8706) DelayedProcessKiller is executed for Docker containers even though docker stop sends a KILL signal after the specified grace period

2018-09-05 Thread Eric Badger (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16604745#comment-16604745
 ] 

Eric Badger commented on YARN-8706:
---

Thanks for the update, [~csingh]. +1 (non-binding) pending Hadoop QA




[jira] [Commented] (YARN-8706) DelayedProcessKiller is executed for Docker containers even though docker stop sends a KILL signal after the specified grace period

2018-09-04 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16603773#comment-16603773
 ] 

Hadoop QA commented on YARN-8706:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
23s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  4m 14s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m  6s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 10s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 57s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 12s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m  2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 29s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  7m 29s{color} | {color:red} hadoop-yarn-project_hadoop-yarn generated 2 new + 108 unchanged - 0 fixed = 110 total (was 108) {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  1m 14s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch generated 1 new + 236 unchanged - 0 fixed = 237 total (was 236) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 17s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 54s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 39s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 18m 45s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 52s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 94m 41s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | YARN-8706 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12938369/YARN-8706.002.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux b631c086fb35 4.4.0-133-generic #159-Ubuntu SMP Fri Aug 10 07:31:43 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 9964e33 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | 

[jira] [Commented] (YARN-8706) DelayedProcessKiller is executed for Docker containers even though docker stop sends a KILL signal after the specified grace period

2018-09-04 Thread Eric Badger (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16603613#comment-16603613
 ] 

Eric Badger commented on YARN-8706:
---

{quote}
Either we implement native code that accepts name of the signal or just use 
docker kill.
I would prefer just going with docker kill at this point. Let me know your 
thoughts?
{quote}
Ah darn. Yeah, I guess it's ok to go with the {{docker kill}} route for now. I 
think it would be good to file a follow-up JIRA to look at removing this call 
in the future, though. I've seen docker get into bad situations with low 
memory, and these extra container-executor invocations exacerbate the problem.

> DelayedProcessKiller is executed for Docker containers even though docker 
> stop sends a KILL signal after the specified grace period
> ---
>
> Key: YARN-8706
> URL: https://issues.apache.org/jira/browse/YARN-8706
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Chandni Singh
>Assignee: Chandni Singh
>Priority: Major
>  Labels: docker
> Attachments: YARN-8706.001.patch
>
>
> {{DockerStopCommand}} adds a grace period of 10 seconds.
> 10 seconds is also the default grace time use by docker stop
>  [https://docs.docker.com/engine/reference/commandline/stop/]
> Documentation of the docker stop:
> {quote}the main process inside the container will receive {{SIGTERM}}, and 
> after a grace period, {{SIGKILL}}.
> {quote}
> There is a {{DelayedProcessKiller}} in {{ContainerExcecutor}} which executes 
> for all containers after a delay when {{sleepDelayBeforeSigKill>0}}. By 
> default this is set to {{250 milliseconds}} and so irrespective of the 
> container type, it will always get executed.
>  
> For a docker container, {{docker stop}} takes care of sending a {{SIGKILL}} 
> after the grace period
> - when sleepDelayBeforeSigKill > 10 seconds, then there is no point of 
> executing DelayedProcessKiller
> - when sleepDelayBeforeSigKill < 1 second, then the grace period should be 
> the smallest value, which is 1 second, because anyways we are forcing kill 
> after 250 ms
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8706) DelayedProcessKiller is executed for Docker containers even though docker stop sends a KILL signal after the specified grace period

2018-09-04 Thread Chandni Singh (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16603480#comment-16603480
 ] 

Chandni Singh commented on YARN-8706:
-

{quote}Additionally, for non-privileged containers, we don't need to call 
docker kill. Instead, we can follow the code in handleContainerKill() and send 
the signal directly. I think this code could probably be combined, since at 
this point handleContainerKill() and handleContainerStop() will be doing the 
same thing. The only difference is that the STOPSIGNAL will be used for the 
stop.
{quote}
[~ebadger], it seems to me that getting the numeric value of the user-specified 
{{STOPSIGNAL}} is not that straightforward.

Currently, we need the number of the signal to send it directly.
 {code}
PrivilegedOperation privOp = new PrivilegedOperation(
    PrivilegedOperation.OperationType.SIGNAL_CONTAINER);
privOp.appendArgs(ctx.getExecutionAttribute(RUN_AS_USER),
    ctx.getExecutionAttribute(USER),
    Integer.toString(PrivilegedOperation.RunAsUserCommand
        .SIGNAL_CONTAINER.getValue()),
    ctx.getExecutionAttribute(PID),
    Integer.toString(ctx.getExecutionAttribute(SIGNAL).getValue()));
{code}
Either we implement native code that accepts the name of the signal, or we just 
use {{docker kill}}.
I would prefer just going with {{docker kill}} at this point. Let me know your 
thoughts.
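
For illustration only, the {{docker kill}} route could be sketched as below. The point is that {{docker kill}} accepts the signal by name through its {{--signal}} option, so no name-to-number translation is needed. {{DockerKillArgs}} and its validation patterns are hypothetical names made up for this sketch; this is not Hadoop's {{DockerKillCommand}}, and the real call would go through container-executor rather than a bare argv.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not Hadoop's DockerKillCommand. */
final class DockerKillArgs {
  // Signal given by name (SIGTERM, TERM) or as an unsigned number.
  private static final Pattern SIGNAL = Pattern.compile("[A-Z0-9]{1,16}");
  // Conservative container-name check (an assumption of this sketch) that
  // keeps shell metacharacters out of the argument list.
  private static final Pattern CONTAINER =
      Pattern.compile("[a-zA-Z0-9][a-zA-Z0-9_.-]*");

  private DockerKillArgs() {
  }

  /** Builds the argv for: docker kill --signal=SIGNAL CONTAINER. */
  static List<String> build(String container, String signal) {
    if (container == null || !CONTAINER.matcher(container).matches()) {
      throw new IllegalArgumentException("bad container name: " + container);
    }
    if (signal == null || !SIGNAL.matcher(signal).matches()) {
      throw new IllegalArgumentException("bad signal: " + signal);
    }
    return Arrays.asList("docker", "kill", "--signal=" + signal, container);
  }
}
```

The trade-off is the one raised elsewhere in this thread: every such call is one more container-executor and docker client invocation.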



[jira] [Commented] (YARN-8706) DelayedProcessKiller is executed for Docker containers even though docker stop sends a KILL signal after the specified grace period

2018-08-31 Thread Chandni Singh (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16599280#comment-16599280
 ] 

Chandni Singh commented on YARN-8706:
-

 {quote}
Additionally, for non-privileged containers, we don't need to call docker kill. 
Instead, we can follow the code in handleContainerKill() and send the signal 
directly. I think this code could probably be combined, since at this point 
handleContainerKill() and handleContainerStop() will be doing the same thing. 
The only difference is that the STOPSIGNAL will be used for the stop.
{quote}
I am facing a challenge with this. With container stop, the stop signal comes 
from the image. We may not have a {{ContainerExecutor.Signal}} enum value 
corresponding to that signal, so we don't know the numeric value to append to 
the privileged operation.
{code}
PrivilegedOperation privOp = new PrivilegedOperation(
    PrivilegedOperation.OperationType.SIGNAL_CONTAINER);
privOp.appendArgs(ctx.getExecutionAttribute(RUN_AS_USER),
    ctx.getExecutionAttribute(USER),
    Integer.toString(PrivilegedOperation.RunAsUserCommand
        .SIGNAL_CONTAINER.getValue()),
    ctx.getExecutionAttribute(PID),
    Integer.toString(ctx.getExecutionAttribute(SIGNAL).getValue()));
{code}
Any ideas?
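
To make the gap concrete, here is a minimal sketch of what a name-to-number lookup might look like on the Java side. {{StopSignalNumbers}} is a hypothetical class, not part of Hadoop, and the table holds only a handful of signals with their usual Linux numbers. Anything outside the table (for example a real-time signal) comes back empty, which is exactly the case where sending the signal directly would not work.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.OptionalInt;

/** Hypothetical helper for illustration; not part of Hadoop. */
final class StopSignalNumbers {
  // Assumption: common Linux numbering. Other platforms may differ.
  private static final Map<String, Integer> LINUX = new HashMap<>();

  static {
    LINUX.put("SIGHUP", 1);
    LINUX.put("SIGINT", 2);
    LINUX.put("SIGQUIT", 3);
    LINUX.put("SIGKILL", 9);
    LINUX.put("SIGTERM", 15);
  }

  private StopSignalNumbers() {
  }

  /**
   * Resolves an image's stop signal, given as a name (with or without
   * the SIG prefix) or as an unsigned number, to a signal number.
   */
  static OptionalInt lookup(String stopSignal) {
    if (stopSignal == null) {
      return OptionalInt.empty();
    }
    String s = stopSignal.trim().toUpperCase(Locale.ROOT);
    if (s.matches("\\d{1,3}")) {
      return OptionalInt.of(Integer.parseInt(s));
    }
    if (!s.startsWith("SIG")) {
      s = "SIG" + s;
    }
    Integer number = LINUX.get(s);
    return number == null ? OptionalInt.empty() : OptionalInt.of(number);
  }
}
```

A caller could fall back to {{docker kill}} whenever the lookup returns empty.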



[jira] [Commented] (YARN-8706) DelayedProcessKiller is executed for Docker containers even though docker stop sends a KILL signal after the specified grace period

2018-08-31 Thread Eric Badger (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16599088#comment-16599088
 ] 

Eric Badger commented on YARN-8706:
---

bq. In DockerLinuxContainerRuntime, reapContainer() calls 
handleContainerRemove() which uses docker inspect to just get the status. I 
don't think it is executing docker inspect multiple times.

Ah sorry, I was just looking at the .patch file, where it shows the first 
{{executeDockerInspect()}} inside of the function {{reapContainer()}}, while it 
is actually called inside {{getIpAndHost()}}. 

bq. Yes, I execute docker inspect to get the status and stopsignal together in 
container stop. So now I have the status as a string which I need to convert to 
DockerContainerStatus. The refactoring was necessary in order to avoid 
duplication of the code.
Makes sense. Avoiding code duplication is a good thing.



[jira] [Commented] (YARN-8706) DelayedProcessKiller is executed for Docker containers even though docker stop sends a KILL signal after the specified grace period

2018-08-31 Thread Chandni Singh (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16598976#comment-16598976
 ] 

Chandni Singh commented on YARN-8706:
-

Thanks for the review, [~ebadger].
{quote}Both reapContainer() and handleContainerStop() call 
executeDockerInspect(). Since this spawns up a new container-executor process 
every time to do the inspect, it would be nice if we didn't have to make the 
call twice.
{quote}
In {{DockerLinuxContainerRuntime}}, {{reapContainer()}} calls 
{{handleContainerRemove()}} which uses docker inspect to just get the status. I 
don't think it is executing docker inspect multiple times.

{{handleContainerStop()}} is only called from {{signalContainer()}}.
{quote}Are the changes in DockerCommandExecutor necessary? They look like code 
refactoring that isn't relevant to this specific patch.
{quote}
Yes, I execute docker inspect to get the status and stopsignal together in 
container stop. So now I have the status as a {{string}} which I need to 
convert to {{DockerContainerStatus}}. The refactoring was necessary in order to 
avoid duplication of the code.
{quote}Instead, we can follow the code in handleContainerKill() and send the 
signal directly. I think this code could probably be combined, since at this 
point handleContainerKill() and handleContainerStop() will be doing the same 
thing.
{quote}
Ok. I will change this.
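
As a rough sketch of fetching the status and the stop signal in a single inspect call: assuming the inspect output is requested as one comma-separated line, the result could be split as shown below. {{InspectOutput}} is a hypothetical class written for this illustration and is not the code in the patch. The format template in the comment and the fallback to SIGTERM (docker's default stop signal) are assumptions of the sketch.

```java
/** Hypothetical parser for illustration; not the code in the patch. */
final class InspectOutput {
  final String status;
  final String stopSignal;

  private InspectOutput(String status, String stopSignal) {
    this.status = status;
    this.stopSignal = stopSignal;
  }

  /**
   * Parses one line such as "running,SIGQUIT", as might be produced by a
   * format template like {{.State.Status}},{{.Config.StopSignal}}
   * (assumed here). An image without a STOPSIGNAL yields an empty second
   * field, so the sketch falls back to SIGTERM.
   */
  static InspectOutput parse(String raw) {
    String line = raw == null ? "" : raw.trim();
    int comma = line.indexOf(',');
    String status = (comma < 0 ? line : line.substring(0, comma)).trim();
    String signal = comma < 0 ? "" : line.substring(comma + 1).trim();
    if (signal.isEmpty()) {
      signal = "SIGTERM";
    }
    return new InspectOutput(status, signal);
  }
}
```

Converting the {{status}} string to {{DockerContainerStatus}} is the step the refactoring mentioned above covers.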



[jira] [Commented] (YARN-8706) DelayedProcessKiller is executed for Docker containers even though docker stop sends a KILL signal after the specified grace period

2018-08-31 Thread Eric Badger (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16598932#comment-16598932
 ] 

Eric Badger commented on YARN-8706:
---

Thanks for the patch, [~csingh]!

Both {{reapContainer()}} and {{handleContainerStop()}} call 
{{executeDockerInspect()}}. Since this spawns up a new container-executor 
process every time to do the inspect, it would be nice if we didn't have to 
make the call twice. 

Are the changes in DockerCommandExecutor necessary? They look like code 
refactoring that isn't relevant to this specific patch.

Additionally, for non-privileged containers, we don't need to call {{docker 
kill}}. Instead, we can follow the code in {{handleContainerKill()}} and send 
the signal directly. I think this code could probably be combined, since at 
this point {{handleContainerKill()}} and {{handleContainerStop()}} will be 
doing the same thing. The only difference is that the STOPSIGNAL will be used 
for the stop.



[jira] [Commented] (YARN-8706) DelayedProcessKiller is executed for Docker containers even though docker stop sends a KILL signal after the specified grace period

2018-08-29 Thread Eric Badger (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596887#comment-16596887
 ] 

Eric Badger commented on YARN-8706:
---

I think the default sleep delay between STOPSIGNAL and SIGKILL is irrelevant to 
this specific JIRA. It's certainly something that we can discuss and maybe it 
is reasonable to increase the value, but I don't think that it has anything to 
do with how we send the signals. That discussion is a separate issue. I am 
still in favor of my original proposal.



[jira] [Commented] (YARN-8706) DelayedProcessKiller is executed for Docker containers even though docker stop sends a KILL signal after the specified grace period

2018-08-29 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596883#comment-16596883
 ] 

Eric Yang commented on YARN-8706:
-

{quote}Why is this specific to docker containers? Other types of containers 
maybe dealing with data and if the default grace period of 250 millis is too 
small, then it can be changed with the config NM_SLEEP_DELAY_BEFORE_SIGKILL_MS. 
Maybe this should be something that the application could specify as well, but 
that is a different discussion.{quote}

YARN containers were mostly stateless and not reused.  The short termination 
wait time can work without causing problems for Hadoop-specific applications.  
With the introduction of Docker containers, it might take several seconds to 
gracefully shut down a database daemon.  A 10-second default seems like a safer 
wait time if the docker container is persisted and reused.  There aren't many 
data points to show that waiting longer is better at this time.  The default 
setting may be revisited later.



[jira] [Commented] (YARN-8706) DelayedProcessKiller is executed for Docker containers even though docker stop sends a KILL signal after the specified grace period

2018-08-29 Thread Chandni Singh (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596637#comment-16596637
 ] 

Chandni Singh commented on YARN-8706:
-

{quote}I am not entirely sure about globally identical killing mechanism for 
all container type, is a sane approach to brute force container shutdown.
{quote}
I am not sure what you mean. NM does a graceful shutdown for all types of 
containers. It first sends a {{SIGTERM}} and then, after a grace period, sends 
{{SIGKILL}}. 
The {{SIGTERM}} for docker is handled by docker stop, which has the following 
problems:
1. The grace period can be specified only in seconds.
2. It bundles {{SIGKILL}} with stop. Docker first sends a {{STOPSIGNAL}} to the 
root process and then, after the grace period, sends {{SIGKILL}} to the root 
process. This is not what NM wants with the stop, and docker stop doesn't give 
any option to NOT send {{SIGKILL}}.
The proposed change by [~ebadger] will just send the {{STOPSIGNAL}}, which 
solves our problem.
{quote}10 seconds default is probably more sensible to give the container a 
chance to shutdown gracefully without causing corruption to data.
{quote}
Why is this specific to docker containers? Other types of containers may be 
dealing with data, and if the default grace period of 250 millis is too small, 
then it can be changed with the config {{NM_SLEEP_DELAY_BEFORE_SIGKILL_MS}}. 
Maybe this should be something that the application could specify as well, but 
that is a different discussion.
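
The two-step shutdown described above (stop signal first, {{SIGKILL}} only if the process is still alive after the delay) can be sketched roughly as follows. {{GracefulStop}} and its {{Target}} interface are hypothetical names invented for this sketch and are not the NM's {{DelayedProcessKiller}}. The sketch only shows the ordering, with the delay kept in milliseconds and owned by the caller rather than by docker stop.

```java
/** Hypothetical sketch of stop-then-kill; not the NM implementation. */
final class GracefulStop {

  /** Something that can receive signals, e.g. a container's root process. */
  interface Target {
    void signal(String name);

    /** Waits up to the given time and reports whether the target exited. */
    boolean awaitExit(long millis);
  }

  private GracefulStop() {
  }

  /**
   * Sends the stop signal, waits for the delay, and sends SIGKILL only if
   * the target has not exited. Returns true if SIGKILL was needed.
   */
  static boolean stop(Target target, String stopSignal,
      long delayBeforeKillMs) {
    target.signal(stopSignal);
    if (target.awaitExit(delayBeforeKillMs)) {
      return false;
    }
    target.signal("SIGKILL");
    return true;
  }
}
```

With this shape, the delay before {{SIGKILL}} stays a single millisecond value for every container type.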



[jira] [Commented] (YARN-8706) DelayedProcessKiller is executed for Docker containers even though docker stop sends a KILL signal after the specified grace period

2018-08-28 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16595851#comment-16595851
 ] 

Eric Yang commented on YARN-8706:
-

{quote}But then we have redundant configs for no reason. And we would be 
favoring the grace period config which only gives the granularity of seconds. 
It also means that we have different treatment of a container getting killed 
between runtimes, when we have the opportunity to maintain the same overall 
design (except for privileged containers){quote}

I agree that the configs are redundant and it is probably preferable to remove 
NM_DOCKER_STOP_GRACE_PERIOD.  I am not entirely sure that a globally identical 
killing mechanism for all container types is a sane approach to brute-force 
container shutdown.  There might be hidden dangers that have not been 
discovered.  A 10-second default is probably more sensible, to give the 
container a chance to shut down gracefully without causing data corruption.



[jira] [Commented] (YARN-8706) DelayedProcessKiller is executed for Docker containers even though docker stop sends a KILL signal after the specified grace period

2018-08-28 Thread Eric Badger (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16595623#comment-16595623
 ] 

Eric Badger commented on YARN-8706:
---

bq. If this is setup properly, code only needs to ensure 
NM_SLEEP_DELAY_BEFORE_SIGKILL_MS is greater than NM_DOCKER_STOP_GRACE_PERIOD to 
prevent the double killing. Thoughts?
But then we have redundant configs for no reason. And we would be favoring the 
grace period config which only gives the granularity of seconds. It also means 
that we have different treatment of a container getting killed between 
runtimes, when we have the opportunity to maintain the same overall design 
(except for privileged containers)
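
To illustrate the granularity point: docker stop takes its grace period in whole seconds, so a millisecond delay has to be rounded before it can be passed along. A minimal sketch, assuming the delay is rounded up and floored at 1 second (the smallest value mentioned in the issue description), might look like this. {{GracePeriods}} is a hypothetical name, and rounding up rather than down is an assumption of the sketch.

```java
/** Hypothetical conversion for illustration; not part of Hadoop. */
final class GracePeriods {
  private GracePeriods() {
  }

  /**
   * Converts a millisecond delay to whole seconds for docker stop,
   * rounding up and never returning less than 1.
   */
  static int toDockerStopSeconds(long sleepDelayMs) {
    if (sleepDelayMs <= 0) {
      return 1;
    }
    // Ceiling division, written to avoid overflow for very large inputs.
    long seconds = sleepDelayMs / 1000 + (sleepDelayMs % 1000 == 0 ? 0 : 1);
    return (int) Math.min(Integer.MAX_VALUE, Math.max(1L, seconds));
  }
}
```

With the default of 250 ms the result is 1 second, four times the configured delay, which is the loss of precision the comment above objects to.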



[jira] [Commented] (YARN-8706) DelayedProcessKiller is executed for Docker containers even though docker stop sends a KILL signal after the specified grace period

2018-08-28 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16595612#comment-16595612
 ] 

Eric Yang commented on YARN-8706:
-

[~csingh] We can arrange for NM_SLEEP_DELAY_BEFORE_SIGKILL_MS to be greater 
than NM_DOCKER_STOP_GRACE_PERIOD.  The docker stop -t flag can honor 
NM_DOCKER_STOP_GRACE_PERIOD, and NM_SLEEP_DELAY_BEFORE_SIGKILL_MS would be 
enforced after NM_DOCKER_STOP_GRACE_PERIOD expires, as a catch-all for lingering 
processes.

If this is set up properly, code only needs to ensure 
NM_SLEEP_DELAY_BEFORE_SIGKILL_MS is greater than NM_DOCKER_STOP_GRACE_PERIOD to 
prevent the double killing.  Thoughts?
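
For illustration only, the ordering check described above could look roughly like the following. This is a minimal sketch, not Hadoop code: the class and method names are hypothetical, and it assumes the SIGKILL delay is given in milliseconds and the docker stop grace period in seconds, as discussed in this issue.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper for illustration; not part of Hadoop. */
final class KillDelayCheck {

  /**
   * Returns true when the NodeManager SIGKILL delay (milliseconds) is strictly
   * longer than the docker stop grace period (seconds), so that the
   * NodeManager's kill can only fire after docker stop has run its course.
   */
  static boolean sigkillDelayExceedsGrace(long sigkillDelayMs, int stopGraceSeconds) {
    return sigkillDelayMs > TimeUnit.SECONDS.toMillis(stopGraceSeconds);
  }

  public static void main(String[] args) {
    // Defaults quoted in this issue: 250 ms delay, 10 s grace period.
    System.out.println(sigkillDelayExceedsGrace(250, 10));    // false
    System.out.println(sigkillDelayExceedsGrace(12_000, 10)); // true
  }
}
```

With the defaults quoted in the issue description the check fails, which is the double-kill situation being discussed.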




[jira] [Commented] (YARN-8706) DelayedProcessKiller is executed for Docker containers even though docker stop sends a KILL signal after the specified grace period

2018-08-28 Thread Chandni Singh (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16595603#comment-16595603
 ] 

Chandni Singh commented on YARN-8706:
-

{quote}
Docker stop already covers sending the custom signal, and also 10 second grace 
period, then SIGKILL. I think it would be safe to skip DelayedProcessKiller for 
docker containers. This seems to go back to the sticking points that we discussed 
in YARN-8206: either we let Docker improve internally, or we rebuild every micro 
operation that docker performs in YARN. Either approach can work, but allowing 
a runtime-specific API to handle this seems better, to prevent the same 
work being done twice.
{quote}
[~eyang] I want to highlight some issues with this approach:
- NM has a setting {{NM_SLEEP_DELAY_BEFORE_SIGKILL_MS}} that promises to kill 
the containers (irrespective of their types) after this delay. This setting is 
in milliseconds, while docker stop only accepts whole seconds as its argument. 
As a result, the grace period cannot exactly match what the user specified with 
{{NM_SLEEP_DELAY_BEFORE_SIGKILL_MS}}. This is assuming that we will deprecate 
{{NM_DOCKER_STOP_GRACE_PERIOD}} as it is redundant.

- From NM's perspective, regardless of the runtime, it needs to kill the 
process after the period specified in millis. I think this is correct because 
we are already in the process of integrating additional runtimes, and NM 
needs to guarantee that the container will definitely be killed and that the 
resources used by the container will therefore be released. 
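
To make the milliseconds-versus-seconds discrepancy concrete, here is a minimal sketch. It assumes the millisecond delay is rounded up to whole seconds with a floor of 1 second; the rounding direction and all names are assumptions made for illustration, not what any patch on this issue does.

```java
/** Hypothetical illustration; not part of Hadoop. */
final class StopGraceSketch {

  /** Converts a delay in milliseconds to whole seconds, rounding up, minimum 1. */
  static int toDockerStopSeconds(long delayMs) {
    long seconds = (delayMs + 999) / 1000; // ceiling division for non-negative input
    return (int) Math.max(1L, Math.min(seconds, (long) Integer.MAX_VALUE));
  }

  /** How much longer than requested the container would be allowed to run. */
  static long overshootMs(long delayMs) {
    return toDockerStopSeconds(delayMs) * 1000L - delayMs;
  }

  public static void main(String[] args) {
    for (long ms : new long[] {250, 1500, 10_000}) {
      System.out.println(ms + " ms -> " + toDockerStopSeconds(ms)
          + " s (overshoot " + overshootMs(ms) + " ms)");
    }
  }
}
```

Under these assumptions the default of 250 ms becomes a 1 second grace period, so the container may live 750 ms longer than the setting promises.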






[jira] [Commented] (YARN-8706) DelayedProcessKiller is executed for Docker containers even though docker stop sends a KILL signal after the specified grace period

2018-08-28 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16595585#comment-16595585
 ] 

Eric Yang commented on YARN-8706:
-

The solution suggested by [~ebadger], discovering STOPSIGNAL and sending it 
manually, is the same as calling docker stop.  I verified that STOPSIGNAL works 
on Docker 1.12.6 and 17.06.0-ce.

Docker stop already covers sending the custom signal, and also 10 second grace 
period, then SIGKILL.  I think it would be safe to skip DelayedProcessKiller for 
docker containers.  This seems to go back to the sticking points that we discussed 
in YARN-8206: either we let Docker improve internally, or we rebuild every micro 
operation that docker performs in YARN.  Either approach can work, but allowing 
a runtime-specific API to handle this seems better, to prevent the same 
work being done twice.




[jira] [Commented] (YARN-8706) DelayedProcessKiller is executed for Docker containers even though docker stop sends a KILL signal after the specified grace period

2018-08-28 Thread Shane Kumpf (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16595562#comment-16595562
 ] 

Shane Kumpf commented on YARN-8706:
---

Seems like a reasonable solution to me. {{docker stop}} has been a pain point, 
so removing that call while still supporting STOPSIGNAL sounds like what we 
want.




[jira] [Commented] (YARN-8706) DelayedProcessKiller is executed for Docker containers even though docker stop sends a KILL signal after the specified grace period

2018-08-28 Thread Eric Badger (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16595535#comment-16595535
 ] 

Eric Badger commented on YARN-8706:
---

bq. I can work on it, if there aren't any concerns?
Sounds good to me. The refactor might be slightly tricky/tedious, but I think 
we will be able to leverage this to make the code a lot leaner while also 
improving the lifecycle




[jira] [Commented] (YARN-8706) DelayedProcessKiller is executed for Docker containers even though docker stop sends a KILL signal after the specified grace period

2018-08-28 Thread Chandni Singh (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16595515#comment-16595515
 ] 

Chandni Singh commented on YARN-8706:
-

Thanks [~shaneku...@gmail.com] and [~ebadger] for the clarification.

I guess then the solution proposed by [~ebadger] is what we need:
{quote}I'm wondering if it's a reasonable solution to do a normal kill in the 
stop case (the SIGTERM case) and just look up the STOPSIGNAL for the 
container using a docker inspect command. We might be able to leverage the 
docker inspect command that gets executed via getContainerStatus, even though 
that would require some refactoring and extra parsing of a generalized inspect. 
But that way we would be able to send a normal kill in both the stop and kill 
cases.
{quote}
I can work on it, if there aren't any concerns?




[jira] [Commented] (YARN-8706) DelayedProcessKiller is executed for Docker containers even though docker stop sends a KILL signal after the specified grace period

2018-08-28 Thread Shane Kumpf (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16595507#comment-16595507
 ] 

Shane Kumpf commented on YARN-8706:
---

Thanks for reporting this, [~csingh]. I know several of us discussed this in 
the past and ran into some sticking points.

As [~ebadger] points out, the reason for using {{docker stop}} is to be able to 
leverage the STOPSIGNAL directive that can be used in Dockerfiles. {{docker 
stop}} will issue the signal defined in the STOPSIGNAL instead of SIGTERM. This 
is important for gracefully stopping databases and even systemd (which expects 
SIGRTMIN+3).




[jira] [Commented] (YARN-8706) DelayedProcessKiller is executed for Docker containers even though docker stop sends a KILL signal after the specified grace period

2018-08-28 Thread Chandni Singh (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16595410#comment-16595410
 ] 

Chandni Singh commented on YARN-8706:
-

{quote}
Really it would be better if we didn't send the kill from docker stop at all. 
{quote}
In this case, instead of docker stop, for the SIGTERM case, we can use 
{{docker kill --signal="SIGTERM"}}.
Thoughts?
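
As a rough sketch of the command shape only: the snippet below assembles a docker kill invocation with an explicit signal. The class, method, and container name are hypothetical, and it does not reflect how the NodeManager actually issues Docker commands (those go through the native container-executor).

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration; not part of Hadoop. */
final class DockerKillArgs {

  /** Builds the argument list for: docker kill --signal=SIGNAL CONTAINER. */
  static List<String> build(String signal, String containerName) {
    return Arrays.asList("docker", "kill", "--signal=" + signal, containerName);
  }

  public static void main(String[] args) {
    // "container_example" is a placeholder container name.
    System.out.println(String.join(" ", build("SIGTERM", "container_example")));
  }
}
```

Unlike docker stop, this sends only the requested signal and leaves any follow-up SIGKILL to the caller.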




[jira] [Commented] (YARN-8706) DelayedProcessKiller is executed for Docker containers even though docker stop sends a KILL signal after the specified grace period

2018-08-28 Thread Eric Badger (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16595170#comment-16595170
 ] 

Eric Badger commented on YARN-8706:
---

Really it would be better if we didn't send the kill from docker stop at all. 
The reason that we're using docker stop at all instead of signaling like we do 
in all other containers (implemented in YARN-8206) is so that we can honor 
STOPSIGNAL if it's set for the docker image. I'm wondering if it's a reasonable 
solution to do a normal kill in the stop case (the SIGTERM case) and just look 
up the STOPSIGNAL for the container using a docker inspect command. We 
might be able to leverage the docker inspect command that gets executed via 
{{getContainerStatus}}, even though that would require some refactoring and 
extra parsing of a generalized inspect. But that way we would be able to send a 
normal kill in both the stop and kill cases. We would still need to do a docker 
kill instead of a regular kill for privileged containers, just like we do for 
the SIGKILL case today.

[~shaneku...@gmail.com], you implemented the docker life cycle changes. Any 
thoughts?
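
The STOPSIGNAL lookup proposed above might be sketched as follows. This only shows parsing the text that a docker inspect call with a Go-template format string would print; the class and method names are hypothetical, and falling back to SIGTERM when nothing is configured is an assumption based on the docker stop behavior quoted in the issue description.

```java
/** Hypothetical illustration; not part of Hadoop. */
final class StopSignalSketch {

  /** Format string selecting the configured stop signal from docker inspect. */
  static final String INSPECT_FORMAT = "{{.Config.StopSignal}}";

  /** Signal docker stop sends when the image defines no STOPSIGNAL. */
  static final String DEFAULT_STOP_SIGNAL = "SIGTERM";

  /** Parses the inspect output, falling back to SIGTERM when it is empty. */
  static String parseStopSignal(String inspectOutput) {
    if (inspectOutput == null) {
      return DEFAULT_STOP_SIGNAL;
    }
    String trimmed = inspectOutput.trim();
    return trimmed.isEmpty() ? DEFAULT_STOP_SIGNAL : trimmed;
  }

  public static void main(String[] args) {
    System.out.println(parseStopSignal("SIGRTMIN+3\n")); // SIGRTMIN+3
    System.out.println(parseStopSignal("\n"));           // SIGTERM
  }
}
```

The resolved signal could then be delivered with a normal kill, leaving the later SIGKILL to the existing delayed kill path.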




[jira] [Commented] (YARN-8706) DelayedProcessKiller is executed for Docker containers even though docker stop sends a KILL signal after the specified grace period

2018-08-27 Thread Chandni Singh (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16594348#comment-16594348
 ] 

Chandni Singh commented on YARN-8706:
-

I can see two ways of addressing this:

Approach 1:
1. Deprecate {{YarnConfiguration.NM_DOCKER_STOP_GRACE_PERIOD}}. 
{{YarnConfiguration.NM_SLEEP_DELAY_BEFORE_SIGKILL_MS}} will trigger the container 
kill after the configured delay in ms. 
2. Nothing else changes. By default, docker stop uses a grace period of 10 
seconds, and even if {{DelayedProcessKiller}} executes after this, it will check 
whether the process is in a stoppable state.

This requires no code change except deprecating 
{{YarnConfiguration.NM_DOCKER_STOP_GRACE_PERIOD}}.


Approach 2: 
1.  Deprecate {{YarnConfiguration.NM_DOCKER_STOP_GRACE_PERIOD}}. 
{{YarnConfiguration.NM_SLEEP_DELAY_BEFORE_SIGKILL_MS}}
2. For Docker Runtime,  rely only on docker stop to calculate grace period in 
seconds from 
{{YarnConfiguration.NM_SLEEP_DELAY_BEFORE_SIGKILL_MS}} 
3. {{DelayedProcessKiller}} is NOT executed for Docker Runtime but executed for 
the other runtimes.

This requires a lot of changes:
1. {{YarnConfiguration.NM_SLEEP_DELAY_BEFORE_SIGKILL_MS}}  needs to be passed 
to {{DockerLinuxContainerRuntime}}
2. {{DelayedProcessKiller}} should be executed for all runtimes except 
{{DockerLinuxContainerRuntime}}


NOTE: {{YarnConfiguration.NM_DOCKER_STOP_GRACE_PERIOD}} should be deprecated in 
both cases
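
The runtime-dependent behavior in Approach 2 can be sketched as a single decision. This is only an illustration under stated assumptions: the enum and method are hypothetical, and the real NodeManager code paths are more involved.

```java
/** Hypothetical illustration; not part of Hadoop. */
final class KillPolicySketch {

  /** Stand-in for the container runtime in use. */
  enum RuntimeKind { DEFAULT, DOCKER }

  /**
   * Approach 2: the delayed kill is scheduled only for non-Docker runtimes,
   * and only when a positive delay is configured. For Docker, docker stop is
   * relied on to send SIGKILL after its own grace period.
   */
  static boolean shouldScheduleDelayedKill(RuntimeKind runtime, long sleepDelayBeforeSigKillMs) {
    return sleepDelayBeforeSigKillMs > 0 && runtime != RuntimeKind.DOCKER;
  }

  public static void main(String[] args) {
    System.out.println(shouldScheduleDelayedKill(RuntimeKind.DEFAULT, 250)); // true
    System.out.println(shouldScheduleDelayedKill(RuntimeKind.DOCKER, 250));  // false
  }
}
```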





[jira] [Commented] (YARN-8706) DelayedProcessKiller is executed for Docker containers even though docker stop sends a KILL signal after the specified grace period

2018-08-24 Thread Chandni Singh (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592013#comment-16592013
 ] 

Chandni Singh commented on YARN-8706:
-

It seems that {{YarnConfiguration.NM_DOCKER_STOP_GRACE_PERIOD}} is not 
required. {{YarnConfiguration.NM_SLEEP_DELAY_BEFORE_SIGKILL_MS}} should be 
sufficient to specify the docker stop grace period. 
