[jira] [Commented] (YARN-3080) The DockerContainerExecutor could not write the right pid to container pidFile

2015-03-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14357402#comment-14357402
 ] 

Hadoop QA commented on YARN-3080:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12703979/YARN-3080.patch
  against trunk revision 30c428a.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/6921//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6921//console

This message is automatically generated.

 The DockerContainerExecutor could not write the right pid to container pidFile
 --

 Key: YARN-3080
 URL: https://issues.apache.org/jira/browse/YARN-3080
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.6.0
Reporter: Beckham007
Assignee: Abin Shahab
 Attachments: YARN-3080.patch, YARN-3080.patch, YARN-3080.patch, 
 YARN-3080.patch, YARN-3080.patch


 The docker_container_executor_session.sh is like this:
 {quote}
 #!/usr/bin/env bash
 echo `/usr/bin/docker inspect --format {{.State.Pid}} container_1421723685222_0008_01_02` > /data/nm_restart/hadoop-2.4.1/data/yarn/local/nmPrivate/application_1421723685222_0008/container_1421723685222_0008_01_02/container_1421723685222_0008_01_02.pid.tmp
 /bin/mv -f /data/nm_restart/hadoop-2.4.1/data/yarn/local/nmPrivate/application_1421723685222_0008/container_1421723685222_0008_01_02/container_1421723685222_0008_01_02.pid.tmp /data/nm_restart/hadoop-2.4.1/data/yarn/local/nmPrivate/application_1421723685222_0008/container_1421723685222_0008_01_02/container_1421723685222_0008_01_02.pid
 /usr/bin/docker run --rm --name container_1421723685222_0008_01_02 -e GAIA_HOST_IP=c162 -e GAIA_API_SERVER=10.6.207.226:8080 -e GAIA_CLUSTER_ID=shpc-nm_restart -e GAIA_QUEUE=root.tdwadmin -e GAIA_APP_NAME=test_nm_docker -e GAIA_INSTANCE_ID=1 -e GAIA_CONTAINER_ID=container_1421723685222_0008_01_02 --memory=32M --cpu-shares=1024 -v /data/nm_restart/hadoop-2.4.1/data/yarn/container-logs/application_1421723685222_0008/container_1421723685222_0008_01_02:/data/nm_restart/hadoop-2.4.1/data/yarn/container-logs/application_1421723685222_0008/container_1421723685222_0008_01_02 -v /data/nm_restart/hadoop-2.4.1/data/yarn/local/usercache/tdwadmin/appcache/application_1421723685222_0008/container_1421723685222_0008_01_02:/data/nm_restart/hadoop-2.4.1/data/yarn/local/usercache/tdwadmin/appcache/application_1421723685222_0008/container_1421723685222_0008_01_02 -P -e A=B --privileged=true docker.oa.com:8080/library/centos7 bash /data/nm_restart/hadoop-2.4.1/data/yarn/local/usercache/tdwadmin/appcache/application_1421723685222_0008/container_1421723685222_0008_01_02/launch_container.sh
 {quote}
 The DockerContainerExecutor runs docker inspect before docker run, so 
 docker inspect cannot get the right pid for the container. As a result, 
 signalContainer() and NM restart fail.
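
 The ordering problem described above could be avoided by recording the pid 
 only after the container has started. The following is a minimal sketch of 
 that idea, not the YARN-3080 patch itself. The DOCKER variable and the helper 
 names wait_for_container_pid and write_pid_file are hypothetical, introduced 
 only for illustration. It relies on docker inspect reporting a pid of 0 for a 
 container that is not running.

```shell
#!/usr/bin/env bash
# Sketch only: not the actual fix. DOCKER, wait_for_container_pid and
# write_pid_file are hypothetical names used for illustration.
DOCKER="${DOCKER:-/usr/bin/docker}"

# Poll `docker inspect` until the named container reports a non-zero pid.
# $1 = container name, $2 = maximum number of attempts (default 30).
wait_for_container_pid() {
  local name="$1" tries="${2:-30}" pid
  while [ "$tries" -gt 0 ]; do
    pid="$("$DOCKER" inspect --format '{{.State.Pid}}' "$name" 2>/dev/null)"
    if [ -n "$pid" ] && [ "$pid" != "0" ]; then
      echo "$pid"
      return 0
    fi
    tries=$((tries - 1))
    sleep 1
  done
  return 1
}

# Write the pid to a temporary file, then rename it into place so that
# a reader never sees a partially written pid file.
write_pid_file() {
  local pid="$1" pid_file="$2"
  echo "$pid" > "$pid_file.tmp" && mv -f "$pid_file.tmp" "$pid_file"
}
```

 A launcher built on this sketch would start docker run first (in the 
 background or detached), then call wait_for_container_pid with the container 
 name and pass the result to write_pid_file.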



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3080) The DockerContainerExecutor could not write the right pid to container pidFile

2015-03-10 Thread Abin Shahab (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356323#comment-14356323
 ] 

Abin Shahab commented on YARN-3080:
---

[~vinodkv], do you think you can review this? I have several other patches 
that depend on this one.
Thanks!


[jira] [Commented] (YARN-3080) The DockerContainerExecutor could not write the right pid to container pidFile

2015-03-06 Thread Abin Shahab (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350529#comment-14350529
 ] 

Abin Shahab commented on YARN-3080:
---

[~sidharta-s] Thanks for looking into this.
[~vvasudev], Do you have additional review comments?


[jira] [Commented] (YARN-3080) The DockerContainerExecutor could not write the right pid to container pidFile

2015-03-05 Thread Sidharta Seethana (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14349507#comment-14349507
 ] 

Sidharta Seethana commented on YARN-3080:
-

[~vvasudev] ,

[~ashahab] is right - docker run ( *irrespective of whether we run in detached 
mode or not* ) talks to the docker daemon which in turn sets up the appropriate 
environment before launching the process. For better or worse, the 'contained' 
process is always a child of the docker daemon and has no relation (from a 
process tree perspective) to the process that executes 'docker run'.  
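
The claim about the process tree can be checked by walking the chain of parent 
pids. The sketch below is illustrative only: parent_of and is_descendant_of are 
hypothetical helper names, and it assumes a Linux host where /proc/<pid>/status 
contains a "PPid:" line.

```shell
#!/usr/bin/env bash
# Illustrative sketch; parent_of and is_descendant_of are hypothetical helpers.
# Assumes Linux, where /proc/<pid>/status contains a "PPid:" line.

# Print the parent pid of process $1 (prints nothing if it does not exist).
parent_of() {
  local key value
  [ -r "/proc/$1/status" ] || return 0
  while read -r key value; do
    if [ "$key" = "PPid:" ]; then
      echo "$value"
      return 0
    fi
  done < "/proc/$1/status"
}

# Succeed if process $2 appears among the ancestors of process $1.
is_descendant_of() {
  local pid="$1" ancestor="$2"
  while [ -n "$pid" ] && [ "$pid" -gt 1 ]; do
    pid="$(parent_of "$pid")"
    if [ "$pid" = "$ancestor" ]; then
      return 0
    fi
  done
  return 1
}
```

Given the pid printed by docker inspect --format '{{.State.Pid}}' for a 
container and the pid of the shell that ran docker run, is_descendant_of would 
be expected to fail if the description above holds, because the contained 
process hangs off the docker daemon instead.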


[jira] [Commented] (YARN-3080) The DockerContainerExecutor could not write the right pid to container pidFile

2015-03-02 Thread Beckham007 (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14342995#comment-14342995
 ] 

Beckham007 commented on YARN-3080:
--

We can use the YARN containerId as the docker container name, so that 
docker kill <YARN containerId> will correctly kill the docker container.
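
A minimal sketch of this suggestion follows, assuming the container was started 
with --name set to the YARN container id (as in the docker run command quoted 
in the issue description). The DOCKER variable and the signal_container 
function are hypothetical names used only for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: DOCKER and signal_container are hypothetical names.
DOCKER="${DOCKER:-/usr/bin/docker}"

# Send a signal to the container named after the YARN container id.
# $1 = YARN container id (used as the docker container name)
# $2 = signal name (defaults to KILL, which is also docker kill's default)
signal_container() {
  local container_id="$1" signal="${2:-KILL}"
  "$DOCKER" kill --signal="$signal" "$container_id"
}
```

For example, signal_container container_1421723685222_0008_01_02 TERM would 
ask the docker daemon to deliver SIGTERM to that container without needing a 
pid file at all.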



[jira] [Commented] (YARN-3080) The DockerContainerExecutor could not write the right pid to container pidFile

2015-03-02 Thread Varun Vasudev (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14343006#comment-14343006
 ] 

Varun Vasudev commented on YARN-3080:
-

[~beckham007] - we don't run kill -9 $PID, we run kill -9 -$PID. In your 
example, you should have run kill -9 -7188.
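
The difference between the two forms is that a negative argument addresses a 
whole process group rather than a single process. A small sketch, with 
kill_process_group as a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Sketch only: kill_process_group is a hypothetical helper name.

# Send SIGKILL to every process in the process group whose id is $1.
# The leading "-" turns the argument from a pid into a process group id;
# `kill -9 "$1"` (no dash) would signal only that single process.
kill_process_group() {
  kill -9 "-$1"
}
```

With the pid from the example, kill_process_group 7188 runs kill -9 -7188. 
Elsewhere in this thread Abin Shahab reports that this form did not stop the 
container in his test, since the contained process is a child of the docker 
daemon.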


[jira] [Commented] (YARN-3080) The DockerContainerExecutor could not write the right pid to container pidFile

2015-03-02 Thread Beckham007 (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14343000#comment-14343000
 ] 

Beckham007 commented on YARN-3080:
--

[~vvasudev], kill -9 does not work in this situation.
{quote}
[gaia@c112 ~]$ ps -ef|grep 7188 
gaia  7188 16807  0 14:46 ?        00:00:00 bash 
/data/gaia/data/yarn/local/usercache/gaia/appcache/application_1424999012322_0960/container_1424999012322_0960_01_02/docker_container_executor.sh
gaia  7190  7188  0 14:46 ?        00:00:00 /usr/bin/docker run --rm --name 
container_1424999012322_0960_01_02 -e GAIA_HOST_IP=10.149.27.112 -e 
GAIA_API_SERVER=http://shpc-test.api.oa.com/api -e GAIA_CLUSTER_ID=shpc-test -e 
GAIA_QUEUE=root.gaia -e GAIA_APP_NAME=dev_gaia -e GAIA_INSTANCE_ID=1 -e 
GAIA_CONTAINER_ID=container_1424999012322_0960_01_02 --memory=256M 
--cpu-shares=1020 -v 
/data/gaia/logs/container-logs/application_1424999012322_0960/container_1424999012322_0960_01_02:/data/gaia/logs/container-logs/application_1424999012322_0960/container_1424999012322_0960_01_02
 -v 
/data/gaia/data/yarn/local/usercache/gaia/appcache/application_1424999012322_0960/container_1424999012322_0960_01_02:/data/gaia/data/yarn/local/usercache/gaia/appcache/application_1424999012322_0960/container_1424999012322_0960_01_02
 -P docker.oa.com:8080/library/dev_gaia_repo:v2 bash 
/data/gaia/data/yarn/local/usercache/gaia/appcache/application_1424999012322_0960/container_1424999012322_0960_01_02/launch_container.sh
gaia 26414 32596  0 18:10 pts/12   00:00:00 grep 7188
[gaia@c112 ~]$ kill -9 7188
[gaia@c112 ~]$ ps -ef|grep 7188
gaia 26709 32596  0 18:10 pts/12   00:00:00 grep 7188
{quote}

The bash process (7188) is gone, but the docker run process (7190) survives 
and its parent pid has changed to 1.
{quote}
[gaia@c112 ~]$ ps -ef|grep 7190
gaia  7190     1  0 14:46 ?        00:00:00 /usr/bin/docker run --rm --name 
container_1424999012322_0960_01_02 -e GAIA_HOST_IP=10.149.27.112 -e 
GAIA_API_SERVER=http://shpc-test.api.oa.com/api -e GAIA_CLUSTER_ID=shpc-test -e 
GAIA_QUEUE=root.gaia -e GAIA_APP_NAME=dev_gaia -e GAIA_INSTANCE_ID=1 -e 
GAIA_CONTAINER_ID=container_1424999012322_0960_01_02 --memory=256M 
--cpu-shares=1020 -v 
/data/gaia/logs/container-logs/application_1424999012322_0960/container_1424999012322_0960_01_02:/data/gaia/logs/container-logs/application_1424999012322_0960/container_1424999012322_0960_01_02
 -v 
/data/gaia/data/yarn/local/usercache/gaia/appcache/application_1424999012322_0960/container_1424999012322_0960_01_02:/data/gaia/data/yarn/local/usercache/gaia/appcache/application_1424999012322_0960/container_1424999012322_0960_01_02
 -P docker.oa.com:8080/library/dev_gaia_repo:v2 bash 
/data/gaia/data/yarn/local/usercache/gaia/appcache/application_1424999012322_0960/container_1424999012322_0960_01_02/launch_container.sh
gaia 28687 32596  0 18:11 pts/12   00:00:00 grep 7190
{quote}
The docker container is also still running:
{quote}
[gaia@c112 ~]$ docker ps 
CONTAINER ID   IMAGE                                         COMMAND              CREATED       STATUS       PORTS                      NAMES
235a2dc20c56   docker.oa.com:8080/library/dev_gaia_repo:v2   /etc/rc.local bash   3 hours ago   Up 3 hours   0.0.0.0:49861->36000/tcp   container_1424999012322_0960_01_02
{quote}


[jira] [Commented] (YARN-3080) The DockerContainerExecutor could not write the right pid to container pidFile

2015-03-02 Thread Varun Vasudev (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14342938#comment-14342938
 ] 

Varun Vasudev commented on YARN-3080:
-

Can you post another example showing the process tree before you killed the 
bash session? Also, instead of kill -9 9289, can you try kill -9 -9289 (with 
the right pid instead of 9289)?

 The DockerContainerExecutor could not write the right pid to container pidFile
 --

 Key: YARN-3080
 URL: https://issues.apache.org/jira/browse/YARN-3080
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.6.0
Reporter: Beckham007
Assignee: Abin Shahab
 Attachments: YARN-3080.patch, YARN-3080.patch, YARN-3080.patch, 
 YARN-3080.patch


 The docker_container_executor_session.sh is like this:
 {quote}
 #!/usr/bin/env bash
 echo `/usr/bin/docker inspect --format {{.State.Pid}} 
 container_1421723685222_0008_01_02`  
 /data/nm_restart/hadoop-2.4.1/data/yarn/local/nmPrivate/application_1421723685222_0008/container_1421723685222_0008_01_02/container_1421723685222_0008_01_02.pid.tmp
 /bin/mv -f 
 /data/nm_restart/hadoop-2.4.1/data/yarn/local/nmPrivate/application_1421723685222_0008/container_1421723685222_0008_01_02/container_1421723685222_0008_01_02.pid.tmp
  
 /data/nm_restart/hadoop-2.4.1/data/yarn/local/nmPrivate/application_1421723685222_0008/container_1421723685222_0008_01_02/container_1421723685222_0008_01_02.pid
 /usr/bin/docker run --rm  --name container_1421723685222_0008_01_02 -e 
 GAIA_HOST_IP=c162 -e GAIA_API_SERVER=10.6.207.226:8080 -e 
 GAIA_CLUSTER_ID=shpc-nm_restart -e GAIA_QUEUE=root.tdwadmin -e 
 GAIA_APP_NAME=test_nm_docker -e GAIA_INSTANCE_ID=1 -e 
 GAIA_CONTAINER_ID=container_1421723685222_0008_01_02 --memory=32M 
 --cpu-shares=1024 -v 
 /data/nm_restart/hadoop-2.4.1/data/yarn/container-logs/application_1421723685222_0008/container_1421723685222_0008_01_02:/data/nm_restart/hadoop-2.4.1/data/yarn/container-logs/application_1421723685222_0008/container_1421723685222_0008_01_02
  -v 
 /data/nm_restart/hadoop-2.4.1/data/yarn/local/usercache/tdwadmin/appcache/application_1421723685222_0008/container_1421723685222_0008_01_02:/data/nm_restart/hadoop-2.4.1/data/yarn/local/usercache/tdwadmin/appcache/application_1421723685222_0008/container_1421723685222_0008_01_02
  -P -e A=B --privileged=true docker.oa.com:8080/library/centos7 bash 
 /data/nm_restart/hadoop-2.4.1/data/yarn/local/usercache/tdwadmin/appcache/application_1421723685222_0008/container_1421723685222_0008_01_02/launch_container.sh
 {quote}
 The DockerContainerExecutor uses docker inspect before docker run, so 
 docker inspect cannot get the right pid for the container, and 
 signalContainer() and NM restart would fail.
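
One way to avoid the ordering problem described above is to start the container first and only then ask docker for its pid. The following is a minimal sketch, not the patch attached to this issue; the function name `wait_for_container_pid`, the `DOCKER` variable, and the retry count are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: poll `docker inspect` until the named container reports
# a non-zero pid, then print it. DOCKER can be overridden (assumption).
DOCKER=${DOCKER:-/usr/bin/docker}

wait_for_container_pid() {
  local name=$1 tries=${2:-30} pid
  while [ "$tries" -gt 0 ]; do
    # Before the container is running, inspect fails or reports 0.
    pid=$("$DOCKER" inspect --format '{{.State.Pid}}' "$name" 2>/dev/null) || pid=""
    if [ -n "$pid" ] && [ "$pid" != "0" ]; then
      echo "$pid"
      return 0
    fi
    tries=$((tries - 1))
    sleep 1
  done
  return 1
}
```

In a session script this would run after docker run has been started in the background, with the printed pid written to the .pid.tmp file and renamed as the original script does.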



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3080) The DockerContainerExecutor could not write the right pid to container pidFile

2015-03-02 Thread Abin Shahab (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14343403#comment-14343403
 ] 

Abin Shahab commented on YARN-3080:
---

[~vvasudev], Thanks again for looking into this.

I'm fine with the simpler implementation as long as it works. However, I'm 
convinced it does not kill the process:
{code}

root@10-10-10-101:~# bash -x ./bash_test.sh 
+ echo 11163
+ exec bash -x ./docker_launch.sh
+ docker run -itd ubuntu bash -c 'sleep infinity'
48df7021c1c2402e77069fad8c9fced6fd74dfc00fc3b6d67b2b4fac86585c86
root@10-10-10-101:~# docker ps
CONTAINER ID   IMAGE          COMMAND               CREATED         STATUS         PORTS   NAMES
48df7021c1c2   ubuntu:14.04   bash -c 'sleep infi   5 seconds ago   Up 4 seconds           silly_lumiere
root@10-10-10-101:~# cat /tmp/pidfile 
11163
root@10-10-10-101:~# kill -9 -11163
-bash: kill: (-11163) - No such process
root@10-10-10-101:~# docker ps
CONTAINER ID   IMAGE          COMMAND               CREATED          STATUS          PORTS   NAMES
48df7021c1c2   ubuntu:14.04   bash -c 'sleep infi   27 seconds ago   Up 26 seconds           silly_lumiere
root@10-10-10-101:~# docker inspect --format {{.State.Pid}} 48df7021c1c2
11171
root@10-10-10-101:~# pstree -ps 11171
init(1)---docker(6512)---sleep(11171)
root@10-10-10-101:~# kill -9 11171
root@10-10-10-101:~# docker ps
CONTAINER ID   IMAGE   COMMAND   CREATED   STATUS   PORTS   NAMES

{code}



[jira] [Commented] (YARN-3080) The DockerContainerExecutor could not write the right pid to container pidFile

2015-03-01 Thread Beckham007 (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14342613#comment-14342613
 ] 

Beckham007 commented on YARN-3080:
--

Getting the actual PID from docker inspect is good, but it is too 
complex. I think nm should use 
We solved this the same way as DefaultContainerExecutor (the same as 
[~chenchun]), and it works well.



[jira] [Commented] (YARN-3080) The DockerContainerExecutor could not write the right pid to container pidFile

2015-03-01 Thread Abin Shahab (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14342744#comment-14342744
 ] 

Abin Shahab commented on YARN-3080:
---

[~beckham007] Thanks for your comment.
So you're saying that if I send a signal to the pid of the session script (as 
DefaultContainerExecutor does), it will reach the process that docker is 
running, and potentially kill it?

Please help me clarify my understanding. I am running the following steps.
First I create a file similar to the session script. It writes the pid of the 
session to a pidfile:
{code}
$ cat > bash_session_pid.sh <<'EOF'
#!/bin/bash
echo $$ > /tmp/pidfile
exec setsid bash -c 'docker run -itd ubuntu sleep infinity'
EOF
{code}

I chmod and run this script, which starts a docker container:
{code}
$ chmod a+x bash_session_pid.sh
$ ./bash_session_pid.sh
$ docker ps
1b8ee377e3d2   ubuntu:14.04   sleep infinity   3 minutes ago   Up 3 minutes   cranky_stallman
{code}

Now I cat the pid of the session, and it says the pid is 9281
{code}
$cat /tmp/pidfile
9281
{code}


As you've suggested, I send a kill signal to the pid, hoping that will kill the 
container:
{code}
$ kill -9 9281
{code}
I check whether the docker container is killed:
{code}
$ docker ps
1b8ee377e3d2   ubuntu:14.04   sleep infinity   6 minutes ago   Up 6 minutes   cranky_stallman
{code}

Since your method did not kill the container, I get the pid of the process 
running under the container:
{code}
$ docker inspect --format {{.State.Pid}} 1b8ee377e3d2
9289
{code}
I check the tree of this process:
{code}
$ pstree -ps 9289
init(1)---docker(6512)---sleep(9289)
{code}

As I had expected, this process is a child of the docker daemon, so killing it 
kills the container. I send a kill signal to this pid:
{code}
$ kill -9 9289
{code}

Now I verify whether the container is alive:
{code}
$ docker ps
{code} 
The container is dead. 
From what I understand, the session pid has no relation to the actual pid of 
the container, so sending it a signal is meaningless.
If that meaningless pid is in the pidfile, the NodeManager/ResourceManager will 
not be able to send signals to containers as needed.
Please let me know where my understanding is mistaken, and I will gladly switch 
to the simpler implementation.
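
The experiment above comes down to whether the container's process shares a process group with the session script. A minimal sketch of that check follows, assuming a Linux host with /proc mounted; the helper names `pgid_of` and `same_process_group` are hypothetical.

```shell
#!/usr/bin/env bash
# Print the process group id of a pid by reading /proc/<pid>/stat (Linux only).
pgid_of() {
  local stat rest
  stat=$(cat "/proc/$1/stat" 2>/dev/null) || return 1
  # Drop everything up to the last ") " so a command name containing spaces
  # cannot shift the fields; what remains is: state ppid pgrp ...
  rest=${stat##*) }
  set -- $rest
  echo "$3"
}

# Succeed only if both pids exist and belong to the same process group.
same_process_group() {
  local a b
  a=$(pgid_of "$1") || return 1
  b=$(pgid_of "$2") || return 1
  [ "$a" = "$b" ]
}
```

For the transcript above, the check would be `same_process_group "$(cat /tmp/pidfile)" 9289`; a failure there would match the observation that signalling the session pid left the container running.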


[jira] [Commented] (YARN-3080) The DockerContainerExecutor could not write the right pid to container pidFile

2015-03-01 Thread Beckham007 (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14342760#comment-14342760
 ] 

Beckham007 commented on YARN-3080:
--

Sorry for the mistake.
The pidfile has two functions. 1. When the NM restarts, it uses the pid to check 
whether the container has finished. 2. Signalling the container. As [~chenchun] 
said, for signalContainer we can use docker kill --signal=SIGNAL containerId instead.
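
The two uses of the pidfile listed above can be sketched as follows. This is a minimal illustration, not NodeManager code: the helper names and the `DOCKER` variable are hypothetical, and docker kill --signal is used as suggested in the comment.

```shell
#!/usr/bin/env bash
DOCKER=${DOCKER:-docker}

# Use 1: after an NM restart, check whether the pid in the pidfile is alive.
# `kill -0` sends no signal; it only reports whether the pid can be signalled.
pid_from_file_is_alive() {
  local pid
  pid=$(cat "$1" 2>/dev/null) || return 1
  [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

# Use 2: deliver a signal by container name instead of by pid.
signal_container() {
  "$DOCKER" kill --signal="$2" "$1"
}
```

With this split, the liveness check only needs some pid that lives as long as the container, while signalling does not depend on the pidfile at all.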



[jira] [Commented] (YARN-3080) The DockerContainerExecutor could not write the right pid to container pidFile

2015-03-01 Thread Abin Shahab (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14342828#comment-14342828
 ] 

Abin Shahab commented on YARN-3080:
---

[~chengzhendong888], are you implying that the pidfile will never be used for 
anything else in the future? Container Executors are required to provide a correct 
pidfile, and that is the API contract between them and the NodeManager. I don't 
see why DockerContainerExecutor should violate that contract.

Also, how would you derive a containerId from a pid in the NM? The pid that 
will be sent to signalContainer won't even be the correct pid (it will be the 
session script pid).





[jira] [Commented] (YARN-3080) The DockerContainerExecutor could not write the right pid to container pidFile

2015-03-01 Thread Varun Vasudev (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14342861#comment-14342861
 ] 

Varun Vasudev commented on YARN-3080:
-

[~ashahab] sorry for the late response. The kill functionality uses the kill 
-9 -$PID form, which sends the kill signal to every process in the process 
group $PID, which normally covers the process as well as all its children, 
grandchildren, etc. There's a more detailed explanation here - 
http://stackoverflow.com/a/15139734. The code that generates the kill command 
is in Shell.java - look for getSignalKillCommand.
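
To illustrate the negative-pid form described above: it addresses a process group, so it only reaches processes whose group id equals that number. A minimal sketch follows; `kill_group` is a hypothetical helper and is not the code in Shell.java.

```shell
#!/usr/bin/env bash
# Hypothetical helper: send a signal (default KILL) to every process in the
# process group whose id is $1, using the negative-pid form of kill.
kill_group() {
  local pgid=$1 sig=${2:-KILL}
  kill -s "$sig" -- "-$pgid"
}
```

With job control enabled (`set -m`), a background job's pid is also its group id, so `sleep 300 &` followed by `kill_group "$!"` terminates it. A process that is not in that group is not reached, which is consistent with the "No such process" result in Abin Shahab's transcript.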



[jira] [Commented] (YARN-3080) The DockerContainerExecutor could not write the right pid to container pidFile

2015-03-01 Thread Abin Shahab (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14342876#comment-14342876
 ] 

Abin Shahab commented on YARN-3080:
---

I agree with that, Varun. However, I'm not sure the process launched under
docker is a child of the session script (see my example above). I could be
wrong though.






[jira] [Commented] (YARN-3080) The DockerContainerExecutor could not write the right pid to container pidFile

2015-02-28 Thread Chun Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14341427#comment-14341427
 ] 

Chun Chen commented on YARN-3080:
-

[~ashahab], I think we can simply fix this by using the pid of the session 
script bash process instead, since docker run will block until the container 
exits. If the docker container exits, the session script bash process will exit 
immediately. As for signalContainer, we can use docker kill --signal=SIGNAL 
containerId instead.
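
A minimal sketch of this suggestion: the session script records its own pid, using the same write-then-rename step as the original script, before it blocks in docker run. The `write_pidfile` helper and the example path are hypothetical.

```shell
#!/usr/bin/env bash
# Write a pid to "<pidfile>.tmp" and rename it into place, so a reader never
# sees a partially written pidfile.
write_pidfile() {
  local pid=$1 pidfile=$2
  echo "$pid" > "$pidfile.tmp" && mv -f "$pidfile.tmp" "$pidfile"
}

# In a session script this would be followed by the blocking docker run, e.g.:
#   write_pidfile "$$" /path/to/container.pid    # hypothetical path
#   /usr/bin/docker run --rm --name "$container_name" "$image" bash launch_container.sh
```

The recorded pid then stays alive for as long as docker run blocks, which is what the liveness check after an NM restart relies on.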



[jira] [Commented] (YARN-3080) The DockerContainerExecutor could not write the right pid to container pidFile

2015-02-28 Thread Abin Shahab (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14341761#comment-14341761
 ] 

Abin Shahab commented on YARN-3080:
---

[~chenchun], also, other threads in the NM may depend on the actual PID of the 
launched JVM, and I'm not going to preclude any future processes from depending 
on it. The pidfile is supposed to be populated right after the container is 
launched, not after the process has finished. The DockerContainerExecutor will 
not alter this fundamental API or the expectations the NM has of a container 
executor. Therefore, getting the actual PID from docker inspect is the most 
accurate and recommended way.




[jira] [Commented] (YARN-3080) The DockerContainerExecutor could not write the right pid to container pidFile

2015-02-28 Thread Abin Shahab (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14341759#comment-14341759
 ] 

Abin Shahab commented on YARN-3080:
---

And how will you derive the container id from the pid? By querying all
containers in the NodeManager? That does not seem like a good option.
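
For reference, mapping a host pid back to a container would mean asking docker about every running container, which is the kind of lookup being questioned here. A minimal sketch under that assumption; `container_for_pid` and the `DOCKER` variable are hypothetical.

```shell
#!/usr/bin/env bash
DOCKER=${DOCKER:-docker}

# Print the id of the running container whose main process has host pid $1.
# This inspects every running container, so the cost grows with their number.
container_for_pid() {
  local want=$1 id pid
  for id in $("$DOCKER" ps -q); do
    pid=$("$DOCKER" inspect --format '{{.State.Pid}}' "$id" 2>/dev/null) || continue
    if [ "$pid" = "$want" ]; then
      echo "$id"
      return 0
    fi
  done
  return 1
}
```

Note that this only matches the pid reported by docker inspect; a session script pid would never match any container.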



 The DockerContainerExecutor could not write the right pid to container pidFile
 --

 Key: YARN-3080
 URL: https://issues.apache.org/jira/browse/YARN-3080
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.6.0
Reporter: Beckham007
Assignee: Abin Shahab
 Attachments: YARN-3080.patch, YARN-3080.patch, YARN-3080.patch, 
 YARN-3080.patch


 The docker_container_executor_session.sh is like this:
 {quote}
 #!/usr/bin/env bash
 echo `/usr/bin/docker inspect --format {{.State.Pid}} container_1421723685222_0008_01_02` > /data/nm_restart/hadoop-2.4.1/data/yarn/local/nmPrivate/application_1421723685222_0008/container_1421723685222_0008_01_02/container_1421723685222_0008_01_02.pid.tmp
 /bin/mv -f /data/nm_restart/hadoop-2.4.1/data/yarn/local/nmPrivate/application_1421723685222_0008/container_1421723685222_0008_01_02/container_1421723685222_0008_01_02.pid.tmp /data/nm_restart/hadoop-2.4.1/data/yarn/local/nmPrivate/application_1421723685222_0008/container_1421723685222_0008_01_02/container_1421723685222_0008_01_02.pid
 /usr/bin/docker run --rm  --name container_1421723685222_0008_01_02 -e GAIA_HOST_IP=c162 -e GAIA_API_SERVER=10.6.207.226:8080 -e GAIA_CLUSTER_ID=shpc-nm_restart -e GAIA_QUEUE=root.tdwadmin -e GAIA_APP_NAME=test_nm_docker -e GAIA_INSTANCE_ID=1 -e GAIA_CONTAINER_ID=container_1421723685222_0008_01_02 --memory=32M --cpu-shares=1024 -v /data/nm_restart/hadoop-2.4.1/data/yarn/container-logs/application_1421723685222_0008/container_1421723685222_0008_01_02:/data/nm_restart/hadoop-2.4.1/data/yarn/container-logs/application_1421723685222_0008/container_1421723685222_0008_01_02 -v /data/nm_restart/hadoop-2.4.1/data/yarn/local/usercache/tdwadmin/appcache/application_1421723685222_0008/container_1421723685222_0008_01_02:/data/nm_restart/hadoop-2.4.1/data/yarn/local/usercache/tdwadmin/appcache/application_1421723685222_0008/container_1421723685222_0008_01_02 -P -e A=B --privileged=true docker.oa.com:8080/library/centos7 bash /data/nm_restart/hadoop-2.4.1/data/yarn/local/usercache/tdwadmin/appcache/application_1421723685222_0008/container_1421723685222_0008_01_02/launch_container.sh
 {quote}
 The DockerContainerExecutor runs docker inspect before docker run, so 
 docker inspect cannot return the right pid for the container, and 
 signalContainer() and NM restart would fail.
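
 The ordering problem described above could be avoided by starting the
 container first and inspecting it afterwards. The following is a minimal
 sketch of that idea only, not the attached patch: DOCKER_BIN,
 write_pid_file and run_and_record_pid are hypothetical names, and it
 assumes the container is started detached (without --rm) so that the
 script can inspect it and then wait for it to exit.

```shell
#!/usr/bin/env bash
# Sketch only: DOCKER_BIN, write_pid_file and run_and_record_pid are
# hypothetical names, not part of DockerContainerExecutor.
DOCKER_BIN="${DOCKER_BIN:-/usr/bin/docker}"

# Write the container's host pid to pid_file through a temp file and a
# rename, mirroring the echo + mv pair in the session script.
write_pid_file() {
  local name="$1" pid_file="$2" pid
  pid="$("$DOCKER_BIN" inspect --format '{{.State.Pid}}' "$name")" || return 1
  # An empty pid or a pid of 0 means the container is not running.
  if [ -z "$pid" ] || [ "$pid" = "0" ]; then
    return 1
  fi
  echo "$pid" > "${pid_file}.tmp"
  mv -f "${pid_file}.tmp" "$pid_file"
}

# Start the container detached, record its pid, then block until it
# exits so the script's lifetime still tracks the container.
run_and_record_pid() {
  local name="$1" pid_file="$2" status
  shift 2
  "$DOCKER_BIN" run -d --name "$name" "$@" > /dev/null || return 1
  write_pid_file "$name" "$pid_file" || return 1
  # docker wait prints the container's exit code once it stops.
  status="$("$DOCKER_BIN" wait "$name")" || return 1
  return "$status"
}
```

 With these definitions, the session script would call run_and_record_pid
 with the container name, the pid file path, and the remaining docker run
 arguments, instead of running docker inspect ahead of docker run.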





[jira] [Commented] (YARN-3080) The DockerContainerExecutor could not write the right pid to container pidFile

2015-02-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14339483#comment-14339483
 ] 

Hadoop QA commented on YARN-3080:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12701213/YARN-3080.patch
  against trunk revision bfbf076.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/6769//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6769//console

This message is automatically generated.

 The DockerContainerExecutor could not write the right pid to container pidFile
 --

 Key: YARN-3080
 URL: https://issues.apache.org/jira/browse/YARN-3080
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.6.0
Reporter: Beckham007
Assignee: Abin Shahab
 Attachments: YARN-3080.patch, YARN-3080.patch, YARN-3080.patch, 
 YARN-3080.patch


[jira] [Commented] (YARN-3080) The DockerContainerExecutor could not write the right pid to container pidFile

2015-02-25 Thread Abin Shahab (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14337668#comment-14337668
 ] 

Abin Shahab commented on YARN-3080:
---

[~raviprakash], [~vinodkv], [~vvasudev] [~ywskycn]

Please review. 

 The DockerContainerExecutor could not write the right pid to container pidFile
 --

 Key: YARN-3080
 URL: https://issues.apache.org/jira/browse/YARN-3080
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.6.0
Reporter: Beckham007
Assignee: Abin Shahab
 Attachments: YARN-3080.patch, YARN-3080.patch, YARN-3080.patch


[jira] [Commented] (YARN-3080) The DockerContainerExecutor could not write the right pid to container pidFile

2015-02-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14337641#comment-14337641
 ] 

Hadoop QA commented on YARN-3080:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12700921/YARN-3080.patch
  against trunk revision d140d76.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/6749//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6749//console

This message is automatically generated.

 The DockerContainerExecutor could not write the right pid to container pidFile
 --

 Key: YARN-3080
 URL: https://issues.apache.org/jira/browse/YARN-3080
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.6.0
Reporter: Beckham007
Assignee: Abin Shahab
 Attachments: YARN-3080.patch, YARN-3080.patch, YARN-3080.patch


[jira] [Commented] (YARN-3080) The DockerContainerExecutor could not write the right pid to container pidFile

2015-02-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14334234#comment-14334234
 ] 

Hadoop QA commented on YARN-3080:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12700301/YARN-3080.patch
  against trunk revision 16bd79e.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 7 new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:

  
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestResourceLocalizationService

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/6705//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/6705//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-nodemanager.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6705//console

This message is automatically generated.

 The DockerContainerExecutor could not write the right pid to container pidFile
 --

 Key: YARN-3080
 URL: https://issues.apache.org/jira/browse/YARN-3080
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.6.0
Reporter: Beckham007
Assignee: Abin Shahab
 Attachments: YARN-3080.patch


[jira] [Commented] (YARN-3080) The DockerContainerExecutor could not write the right pid to container pidFile

2015-01-21 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286513#comment-14286513
 ] 

Ravi Prakash commented on YARN-3080:


Marking as Major because this is an alpha feature

 The DockerContainerExecutor could not write the right pid to container pidFile
 --

 Key: YARN-3080
 URL: https://issues.apache.org/jira/browse/YARN-3080
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.6.0
Reporter: Beckham007
