[jira] [Commented] (YARN-3576) In Log - Container getting killed by AM even when work preserving is enabled

2015-05-06 Thread Anushri (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14530046#comment-14530046
 ] 

Anushri commented on YARN-3576:
---

Job running: pi
Number of mappers: 10

The AM log reports: Container killed on request

 In Log - Container getting killed by AM even when work preserving is enabled 
 -

 Key: YARN-3576
 URL: https://issues.apache.org/jira/browse/YARN-3576
 Project: Hadoop YARN
  Issue Type: Bug
 Environment: SUSE11 SP3
 3 nodes cluster
Reporter: Anushri
Priority: Minor

 RM in HA mode
 NM running on one node
 work preserving enabled
 The RM is in HA mode, one NM is running, and work preserving is enabled. An 
 application is submitted and an RM switchover happens. The NM log shows that 
 the AM kills some of the containers, and those containers have exit code 143. 
 However, logs for those same containers are found in the container logs. 
 Problem: 
 If work preserving is enabled, why are the containers being killed and cleaned 
 up? And if a container is getting killed, why are its logs present in the 
 container logs?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3576) In Log - Container getting killed by AM even when work preserving is enabled

2015-05-05 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14528437#comment-14528437
 ] 

Jason Lowe commented on YARN-3576:
--

bq. In the NM log it is found that AM kills some of the containers
If it appears the AM is killing the containers, have you looked in the AM log 
to see why it is choosing to do so?

bq. if the container is getting killed , why is its log present in container 
logs?
Logs can be present for containers regardless of their exit status.  It is 
perfectly normal and expected that a container can have logs if the container 
was killed after it was launched.



[jira] [Commented] (YARN-3576) In Log - Container getting killed by AM even when work preserving is enabled

2015-05-05 Thread Rohith (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14528689#comment-14528689
 ] 

Rohith commented on YARN-3576:
--

What job is running, and how many containers does it have?  If the AM is 
killing containers with exit code 143, that indicates the containers finished 
gracefully while the RM was in the process of transitioning. Were all of the 
containers running on the NodeManager killed, or only a few?
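
For reference, exit code 143 follows the common shell convention of 128 plus 
the signal number, so 143 corresponds to signal 15 (SIGTERM). Below is a 
minimal sketch of that decoding, assuming the convention applies to the exit 
status the NM reports for a container. The helper name describe_exit_code is 
hypothetical and is not part of YARN.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Interpret an exit code using the common 128 + N signal convention.

    Hypothetical helper for illustration only; not a YARN API.
    """
    if code == 0:
        return "exited normally"
    if code > 128:
        signum = code - 128
        try:
            # Look up the symbolic name of the signal on this platform.
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number is not defined on this platform.
            name = f"signal {signum}"
        return f"terminated by {name}"
    return f"exited with error status {code}"


if __name__ == "__main__":
    print(describe_exit_code(143))
```

Under this convention, a container reporting 143 was sent a termination 
request rather than crashing on its own.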

