[jira] [Commented] (YARN-3576) In Log - Container getting killed by AM even when work preserving is enabled
[ https://issues.apache.org/jira/browse/YARN-3576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14530046#comment-14530046 ]

Anushri commented on YARN-3576:
-------------------------------

Job running: pi
Number of mappers: 10
The AM log reports: Container killed on request

In Log - Container getting killed by AM even when work preserving is enabled
----------------------------------------------------------------------------

                Key: YARN-3576
                URL: https://issues.apache.org/jira/browse/YARN-3576
            Project: Hadoop YARN
         Issue Type: Bug
        Environment: SUSE11 SP3, 3-node cluster
           Reporter: Anushri
           Priority: Minor

RM in HA mode
One NM running
Work preserving enabled

An application is submitted and an RM switchover happens. The NM log shows that the AM kills some of the containers, and those containers have exit code 143, but logs for the same containers are present in the container logs.

Problem: if work preserving is enabled, why is the AM killing and cleaning up the containers? And if a container is getting killed, why is its log present in the container logs?

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (YARN-3576) In Log - Container getting killed by AM even when work preserving is enabled
[ https://issues.apache.org/jira/browse/YARN-3576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14528437#comment-14528437 ]

Jason Lowe commented on YARN-3576:
----------------------------------

bq. In the NM log it is found that AM kills some of the containers

If it appears the AM is killing the containers, have you looked in the AM log to see why it is choosing to do so?

bq. if the container is getting killed, why is its log present in container logs?

Logs can be present for containers regardless of their exit status. It is perfectly normal and expected that a container can have logs if the container was killed after it was launched.
[jira] [Commented] (YARN-3576) In Log - Container getting killed by AM even when work preserving is enabled
[ https://issues.apache.org/jira/browse/YARN-3576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14528689#comment-14528689 ]

Rohith commented on YARN-3576:
------------------------------

What job is running, and how many containers does it have? If the AM is killing the containers with exit code 143, that indicates the containers were shut down gracefully while the RM was in the process of transitioning. Were all of the containers running on the NodeManager killed, or only a few?
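
For reference, the exit code 143 discussed in this thread is consistent with the common POSIX shell convention in which a process terminated by a signal reports 128 plus the signal number, so 143 would correspond to signal 15 (SIGTERM), a request to terminate rather than a crash. The following is only a minimal sketch of that arithmetic in Python; the helper name describe_exit_code is hypothetical and is not part of YARN or Hadoop.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Decode an exit code using the shell convention 128 + signal number.

    Hypothetical helper for illustration only; not a YARN or Hadoop API.
    """
    if code > 128:
        signum = code - 128
        try:
            # signal.Signals maps a signal number to its symbolic name.
            name = signal.Signals(signum).name
        except ValueError:
            name = "unknown signal"
        return f"terminated by {name} (signal {signum})"
    return f"exited with status {code}"


if __name__ == "__main__":
    # The exit code reported for the killed containers in this issue.
    print(describe_exit_code(143))
```

Under this convention, a container reporting 143 was asked to stop with SIGTERM. That matches the "Container killed on request" message, but it does not by itself say which component sent the request or why; the AM log is still the place to look for that.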