Guoqiang Li created SPARK-1930:
----------------------------------

             Summary: YARN containers are killed when they occupy 8G of memory
                 Key: SPARK-1930
                 URL: https://issues.apache.org/jira/browse/SPARK-1930
             Project: Spark
          Issue Type: Bug
          Components: YARN
            Reporter: Guoqiang Li


When the containers occupy 8G of memory, they are killed.
YARN NodeManager log:
{code}
2014-05-23 13:00:23,856 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Container [pid=42542,containerID=container_1400809535638_0013_01_000015] is running beyond physical memory limits. Current usage: 8.5 GB of 8.5 GB physical memory used; 9.6 GB of 17.8 GB virtual memory used. Killing container.
Dump of the process-tree for container_1400809535638_0013_01_000015 :
        |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
        |- 42547 42542 42542 42542 (java) 1064464 42738 10247843840 2240623 /usr/java/jdk1.7.0_45-cloudera/bin/java -server -XX:OnOutOfMemoryError=kill %p -Xms8192m -Xmx8192m -Xss2m -Djava.io.tmpdir=/yarn/nm/usercache/spark/appcache/application_1400809535638_0013/container_1400809535638_0013_01_000015/tmp -Dlog4j.configuration=log4j-spark-container.properties -Dspark.akka.askTimeout=120 -Dspark.akka.timeout=120 -Dspark.akka.frameSize=20 org.apache.spark.executor.CoarseGrainedExecutorBackend akka.tcp://sp...@10dian71.domain.test:42766/user/CoarseGrainedScheduler 12 10dian72.domain.test 4
        |- 42542 25417 42542 42542 (bash) 0 0 110804992 335 /bin/bash -c /usr/java/jdk1.7.0_45-cloudera/bin/java -server -XX:OnOutOfMemoryError='kill %p' -Xms8192m -Xmx8192m -Xss2m -Djava.io.tmpdir=/yarn/nm/usercache/spark/appcache/application_1400809535638_0013/container_1400809535638_0013_01_000015/tmp -Dlog4j.configuration=log4j-spark-container.properties -Dspark.akka.askTimeout="120" -Dspark.akka.timeout="120" -Dspark.akka.frameSize="20" org.apache.spark.executor.CoarseGrainedExecutorBackend akka.tcp://sp...@10dian71.domain.test:42766/user/CoarseGrainedScheduler 12 10dian72.domain.test 4 1> /var/log/hadoop-yarn/container/application_1400809535638_0013/container_1400809535638_0013_01_000015/stdout 2> /var/log/hadoop-yarn/container/application_1400809535638_0013/container_1400809535638_0013_01_000015/stderr
{code}
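The 8.5 GB limit in the log is the requested executor memory (8192 MB heap plus the fixed 384 MB overhead) rounded up to the scheduler's allocation increment. A minimal sketch of that arithmetic; the 512 MB increment is an assumed yarn.scheduler.minimum-allocation-mb, not read from this cluster's config:
{code}
// Sketch of the container-size arithmetic behind the log above.
object ContainerSizeEstimate {
  val executorHeapMb   = 8192 // from -Xmx8192m
  val memoryOverheadMb = 384  // YarnAllocationHandler.MEMORY_OVERHEAD
  val minAllocationMb  = 512  // assumed scheduler rounding increment

  def main(args: Array[String]): Unit = {
    val requested = executorHeapMb + memoryOverheadMb // 8576 MB
    val granted =
      math.ceil(requested.toDouble / minAllocationMb).toInt * minAllocationMb
    println(f"requested $requested MB, granted $granted MB = ${granted / 1024.0}%.1f GB")
    // granted = 8704 MB = 8.5 GB, matching "8.5 GB of 8.5 GB physical memory used"
  }
}
{code}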
I think this is related to {{YarnAllocationHandler.MEMORY_OVERHEAD}}:
https://github.com/apache/spark/blob/master/yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocationHandler.scala#L562

Relative to an 8 GB heap, a fixed overhead of 384 MB is too small: the executor's off-heap usage (thread stacks at -Xss2m, direct buffers, JVM metadata) can easily exceed it, so the process RSS crosses the container limit and the NodeManager kills the container.
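
A minimal sketch of one possible fix, scaling the overhead with executor memory instead of hard-coding 384 MB; the 7% factor and the method name are hypothetical, not Spark's actual implementation:
{code}
// Hypothetical: reserve the larger of 384 MB or a fraction of executor memory.
object MemoryOverhead {
  val minOverheadMb    = 384  // current fixed default
  val overheadFraction = 0.07 // assumed fraction, would need tuning

  def overheadFor(executorMemoryMb: Int): Int =
    math.max(minOverheadMb, (executorMemoryMb * overheadFraction).toInt)
}
{code}
For an 8192 MB executor this would reserve max(384, 573) = 573 MB, which better matches the off-heap footprint visible in the process tree above.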


