[ 
https://issues.apache.org/jira/browse/HADOOP-11364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14238716#comment-14238716
 ] 

Mohammad Kamrul Islam commented on HADOOP-11364:
------------------------------------------------

My findings and quick resolutions:
By default, Java 8 allocates more virtual memory than Java 7. However, we can 
control the non-heap memory usage by limiting the maximum allowed values for 
some JVM parameters, such as "-XX:ReservedCodeCacheSize=100M 
-XX:MaxMetaspaceSize=256m -XX:CompressedClassSpaceSize=256m".

For M/R-based jobs (such as Pig, Hive, etc.), users can pass the following JVM -XX 
parameters as part of mapreduce.reduce.java.opts or mapreduce.map.java.opts:
{noformat}
mapreduce.reduce.java.opts  '-XX:ReservedCodeCacheSize=100M 
-XX:MaxMetaspaceSize=256m -XX:CompressedClassSpaceSize=256m -Xmx1536m -Xms512m 
-Djava.net.preferIPv4Stack=true'
{noformat}
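As a rough illustration of how such an option string could be assembled without dropping a unit suffix, here is a minimal Python sketch. The helper name build_java_opts and the suffix check are hypothetical and not part of Hadoop; the flag values are the ones quoted above.

```python
import re

# JVM size values such as 100M or 256m; a bare number is interpreted as bytes.
SIZE_WITH_UNIT = re.compile(r"^\d+[kKmMgG]$")


def build_java_opts(size_flags, extra_opts=()):
    """Join -XX size caps and other options into one java.opts string.

    Hypothetical helper for illustration only.
    """
    parts = []
    for name, value in size_flags.items():
        if not SIZE_WITH_UNIT.match(value):
            raise ValueError(f"{name}={value!r} has no k/m/g unit suffix")
        parts.append(f"-XX:{name}={value}")
    parts.extend(extra_opts)
    return " ".join(parts)


# Non-heap caps taken from the comment above.
NON_HEAP_CAPS = {
    "ReservedCodeCacheSize": "100M",
    "MaxMetaspaceSize": "256m",
    "CompressedClassSpaceSize": "256m",
}

reduce_opts = build_java_opts(
    NON_HEAP_CAPS,
    extra_opts=("-Xmx1536m", "-Xms512m", "-Djava.net.preferIPv4Stack=true"),
)
print(f"mapreduce.reduce.java.opts={reduce_opts}")
```

The same string can be reused for mapreduce.map.java.opts, with heap sizes adjusted to the map container size.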

Similarly, for Spark jobs we need to pass the same parameters to the Spark 
AM/master and executors. The Spark community is working on ways to pass these 
types of parameters easily. In Spark 1.1.0, users can pass them for spark-cluster 
based job submission as follows. For general job submission, users have to wait 
until https://issues.apache.org/jira/browse/SPARK-4461 is released.
{noformat}
spark.driver.extraJavaOptions = -XX:ReservedCodeCacheSize=100M 
-XX:MaxMetaspaceSize=256m -XX:CompressedClassSpaceSize=256m
{noformat}

For the Spark executor, pass the following:
{noformat} 
spark.executor.extraJavaOptions = -XX:ReservedCodeCacheSize=100M 
-XX:MaxMetaspaceSize=256m -XX:CompressedClassSpaceSize=256m
{noformat}

These parameters can be set in conf/spark-defaults.conf as well.
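
For the spark-defaults.conf route, the two entries could be generated as in the following Python sketch. It assumes the whitespace-separated "key value" layout that Spark documents for that file; the function name spark_defaults_lines is hypothetical.

```python
# JVM caps quoted in the comment above.
JVM_CAPS = (
    "-XX:ReservedCodeCacheSize=100M "
    "-XX:MaxMetaspaceSize=256m "
    "-XX:CompressedClassSpaceSize=256m"
)


def spark_defaults_lines(java_opts):
    """Return spark-defaults.conf lines for the driver and the executors.

    Hypothetical helper; each line is a property name, whitespace, then the value.
    """
    keys = (
        "spark.driver.extraJavaOptions",
        "spark.executor.extraJavaOptions",
    )
    return [f"{key} {java_opts}" for key in keys]


for line in spark_defaults_lines(JVM_CAPS):
    print(line)
```

The printed lines can be appended to conf/spark-defaults.conf as they are.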

> [Java 8] Over usage of virtual memory
> -------------------------------------
>
>                 Key: HADOOP-11364
>                 URL: https://issues.apache.org/jira/browse/HADOOP-11364
>             Project: Hadoop Common
>          Issue Type: Bug
>            Reporter: Mohammad Kamrul Islam
>            Assignee: Mohammad Kamrul Islam
>
> In our Hadoop 2 + Java 8 effort, we found that a few jobs are being killed by 
> Hadoop due to excessive virtual memory allocation, although the physical 
> memory usage is low.
> The most common error message is "Container [pid=??,containerID=container_??] 
> is running beyond virtual memory limits. Current usage: 365.1 MB of 1 GB 
> physical memory used; 3.2 GB of 2.1 GB virtual memory used. Killing 
> container."
> We see this problem for MR jobs as well as in the Spark driver/executor.
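
The 2.1 GB limit in the quoted error message is consistent with a 1 GB container multiplied by yarn.nodemanager.vmem-pmem-ratio, whose default is 2.1. That the cluster in question used the default ratio is an assumption. A small Python sketch of the arithmetic, with a hypothetical function name:

```python
def vmem_limit_gb(pmem_limit_gb, vmem_pmem_ratio=2.1):
    """Virtual memory allowed for a container, given its physical limit.

    The 2.1 default mirrors yarn.nodemanager.vmem-pmem-ratio; whether the
    cluster in the report used that default is an assumption.
    """
    return pmem_limit_gb * vmem_pmem_ratio


limit = vmem_limit_gb(1.0)  # 1 GB physical limit from the error message
used = 3.2                  # GB of virtual memory reported in the message
print(f"limit={limit:.1f} GB, used={used:.1f} GB, over={used > limit}")
```

With these numbers the container is over its virtual memory limit even though only 365.1 MB of physical memory is in use.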



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
