[
https://issues.apache.org/jira/browse/HADOOP-11364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14238716#comment-14238716
]
Mohammad Kamrul Islam commented on HADOOP-11364:
------------------------------------------------
My findings and quick resolutions:
By default, Java 8 allocates more virtual memory than Java 7. However, we can
control the non-heap memory usage by capping the maximum allowed values of
some JVM parameters, such as "-XX:ReservedCodeCacheSize=100M
-XX:MaxMetaspaceSize=256m -XX:CompressedClassSpaceSize=256m".
For M/R-based jobs (such as Pig, Hive, etc.), users can pass the following JVM
-XX parameters as part of mapreduce.reduce.java.opts or mapreduce.map.java.opts:
{noformat}
mapreduce.reduce.java.opts '-XX:ReservedCodeCacheSize=100M
-XX:MaxMetaspaceSize=256m -XX:CompressedClassSpaceSize=256m -Xmx1536m -Xms512m
-Djava.net.preferIPv4Stack=true'
{noformat}
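As a minimal sketch of how the value above might be assembled programmatically (for example when generating job configuration from a script), the snippet below joins the flags into one string. The names NON_HEAP_CAPS, HEAP_AND_MISC, java_opts and mapreduce_opts are hypothetical, and reusing the same heap settings for map tasks is an assumption; the comment only shows the reduce-side value.

```python
# Hypothetical helper names; the flag values are the ones quoted in the comment.
NON_HEAP_CAPS = [
    "-XX:ReservedCodeCacheSize=100M",
    "-XX:MaxMetaspaceSize=256m",
    "-XX:CompressedClassSpaceSize=256m",
]
HEAP_AND_MISC = ["-Xmx1536m", "-Xms512m", "-Djava.net.preferIPv4Stack=true"]


def java_opts(*groups):
    """Join groups of JVM flags into one space-separated option string."""
    return " ".join(flag for group in groups for flag in group)


def mapreduce_opts():
    """Build the option string and assign it to both task types.

    Assumption: map tasks get the same heap settings as reduce tasks.
    """
    value = java_opts(NON_HEAP_CAPS, HEAP_AND_MISC)
    return {
        "mapreduce.map.java.opts": value,
        "mapreduce.reduce.java.opts": value,
    }


for key, value in mapreduce_opts().items():
    print(f"{key} '{value}'")
```

Keeping the non-heap caps in their own list makes it easy to reuse them for other frameworks without dragging the heap settings along.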
Similarly, for Spark jobs, we need to pass the same parameters to the Spark
AM/master and executors. The Spark community is working on ways to pass these
types of parameters easily. In Spark 1.1.0, users can pass them for spark-cluster
based job submission as follows. For general job submission, users have to wait
until https://issues.apache.org/jira/browse/SPARK-4461 is released.
{noformat}
spark.driver.extraJavaOptions = -XX:ReservedCodeCacheSize=100M
-XX:MaxMetaspaceSize=256m -XX:CompressedClassSpaceSize=256m
{noformat}
For the Spark executor, pass the following:
{noformat}
spark.executor.extraJavaOptions = -XX:ReservedCodeCacheSize=100M
-XX:MaxMetaspaceSize=256m -XX:CompressedClassSpaceSize=256m
{noformat}
These parameters can be set in conf/spark-defaults.conf as well.
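A minimal sketch of generating the corresponding conf/spark-defaults.conf entries follows. It assumes the usual spark-defaults.conf layout of one property per line with the key and value separated by whitespace; the function name spark_defaults_lines is hypothetical.

```python
# Flag values are the ones quoted in the comment; the helper name is hypothetical.
NON_HEAP_CAPS = (
    "-XX:ReservedCodeCacheSize=100M "
    "-XX:MaxMetaspaceSize=256m "
    "-XX:CompressedClassSpaceSize=256m"
)


def spark_defaults_lines(opts=NON_HEAP_CAPS):
    """Return spark-defaults.conf lines applying opts to driver and executor."""
    keys = ("spark.driver.extraJavaOptions", "spark.executor.extraJavaOptions")
    # Assumed file format: "<key><whitespace><value>", one property per line.
    return [f"{key} {opts}" for key in keys]


print("\n".join(spark_defaults_lines()))
```

The printed lines can be appended to conf/spark-defaults.conf after checking that the file does not already set either property.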
> [Java 8] Over usage of virtual memory
> -------------------------------------
>
> Key: HADOOP-11364
> URL: https://issues.apache.org/jira/browse/HADOOP-11364
> Project: Hadoop Common
> Issue Type: Bug
> Reporter: Mohammad Kamrul Islam
> Assignee: Mohammad Kamrul Islam
>
> In our Hadoop 2 + Java 8 effort, we found that a few jobs are being killed by
> Hadoop due to excessive virtual memory allocation, although the physical
> memory usage is low.
> The most common error message is "Container [pid=??,containerID=container_??]
> is running beyond virtual memory limits. Current usage: 365.1 MB of 1 GB
> physical memory used; 3.2 GB of 2.1 GB virtual memory used. Killing
> container."
> We see this problem for MR jobs as well as in the Spark driver/executor.
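The "2.1 GB" limit in the error message is consistent with a 1 GB container multiplied by a virtual-to-physical memory ratio of 2.1, which is the default value of yarn.nodemanager.vmem-pmem-ratio. The sketch below reproduces that arithmetic. That this cluster used the default ratio is an assumption (the report does not say so), and the function name vmem_limit_gb is hypothetical.

```python
def vmem_limit_gb(container_gb, vmem_pmem_ratio=2.1):
    """Virtual memory ceiling for a container of the given physical size.

    Assumption: the ratio is YARN's default yarn.nodemanager.vmem-pmem-ratio.
    """
    return container_gb * vmem_pmem_ratio


# Figures taken from the error message quoted in the issue description.
container_gb = 1.0
used_vmem_gb = 3.2

limit_gb = vmem_limit_gb(container_gb)
over_limit = used_vmem_gb > limit_gb
print(f"limit={limit_gb:.1f} GB used={used_vmem_gb} GB over_limit={over_limit}")
```

Under this assumption the container exceeds its virtual memory ceiling by roughly 1.1 GB even though it uses well under half of its physical allocation.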