How are you submitting/running the job - via spark-submit or as a plain old
Java program?

If you are using spark-submit, you can control the executor heap size via the
configuration parameter spark.executor.memory, either in spark-defaults.conf or
on the spark-submit command line.
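
For example (the 4g value, class name, and jar name below are just
placeholders; size the heap for your workload):

  # spark-defaults.conf
  spark.executor.memory  4g

  # or equivalently, per job on the command line
  spark-submit --executor-memory 4g --class com.example.MyJob my-job.jar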

If you are running it as a Java program, use -Xmx to set the maximum heap
size.
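
A minimal sketch, again with placeholder jar/class names and heap size:

  # e.g. a 4 GB max heap when launching it as a plain Java application
  java -Xmx4g -cp my-job.jar com.example.MyJob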

On Thu, Feb 11, 2016 at 5:46 AM, Nirav Patel <npa...@xactlycorp.com> wrote:

> In YARN we have the following settings enabled so that a job can use virtual
> memory for capacity beyond physical memory, of course.
>
> <property>
>         <name>yarn.nodemanager.vmem-check-enabled</name>
>         <value>false</value>
> </property>
>
> <property>
>         <name>yarn.nodemanager.pmem-check-enabled</name>
>         <value>false</value>
> </property>
>
> The vmem to pmem ratio is 2:1. However, Spark doesn't seem to be able to
> utilize these vmem limits;
> we are getting the following heap space error, which seems to be contained
> within the Spark executor.
>
> 16/02/09 23:08:06 ERROR executor.CoarseGrainedExecutorBackend: RECEIVED SIGNAL 15: SIGTERM
> 16/02/09 23:08:06 ERROR executor.Executor: Exception in task 4.0 in stage 7.6 (TID 22363)
> java.lang.OutOfMemoryError: Java heap space
> at java.util.IdentityHashMap.resize(IdentityHashMap.java:469)
> at java.util.IdentityHashMap.put(IdentityHashMap.java:445)
> at org.apache.spark.util.SizeEstimator$SearchState.enqueue(SizeEstimator.scala:159)
> at org.apache.spark.util.SizeEstimator$$anonfun$visitSingleObject$1.apply(SizeEstimator.scala:203)
> at org.apache.spark.util.SizeEstimator$$anonfun$visitSingleObject$1.apply(SizeEstimator.scala:202)
> at scala.collection.immutable.List.foreach(List.scala:318)
> at org.apache.spark.util.SizeEstimator$.visitSingleObject(SizeEstimator.scala:202)
> at org.apache.spark.util.SizeEstimator$.org$apache$spark$util$SizeEstimator$$estimate(SizeEstimator.scala:186)
> at org.apache.spark.util.SizeEstimator$.estimate(SizeEstimator.scala:54)
> at org.apache.spark.util.collection.SizeTracker$class.takeSample(SizeTracker.scala:78)
> at org.apache.spark.util.collection.SizeTracker$class.afterUpdate(SizeTracker.scala:70)
> at org.apache.spark.util.collection.SizeTrackingVector.$plus$eq(SizeTrackingVector.scala:31)
> at org.apache.spark.storage.MemoryStore.unrollSafely(MemoryStore.scala:278)
> at org.apache.spark.CacheManager.putInBlockManager(CacheManager.scala:171)
> at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:78)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:262)
> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:300)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
> at org.apache.spark.scheduler.Task.run(Task.scala:88)
> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)
>
>
>
> The YARN resource manager doesn't give any indication of whether the
> container ran out of physical or virtual memory limits.
>
> Also, how can we profile this container's memory usage? We know our data is
> skewed, so some of the executors will have a lot of data (~2M RDD objects) to
> process. I used the following as executorJavaOpts but it doesn't seem to work:
> -XX:-HeapDumpOnOutOfMemoryError -XX:OnOutOfMemoryError='kill -3 %p'
> -XX:HeapDumpPath=/opt/cores/spark