Spark Application Master on Yarn client mode - Virtual memory limit

Nirav Patel Wed, 10 Feb 2016 16:17:45 -0800

In Yarn we have following settings enabled so that job can use virtual
memory to have a capacity beyond physical memory off course.


<property>
        <name>yarn.nodemanager.vmem-check-enabled</name>
        <value>false</value>
</property>

<property>
        <name>yarn.nodemanager.pmem-check-enabled</name>
        <value>false</value>
</property>

vmem to pmem ration is 2:1. However spark doesn't seem to be able to
utilize this vmem limits
we are getting following heap space error which seemed to be contained
within spark executor.

16/02/09 23:08:06 ERROR executor.CoarseGrainedExecutorBackend: RECEIVED
SIGNAL 15: SIGTERM
16/02/09 23:08:06 ERROR executor.Executor: Exception in task 4.0 in stage
7.6 (TID 22363)
java.lang.OutOfMemoryError: Java heap space
at java.util.IdentityHashMap.resize(IdentityHashMap.java:469)
at java.util.IdentityHashMap.put(IdentityHashMap.java:445)
at
org.apache.spark.util.SizeEstimator$SearchState.enqueue(SizeEstimator.scala:159)
at
org.apache.spark.util.SizeEstimator$$anonfun$visitSingleObject$1.apply(SizeEstimator.scala:203)
at
org.apache.spark.util.SizeEstimator$$anonfun$visitSingleObject$1.apply(SizeEstimator.scala:202)
at scala.collection.immutable.List.foreach(List.scala:318)
at
org.apache.spark.util.SizeEstimator$.visitSingleObject(SizeEstimator.scala:202)
at
org.apache.spark.util.SizeEstimator$.org$apache$spark$util$SizeEstimator$$estimate(SizeEstimator.scala:186)
at org.apache.spark.util.SizeEstimator$.estimate(SizeEstimator.scala:54)
at
org.apache.spark.util.collection.SizeTracker$class.takeSample(SizeTracker.scala:78)
at
org.apache.spark.util.collection.SizeTracker$class.afterUpdate(SizeTracker.scala:70)
at
org.apache.spark.util.collection.SizeTrackingVector.$plus$eq(SizeTrackingVector.scala:31)
at org.apache.spark.storage.MemoryStore.unrollSafely(MemoryStore.scala:278)
at org.apache.spark.CacheManager.putInBlockManager(CacheManager.scala:171)
at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:78)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:262)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:300)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:88)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)



Yarn resource manager doesn't give any indication that whether container
ran out of phycial or virtual memory limits.

Also how to profile this container memory usage? We know our data is skewed
so some of the executor will have large data (~2M RDD objects) to process.
I used following as executorJavaOpts but it doesn't seem to work.
-XX:-HeapDumpOnOutOfMemoryError -XX:OnOutOfMemoryError='kill -3 %p'
-XX:HeapDumpPath=/opt/cores/spark

-- 


[image: What's New with Xactly] <http://www.xactlycorp.com/email-click/>

<https://www.nyse.com/quote/XNYS:XTLY>  [image: LinkedIn] 
<https://www.linkedin.com/company/xactly-corporation>  [image: Twitter] 
<https://twitter.com/Xactly>  [image: Facebook] 
<https://www.facebook.com/XactlyCorp>  [image: YouTube] 
<http://www.youtube.com/xactlycorporation>

Spark Application Master on Yarn client mode - Virtual memory limit

Reply via email to