how to set spark.executor.memory and heap size

wxhsdp Wed, 23 Apr 2014 20:22:27 -0700

Hi,
I'm testing SimpleApp.scala in standalone mode on a single PC, so I have
one master and one local worker on the same machine.


Even with a rather small input file (4.5K), I get a
java.lang.OutOfMemoryError: Java heap space error.

Here are my settings:
spark-env.sh:
export SPARK_MASTER_IP="127.0.0.1"
export SPARK_WORKER_CORES=1
export SPARK_WORKER_MEMORY=2g
export SPARK_JAVA_OPTS+=" -Xms512m -Xmx512m "  # (1)

SimpleApp.scala:
    val conf = new SparkConf()
      .setMaster("spark://127.0.0.1:7077")
      .setAppName("Simple App")
      .set("spark.executor.memory", "1g")  //(2)
    val sc = new SparkContext(conf)

sbt:
SBT_OPTS="-Xms512M -Xmx512M"  # (3)
java $SBT_OPTS -jar `dirname $0`/sbt-launch.jar "$@"

I'm confused by settings (1), (2), and (3) above. I tried several different
options, but all of them failed
with java.lang.OutOfMemoryError :(
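
One way to see which of these settings actually reached a given process is to print the heap flags from inside that JVM. The following is a minimal sketch, not part of SimpleApp.scala; the object name HeapFlagsSketch and the helper heapFlags are hypothetical, and it assumes that each of (1), (2), and (3) may end up applying to a different JVM (the sbt launcher running the driver, the worker, the executor).

```scala
import java.lang.management.ManagementFactory
import scala.collection.JavaConverters._

object HeapFlagsSketch {
  // Hypothetical helper: keep only the -Xms/-Xmx flags from a list of JVM arguments.
  def heapFlags(jvmArgs: Seq[String]): Seq[String] =
    jvmArgs.filter(a => a.startsWith("-Xms") || a.startsWith("-Xmx"))

  def main(args: Array[String]): Unit = {
    // Arguments the current JVM was started with (does not include program args).
    val jvmArgs = ManagementFactory.getRuntimeMXBean.getInputArguments.asScala.toSeq
    println("Heap flags seen by this JVM: " + heapFlags(jvmArgs).mkString(" "))
    // Maximum heap the JVM will try to use, in bytes.
    println("Max heap (bytes): " + Runtime.getRuntime.maxMemory)
  }
}
```

Run under the sbt launcher script shown above, this would be expected to report the values from SBT_OPTS rather than spark.executor.memory, since it inspects only the JVM it runs in.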

What's the difference between the JVM heap size and spark.executor.memory, and
how should I set them?

I've read some docs and still cannot fully understand them:

spark.executor.memory: Amount of memory to use per executor process, in the
same format as JVM memory strings (e.g. 512m, 2g).

spark.storage.memoryFraction: Fraction of Java heap to use for Spark's
memory cache.

So the memory cache size = spark.storage.memoryFraction (0.6) * spark.executor.memory?

Does that mean spark.executor.memory = JVM heap size?
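
The arithmetic behind that question can be sketched as follows. This is a rough illustration only: it assumes the memory cache capacity is simply the fraction (0.6, as quoted above) multiplied by the JVM's maximum heap, and the names StorageMemorySketch, cacheCapacityBytes, and toMb are hypothetical. The maxMem=311387750 value that appears in the logs below converts to about 297.0 MB, which matches the reported MemoryStore capacity.

```scala
object StorageMemorySketch {
  // Assumed relation: cache capacity = memoryFraction * max heap (in bytes).
  def cacheCapacityBytes(maxHeapBytes: Long, memoryFraction: Double = 0.6): Long =
    (maxHeapBytes * memoryFraction).toLong

  // Convert bytes to megabytes (1 MB = 1024 * 1024 bytes).
  def toMb(bytes: Long): Double = bytes / (1024.0 * 1024.0)

  def main(args: Array[String]): Unit = {
    // Max heap of the JVM running this code; usually a bit below the -Xmx value.
    val maxHeap = Runtime.getRuntime.maxMemory
    println(f"JVM max heap: ${toMb(maxHeap)}%.1f MB")
    println(f"Cache capacity at fraction 0.6: ${toMb(cacheCapacityBytes(maxHeap))}%.1f MB")
  }
}
```

Comparing the printed capacity with the "MemoryStore started with capacity" log line is one way to check which heap size a process really received.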

Here are the logs:
[info] Running SimpleApp 
14/04/24 10:59:41 WARN util.Utils: Your hostname, ubuntu resolves to a
loopback address: 127.0.1.1; using 192.168.0.113 instead (on interface eth0)
14/04/24 10:59:41 WARN util.Utils: Set SPARK_LOCAL_IP if you need to bind to
another address
14/04/24 10:59:42 INFO slf4j.Slf4jLogger: Slf4jLogger started
14/04/24 10:59:42 INFO Remoting: Starting remoting
14/04/24 10:59:42 INFO Remoting: Remoting started; listening on addresses
:[akka.tcp://spark@ubuntu.local:46864]
14/04/24 10:59:42 INFO Remoting: Remoting now listens on addresses:
[akka.tcp://spark@ubuntu.local:46864]
14/04/24 10:59:42 INFO spark.SparkEnv: Registering BlockManagerMaster
14/04/24 10:59:42 INFO storage.DiskBlockManager: Created local directory at
/tmp/spark-local-20140424105942-362c
14/04/24 10:59:42 INFO storage.MemoryStore: MemoryStore started with
capacity 297.0 MB.
14/04/24 10:59:42 INFO network.ConnectionManager: Bound socket to port 34146
with id = ConnectionManagerId(ubuntu.local,34146)
14/04/24 10:59:42 INFO storage.BlockManagerMaster: Trying to register
BlockManager
14/04/24 10:59:42 INFO storage.BlockManagerMasterActor$BlockManagerInfo:
Registering block manager ubuntu.local:34146 with 297.0 MB RAM
14/04/24 10:59:42 INFO storage.BlockManagerMaster: Registered BlockManager
14/04/24 10:59:43 INFO spark.HttpServer: Starting HTTP Server
14/04/24 10:59:43 INFO server.Server: jetty-7.6.8.v20121106
14/04/24 10:59:43 INFO server.AbstractConnector: Started
SocketConnector@0.0.0.0:58936
14/04/24 10:59:43 INFO broadcast.HttpBroadcast: Broadcast server started at
http://192.168.0.113:58936
14/04/24 10:59:43 INFO spark.SparkEnv: Registering MapOutputTracker
14/04/24 10:59:43 INFO spark.HttpFileServer: HTTP File server directory is
/tmp/spark-ce78fc2c-097d-4053-991d-b6bf140d6c33
14/04/24 10:59:43 INFO spark.HttpServer: Starting HTTP Server
14/04/24 10:59:43 INFO server.Server: jetty-7.6.8.v20121106
14/04/24 10:59:43 INFO server.AbstractConnector: Started
SocketConnector@0.0.0.0:56414
14/04/24 10:59:43 INFO server.Server: jetty-7.6.8.v20121106
14/04/24 10:59:43 INFO handler.ContextHandler: started
o.e.j.s.h.ContextHandler{/storage/rdd,null}
14/04/24 10:59:43 INFO handler.ContextHandler: started
o.e.j.s.h.ContextHandler{/storage,null}
14/04/24 10:59:43 INFO handler.ContextHandler: started
o.e.j.s.h.ContextHandler{/stages/stage,null}
14/04/24 10:59:43 INFO handler.ContextHandler: started
o.e.j.s.h.ContextHandler{/stages/pool,null}
14/04/24 10:59:43 INFO handler.ContextHandler: started
o.e.j.s.h.ContextHandler{/stages,null}
14/04/24 10:59:43 INFO handler.ContextHandler: started
o.e.j.s.h.ContextHandler{/environment,null}
14/04/24 10:59:43 INFO handler.ContextHandler: started
o.e.j.s.h.ContextHandler{/executors,null}
14/04/24 10:59:43 INFO handler.ContextHandler: started
o.e.j.s.h.ContextHandler{/metrics/json,null}
14/04/24 10:59:43 INFO handler.ContextHandler: started
o.e.j.s.h.ContextHandler{/static,null}
14/04/24 10:59:43 INFO handler.ContextHandler: started
o.e.j.s.h.ContextHandler{/,null}
14/04/24 10:59:43 INFO server.AbstractConnector: Started
SelectChannelConnector@0.0.0.0:4040
14/04/24 10:59:43 INFO ui.SparkUI: Started Spark Web UI at
http://ubuntu.local:4040
14/04/24 10:59:43 INFO client.AppClient$ClientActor: Connecting to master
spark://127.0.0.1:7077...
14/04/24 10:59:44 INFO cluster.SparkDeploySchedulerBackend: Connected to
Spark cluster with app ID app-20140424105944-0001
14/04/24 10:59:44 INFO client.AppClient$ClientActor: Executor added:
app-20140424105944-0001/0 on worker-20140424105022-ubuntu.local-40058
(ubuntu.local:40058) with 1 cores
14/04/24 10:59:44 INFO cluster.SparkDeploySchedulerBackend: Granted executor
ID app-20140424105944-0001/0 on hostPort ubuntu.local:40058 with 1 cores,
1024.0 MB RAM
14/04/24 10:59:44 INFO client.AppClient$ClientActor: Executor updated:
app-20140424105944-0001/0 is now RUNNING
14/04/24 10:59:45 INFO storage.MemoryStore: ensureFreeSpace(32960) called
with curMem=0, maxMem=311387750
14/04/24 10:59:45 INFO storage.MemoryStore: Block broadcast_0 stored as
values to memory (estimated size 32.2 KB, free 296.9 MB)
14/04/24 10:59:45 WARN util.NativeCodeLoader: Unable to load native-hadoop
library for your platform... using builtin-java classes where applicable
14/04/24 10:59:45 WARN snappy.LoadSnappy: Snappy native library not loaded
14/04/24 10:59:45 INFO mapred.FileInputFormat: Total input paths to process
: 1
14/04/24 10:59:45 INFO spark.SparkContext: Starting job: count at
SimpleApp.scala:27
14/04/24 10:59:45 INFO scheduler.DAGScheduler: Got job 0 (count at
SimpleApp.scala:27) with 2 output partitions (allowLocal=false)
14/04/24 10:59:45 INFO scheduler.DAGScheduler: Final stage: Stage 0 (count
at SimpleApp.scala:27)
14/04/24 10:59:45 INFO scheduler.DAGScheduler: Parents of final stage:
List()
14/04/24 10:59:45 INFO scheduler.DAGScheduler: Missing parents: List()
14/04/24 10:59:45 INFO scheduler.DAGScheduler: Submitting Stage 0
(MappedRDD[1] at textFile at SimpleApp.scala:25), which has no missing
parents
14/04/24 10:59:45 INFO scheduler.DAGScheduler: Submitting 2 missing tasks
from Stage 0 (MappedRDD[1] at textFile at SimpleApp.scala:25)
14/04/24 10:59:45 INFO scheduler.TaskSchedulerImpl: Adding task set 0.0 with
2 tasks
14/04/24 10:59:46 INFO cluster.SparkDeploySchedulerBackend: Registered
executor:
Actor[akka.tcp://sparkExecutor@ubuntu.local:41819/user/Executor#84992753]
with ID 0
14/04/24 10:59:47 INFO scheduler.TaskSetManager: Starting task 0.0:0 as TID
0 on executor 0: ubuntu.local (PROCESS_LOCAL)
14/04/24 10:59:47 INFO scheduler.TaskSetManager: Serialized task 0.0:0 as
1563 bytes in 17 ms
14/04/24 10:59:47 INFO storage.BlockManagerMasterActor$BlockManagerInfo:
Registering block manager ubuntu.local:60938 with 593.9 MB RAM
14/04/24 10:59:49 INFO scheduler.TaskSetManager: Starting task 0.0:1 as TID
1 on executor 0: ubuntu.local (PROCESS_LOCAL)
14/04/24 10:59:49 INFO scheduler.TaskSetManager: Serialized task 0.0:1 as
1563 bytes in 0 ms
14/04/24 10:59:49 WARN scheduler.TaskSetManager: Lost TID 0 (task 0.0:0)
14/04/24 10:59:49 WARN scheduler.TaskSetManager: Loss was due to
java.lang.OutOfMemoryError
java.lang.OutOfMemoryError: Java heap space
        at
org.apache.hadoop.io.WritableUtils.readCompressedStringArray(WritableUtils.java:183)
        at 
org.apache.hadoop.conf.Configuration.readFields(Configuration.java:2378)
        at 
org.apache.hadoop.io.ObjectWritable.readObject(ObjectWritable.java:285)
        at 
org.apache.hadoop.io.ObjectWritable.readFields(ObjectWritable.java:77)
        at
org.apache.spark.SerializableWritable.readObject(SerializableWritable.scala:39)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at 
java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1891)
        at
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
        at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
        at
org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:40)
        at 
org.apache.spark.broadcast.HttpBroadcast$.read(HttpBroadcast.scala:165)
        at
org.apache.spark.broadcast.HttpBroadcast.readObject(HttpBroadcast.scala:56)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at 
java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1891)
        at
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
        at 
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1913)
        at
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
        at 
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1913)
        at
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)

I'd appreciate your help.



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/how-to-set-spark-executor-memory-and-heap-size-tp4719.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
