In our spark-env.sh, we have:

export MESOS_NATIVE_LIBRARY=/usr/local/lib/libmesos.so
export ADD_JARS=/opt/spark/mx-lib/verrazano_2.9.3-0.1-SNAPSHOT-assembly.jar

if [ -z "$SPARK_JAVA_OPTS" ] ; then
  SPARK_JAVA_OPTS="-Xss20m -Dspark.local.dir=/opt/spark/tmp -Dspark.executor.memory=3g -Dspark.serializer=org.apache.spark.serializer.KryoSerializer -Dspark.kryo.registrator=com.mediacrossing.verrazano.kryo.MxDataRegistrator"
fi

# This is a workaround for https://spark-project.atlassian.net/browse/SPARK-896
if [ -z "$SPARK_CLASSPATH" ] ; then
  SPARK_CLASSPATH=$ADD_JARS
fi

This is set on both the shell and all of the slaves.

On Tue, Nov 19, 2013 at 7:09 PM, Ewen Cheslack-Postava <[email protected]> wrote:

> This line
>
> > 13/11/19 23:17:20 INFO MemoryStore: MemoryStore started with capacity 323.9 MB
>
> looks like what you'd get if you haven't set spark.executor.memory (or
> SPARK_MEM). Without setting it you'll get the default to 512m per
> executor and .66 of that for the cache.
>
> -Ewen
> -----
> Ewen Cheslack-Postava
> StraightUp | http://readstraightup.com
> [email protected]
> (201) 286-7785
>
>
> On Tue, Nov 19, 2013 at 3:54 PM, Gary Malouf <[email protected]> wrote:
> > To explain more, we upgraded from 0.7.3 to 0.9 incubating snapshot today and
> > are getting out of memory errors very quickly even though our cluster has
> > plenty of RAM and the data is relatively small:
> >
> > Using Scala version 2.9.3 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_21)
> > Initializing interpreter...
> > Creating SparkContext...
> > 13/11/19 23:17:20 INFO Slf4jEventHandler: Slf4jEventHandler started
> > 13/11/19 23:17:20 INFO SparkEnv: Registering BlockManagerMaster
> > 13/11/19 23:17:20 INFO DiskBlockManager: Created local directory at /opt/spark/tmp/spark-local-20131119231720-a023
> > 13/11/19 23:17:20 INFO MemoryStore: MemoryStore started with capacity 323.9 MB.
> > 13/11/19 23:17:20 INFO ConnectionManager: Bound socket to port 11240 with id = ConnectionManagerId(spark-shell-01,11240)
> > 13/11/19 23:17:20 INFO BlockManagerMaster: Trying to register BlockManager
> > 13/11/19 23:17:20 INFO BlockManagerMasterActor$BlockManagerInfo: Registering block manager spark-shell-01:11240 with 323.9 MB RAM
> > 13/11/19 23:25:17 INFO BlockManagerMasterActor$BlockManagerInfo: Registering block manager dn-02:50623 with 1943.0 MB RAM
> > 13/11/19 23:25:17 INFO BlockManagerMasterActor$BlockManagerInfo: Registering block manager dn-01:61960 with 1943.0 MB RAM
> > 13/11/19 23:25:18 INFO BlockManagerMasterActor$BlockManagerInfo: Registering block manager dn-03:45775 with 1943.0 MB RAM
> >
> > I've included memory store output for more information:
> >
> > 13/11/19 23:40:38 INFO MemoryStore: ensureFreeSpace(113598) called with curMem=0, maxMem=339585269
> > 13/11/19 23:40:38 INFO MemoryStore: Block broadcast_0 stored as values to memory (estimated size 110.9 KB, free 323.7 MB)
> > 13/11/19 23:40:38 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=113598, maxMem=339585269
> > 13/11/19 23:40:38 INFO MemoryStore: Block broadcast_1 stored as values to memory (estimated size 111.0 KB, free 323.6 MB)
> > 13/11/19 23:40:38 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=227244, maxMem=339585269
> > 13/11/19 23:40:38 INFO MemoryStore: Block broadcast_2 stored as values to memory (estimated size 111.0 KB, free 323.5 MB)
> > 13/11/19 23:40:38 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=340890, maxMem=339585269
> > 13/11/19 23:40:38 INFO MemoryStore: Block broadcast_3 stored as values to memory (estimated size 111.0 KB, free 323.4 MB)
> > 13/11/19 23:40:38 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=454536, maxMem=339585269
> > 13/11/19 23:40:38 INFO MemoryStore: Block broadcast_4 stored as values to memory (estimated size 111.0 KB, free 323.3 MB)
> > 13/11/19 23:40:39 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=568182, maxMem=339585269
> > 13/11/19 23:40:39 INFO MemoryStore: Block broadcast_5 stored as values to memory (estimated size 111.0 KB, free 323.2 MB)
> > 13/11/19 23:40:39 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=681828, maxMem=339585269
> > 13/11/19 23:40:39 INFO MemoryStore: Block broadcast_6 stored as values to memory (estimated size 111.0 KB, free 323.1 MB)
> > 13/11/19 23:40:39 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=795474, maxMem=339585269
> > 13/11/19 23:40:39 INFO MemoryStore: Block broadcast_7 stored as values to memory (estimated size 111.0 KB, free 323.0 MB)
> > 13/11/19 23:40:39 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=909120, maxMem=339585269
> > 13/11/19 23:40:39 INFO MemoryStore: Block broadcast_8 stored as values to memory (estimated size 111.0 KB, free 322.9 MB)
> > 13/11/19 23:40:39 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=1022766, maxMem=339585269
> > 13/11/19 23:40:39 INFO MemoryStore: Block broadcast_9 stored as values to memory (estimated size 111.0 KB, free 322.8 MB)
> > 13/11/19 23:40:39 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=1136412, maxMem=339585269
> > 13/11/19 23:40:39 INFO MemoryStore: Block broadcast_10 stored as values to memory (estimated size 111.0 KB, free 322.7 MB)
> > 13/11/19 23:40:39 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=1250058, maxMem=339585269
> > 13/11/19 23:40:39 INFO MemoryStore: Block broadcast_11 stored as values to memory (estimated size 111.0 KB, free 322.6 MB)
> > 13/11/19 23:40:39 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=1363704, maxMem=339585269
> > 13/11/19 23:40:39 INFO MemoryStore: Block broadcast_12 stored as values to memory (estimated size 111.0 KB, free 322.4 MB)
> > 13/11/19 23:40:39 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=1477350, maxMem=339585269
> > 13/11/19 23:40:39 INFO MemoryStore: Block broadcast_13 stored as values to memory (estimated size 111.0 KB, free 322.3 MB)
> > 13/11/19 23:40:39 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=1590996, maxMem=339585269
> > 13/11/19 23:40:39 INFO MemoryStore: Block broadcast_14 stored as values to memory (estimated size 111.0 KB, free 322.2 MB)
> > 13/11/19 23:40:39 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=1704642, maxMem=339585269
> > 13/11/19 23:40:39 INFO MemoryStore: Block broadcast_15 stored as values to memory (estimated size 111.0 KB, free 322.1 MB)
> > 13/11/19 23:40:39 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=1818288, maxMem=339585269
> > 13/11/19 23:40:39 INFO MemoryStore: Block broadcast_16 stored as values to memory (estimated size 111.0 KB, free 322.0 MB)
> > 13/11/19 23:40:39 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=1931934, maxMem=339585269
> > 13/11/19 23:40:39 INFO MemoryStore: Block broadcast_17 stored as values to memory (estimated size 111.0 KB, free 321.9 MB)
> > 13/11/19 23:40:39 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=2045580, maxMem=339585269
> > 13/11/19 23:40:39 INFO MemoryStore: Block broadcast_18 stored as values to memory (estimated size 111.0 KB, free 321.8 MB)
> > 13/11/19 23:40:39 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=2159226, maxMem=339585269
> > 13/11/19 23:40:39 INFO MemoryStore: Block broadcast_19 stored as values to memory (estimated size 111.0 KB, free 321.7 MB)
> > 13/11/19 23:40:39 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=2272872, maxMem=339585269
> > 13/11/19 23:40:39 INFO MemoryStore: Block broadcast_20 stored as values to memory (estimated size 111.0 KB, free 321.6 MB)
> > 13/11/19 23:40:40 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=2386518, maxMem=339585269
> > 13/11/19 23:40:40 INFO MemoryStore: Block broadcast_21 stored as values to memory (estimated size 111.0 KB, free 321.5 MB)
> > 13/11/19 23:40:40 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=2500164, maxMem=339585269
> > 13/11/19 23:40:40 INFO MemoryStore: Block broadcast_22 stored as values to memory (estimated size 111.0 KB, free 321.4 MB)
> > 13/11/19 23:40:40 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=2613810, maxMem=339585269
> > 13/11/19 23:40:40 INFO MemoryStore: Block broadcast_23 stored as values to memory (estimated size 111.0 KB, free 321.3 MB)
> > 13/11/19 23:40:40 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=2727456, maxMem=339585269
> > 13/11/19 23:40:40 INFO MemoryStore: Block broadcast_24 stored as values to memory (estimated size 111.0 KB, free 321.1 MB)
> > 13/11/19 23:40:40 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=2841102, maxMem=339585269
> > 13/11/19 23:40:40 INFO MemoryStore: Block broadcast_25 stored as values to memory (estimated size 111.0 KB, free 321.0 MB)
> > 13/11/19 23:40:40 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=2954748, maxMem=339585269
> > 13/11/19 23:40:40 INFO MemoryStore: Block broadcast_26 stored as values to memory (estimated size 111.0 KB, free 320.9 MB)
> > 13/11/19 23:40:40 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=3068394, maxMem=339585269
> > 13/11/19 23:40:40 INFO MemoryStore: Block broadcast_27 stored as values to memory (estimated size 111.0 KB, free 320.8 MB)
> > 13/11/19 23:40:40 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=3182040, maxMem=339585269
> > 13/11/19 23:40:40 INFO MemoryStore: Block broadcast_28 stored as values to memory (estimated size 111.0 KB, free 320.7 MB)
> > 13/11/19 23:40:40 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=3295686, maxMem=339585269
> > 13/11/19 23:40:40 INFO MemoryStore: Block broadcast_29 stored as values to memory (estimated size 111.0 KB, free 320.6 MB)
> > 13/11/19 23:40:40 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=3409332, maxMem=339585269
> > 13/11/19 23:40:40 INFO MemoryStore: Block broadcast_30 stored as values to memory (estimated size 111.0 KB, free 320.5 MB)
> > 13/11/19 23:40:40 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=3522978, maxMem=339585269
> > 13/11/19 23:40:40 INFO MemoryStore: Block broadcast_31 stored as values to memory (estimated size 111.0 KB, free 320.4 MB)
> > 13/11/19 23:40:40 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=3636624, maxMem=339585269
> > 13/11/19 23:40:40 INFO MemoryStore: Block broadcast_32 stored as values to memory (estimated size 111.0 KB, free 320.3 MB)
> > 13/11/19 23:40:40 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=3750270, maxMem=339585269
> > 13/11/19 23:40:40 INFO MemoryStore: Block broadcast_33 stored as values to memory (estimated size 111.0 KB, free 320.2 MB)
> > 13/11/19 23:40:40 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=3863916, maxMem=339585269
> > 13/11/19 23:40:40 INFO MemoryStore: Block broadcast_34 stored as values to memory (estimated size 111.0 KB, free 320.1 MB)
> > 13/11/19 23:40:40 INFO MemoryStore..
> >
> > Thanks,
> >
> > Gary
> >
> >
> > On Tue, Nov 19, 2013 at 6:22 PM, Gary Malouf <[email protected]> wrote:
> >>
> >> We have a 4 node Spark cluster with 3 gigs of ram available per executor
> >> (via the spark.executor.memory setting).  When we run a Spark job, we see
> >> the following output:
> >>
> >> Using Scala version 2.9.3 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_21)
> >> Initializing interpreter...
> >> Creating SparkContext...
> >> 13/11/19 23:17:20 INFO Slf4jEventHandler: Slf4jEventHandler started
> >> 13/11/19 23:17:20 INFO SparkEnv: Registering BlockManagerMaster
> >> 13/11/19 23:17:20 INFO DiskBlockManager: Created local directory at /opt/spark/tmp/spark-local-20131119231720-a023
> >> 13/11/19 23:17:20 INFO MemoryStore: MemoryStore started with capacity 323.9 MB.
> >> 13/11/19 23:17:20 INFO ConnectionManager: Bound socket to port 11240 with id = ConnectionManagerId(spark-shell-01,11240)
> >> 13/11/19 23:17:20 INFO BlockManagerMaster: Trying to register BlockManager
> >> 13/11/19 23:17:20 INFO BlockManagerMasterActor$BlockManagerInfo: Registering block manager spark-shell-01:11240 with 323.9 MB RAM
> >>
> >> Is this right?  I feel like much more RAM should be available.
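
For anyone who wants to check the numbers in this thread: Ewen's reply says the cache gets .66 of the executor heap, so dividing a reported MemoryStore capacity by 0.66 should give a rough idea of the heap that JVM was actually started with. The sketch below does that arithmetic for the figures in the logs. The function names are made up for illustration, the 0.66 fraction is taken from the reply rather than read from any configuration, and it assumes the log's "MB" means 1048576 bytes.

```shell
#!/bin/sh
# Fraction of the heap given to the cache; 0.66 is the figure quoted in
# Ewen's reply (an assumption here, not read from the running cluster).
MEMORY_FRACTION=0.66

# Convert a byte count to MB (assuming 1 MB = 1048576 bytes), one decimal.
bytes_to_mb() {
    awk -v b="$1" 'BEGIN { printf "%.1f\n", b / 1048576 }'
}

# Estimate the heap size in MB implied by a MemoryStore capacity in MB.
implied_heap_mb() {
    awk -v c="$1" -v f="$MEMORY_FRACTION" 'BEGIN { printf "%.0f\n", c / f }'
}

bytes_to_mb 339585269     # maxMem from the log      -> 323.9
implied_heap_mb 323.9     # spark-shell-01           -> 491
implied_heap_mb 1943.0    # dn-01, dn-02 and dn-03   -> 2944
```

Roughly 491 MB is close to the 512m default, and roughly 2944 MB is close to 3g (a JVM usually reports a little less usable heap than its configured maximum). If that reading is right, the three dn-* executors did pick up spark.executor.memory=3g, and only the JVM on spark-shell-01 is running at the default size.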
