This line

> 13/11/19 23:17:20 INFO MemoryStore: MemoryStore started with capacity 323.9 MB

looks like what you'd get if you haven't set spark.executor.memory (or SPARK_MEM). Without setting it, you'll get the default of 512m per executor and .66 of that for the cache.

-Ewen

-----
Ewen Cheslack-Postava
StraightUp | http://readstraightup.com
(201) 286-7785

On Tue, Nov 19, 2013 at 3:54 PM, Gary Malouf wrote:
> To explain more, we upgraded from 0.7.3 to 0.9 incubating snapshot today and are getting out of memory errors very quickly even though our cluster has plenty of RAM and the data is relatively small:
>
> Using Scala version 2.9.3 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_21)
> Initializing interpreter...
> Creating SparkContext...
> 13/11/19 23:17:20 INFO Slf4jEventHandler: Slf4jEventHandler started
> 13/11/19 23:17:20 INFO SparkEnv: Registering BlockManagerMaster
> 13/11/19 23:17:20 INFO DiskBlockManager: Created local directory at /opt/spark/tmp/spark-local-20131119231720-a023
> 13/11/19 23:17:20 INFO MemoryStore: MemoryStore started with capacity 323.9 MB.
> 13/11/19 23:17:20 INFO ConnectionManager: Bound socket to port 11240 with id = ConnectionManagerId(spark-shell-01,11240)
> 13/11/19 23:17:20 INFO BlockManagerMaster: Trying to register BlockManager
> 13/11/19 23:17:20 INFO BlockManagerMasterActor$BlockManagerInfo: Registering block manager spark-shell-01:11240 with 323.9 MB RAM
> 13/11/19 23:25:17 INFO BlockManagerMasterActor$BlockManagerInfo: Registering block manager dn-02:50623 with 1943.0 MB RAM
> 13/11/19 23:25:17 INFO BlockManagerMasterActor$BlockManagerInfo: Registering block manager dn-01:61960 with 1943.0 MB RAM
> 13/11/19 23:25:18 INFO BlockManagerMasterActor$BlockManagerInfo: Registering block manager dn-03:45775 with 1943.0 MB RAM
>
> I've included memory store output for more information:
>
> 13/11/19 23:40:38 INFO MemoryStore: ensureFreeSpace(113598) called with curMem=0, maxMem=339585269
> 13/11/19 23:40:38 INFO MemoryStore: Block broadcast_0 stored as values to memory (estimated size 110.9 KB, free 323.7 MB)
> 13/11/19 23:40:38 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=113598, maxMem=339585269
> 13/11/19 23:40:38 INFO MemoryStore: Block broadcast_1 stored as values to memory (estimated size 111.0 KB, free 323.6 MB)
> 13/11/19 23:40:38 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=227244, maxMem=339585269
> 13/11/19 23:40:38 INFO MemoryStore: Block broadcast_2 stored as values to memory (estimated size 111.0 KB, free 323.5 MB)
> 13/11/19 23:40:38 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=340890, maxMem=339585269
> 13/11/19 23:40:38 INFO MemoryStore: Block broadcast_3 stored as values to memory (estimated size 111.0 KB, free 323.4 MB)
> 13/11/19 23:40:38 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=454536, maxMem=339585269
> 13/11/19 23:40:38 INFO MemoryStore: Block broadcast_4 stored as values to memory (estimated size 111.0 KB, free 323.3 MB)
> 13/11/19 23:40:39 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=568182, maxMem=339585269
> 13/11/19 23:40:39 INFO MemoryStore: Block broadcast_5 stored as values to memory (estimated size 111.0 KB, free 323.2 MB)
> 13/11/19 23:40:39 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=681828, maxMem=339585269
> 13/11/19 23:40:39 INFO MemoryStore: Block broadcast_6 stored as values to memory (estimated size 111.0 KB, free 323.1 MB)
> 13/11/19 23:40:39 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=795474, maxMem=339585269
> 13/11/19 23:40:39 INFO MemoryStore: Block broadcast_7 stored as values to memory (estimated size 111.0 KB, free 323.0 MB)
> 13/11/19 23:40:39 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=909120, maxMem=339585269
> 13/11/19 23:40:39 INFO MemoryStore: Block broadcast_8 stored as values to memory (estimated size 111.0 KB, free 322.9 MB)
> 13/11/19 23:40:39 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=1022766, maxMem=339585269
> 13/11/19 23:40:39 INFO MemoryStore: Block broadcast_9 stored as values to memory (estimated size 111.0 KB, free 322.8 MB)
> 13/11/19 23:40:39 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=1136412, maxMem=339585269
> 13/11/19 23:40:39 INFO MemoryStore: Block broadcast_10 stored as values to memory (estimated size 111.0 KB, free 322.7 MB)
> 13/11/19 23:40:39 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=1250058, maxMem=339585269
> 13/11/19 23:40:39 INFO MemoryStore: Block broadcast_11 stored as values to memory (estimated size 111.0 KB, free 322.6 MB)
> 13/11/19 23:40:39 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=1363704, maxMem=339585269
> 13/11/19 23:40:39 INFO MemoryStore: Block broadcast_12 stored as values to memory (estimated size 111.0 KB, free 322.4 MB)
> 13/11/19 23:40:39 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=1477350, maxMem=339585269
> 13/11/19 23:40:39 INFO MemoryStore: Block broadcast_13 stored as values to memory (estimated size 111.0 KB, free 322.3 MB)
> 13/11/19 23:40:39 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=1590996, maxMem=339585269
> 13/11/19 23:40:39 INFO MemoryStore: Block broadcast_14 stored as values to memory (estimated size 111.0 KB, free 322.2 MB)
> 13/11/19 23:40:39 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=1704642, maxMem=339585269
> 13/11/19 23:40:39 INFO MemoryStore: Block broadcast_15 stored as values to memory (estimated size 111.0 KB, free 322.1 MB)
> 13/11/19 23:40:39 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=1818288, maxMem=339585269
> 13/11/19 23:40:39 INFO MemoryStore: Block broadcast_16 stored as values to memory (estimated size 111.0 KB, free 322.0 MB)
> 13/11/19 23:40:39 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=1931934, maxMem=339585269
> 13/11/19 23:40:39 INFO MemoryStore: Block broadcast_17 stored as values to memory (estimated size 111.0 KB, free 321.9 MB)
> 13/11/19 23:40:39 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=2045580, maxMem=339585269
> 13/11/19 23:40:39 INFO MemoryStore: Block broadcast_18 stored as values to memory (estimated size 111.0 KB, free 321.8 MB)
> 13/11/19 23:40:39 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=2159226, maxMem=339585269
> 13/11/19 23:40:39 INFO MemoryStore: Block broadcast_19 stored as values to memory (estimated size 111.0 KB, free 321.7 MB)
> 13/11/19 23:40:39 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=2272872, maxMem=339585269
> 13/11/19 23:40:39 INFO MemoryStore: Block broadcast_20 stored as values to memory (estimated size 111.0 KB, free 321.6 MB)
> 13/11/19 23:40:40 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=2386518, maxMem=339585269
> 13/11/19 23:40:40 INFO MemoryStore: Block broadcast_21 stored as values to memory (estimated size 111.0 KB, free 321.5 MB)
> 13/11/19 23:40:40 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=2500164, maxMem=339585269
> 13/11/19 23:40:40 INFO MemoryStore: Block broadcast_22 stored as values to memory (estimated size 111.0 KB, free 321.4 MB)
> 13/11/19 23:40:40 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=2613810, maxMem=339585269
> 13/11/19 23:40:40 INFO MemoryStore: Block broadcast_23 stored as values to memory (estimated size 111.0 KB, free 321.3 MB)
> 13/11/19 23:40:40 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=2727456, maxMem=339585269
> 13/11/19 23:40:40 INFO MemoryStore: Block broadcast_24 stored as values to memory (estimated size 111.0 KB, free 321.1 MB)
> 13/11/19 23:40:40 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=2841102, maxMem=339585269
> 13/11/19 23:40:40 INFO MemoryStore: Block broadcast_25 stored as values to memory (estimated size 111.0 KB, free 321.0 MB)
> 13/11/19 23:40:40 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=2954748, maxMem=339585269
> 13/11/19 23:40:40 INFO MemoryStore: Block broadcast_26 stored as values to memory (estimated size 111.0 KB, free 320.9 MB)
> 13/11/19 23:40:40 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=3068394, maxMem=339585269
> 13/11/19 23:40:40 INFO MemoryStore: Block broadcast_27 stored as values to memory (estimated size 111.0 KB, free 320.8 MB)
> 13/11/19 23:40:40 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=3182040, maxMem=339585269
> 13/11/19 23:40:40 INFO MemoryStore: Block broadcast_28 stored as values to memory (estimated size 111.0 KB, free 320.7 MB)
> 13/11/19 23:40:40 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=3295686, maxMem=339585269
> 13/11/19 23:40:40 INFO MemoryStore: Block broadcast_29 stored as values to memory (estimated size 111.0 KB, free 320.6 MB)
> 13/11/19 23:40:40 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=3409332, maxMem=339585269
> 13/11/19 23:40:40 INFO MemoryStore: Block broadcast_30 stored as values to memory (estimated size 111.0 KB, free 320.5 MB)
> 13/11/19 23:40:40 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=3522978, maxMem=339585269
> 13/11/19 23:40:40 INFO MemoryStore: Block broadcast_31 stored as values to memory (estimated size 111.0 KB, free 320.4 MB)
> 13/11/19 23:40:40 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=3636624, maxMem=339585269
> 13/11/19 23:40:40 INFO MemoryStore: Block broadcast_32 stored as values to memory (estimated size 111.0 KB, free 320.3 MB)
> 13/11/19 23:40:40 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=3750270, maxMem=339585269
> 13/11/19 23:40:40 INFO MemoryStore: Block broadcast_33 stored as values to memory (estimated size 111.0 KB, free 320.2 MB)
> 13/11/19 23:40:40 INFO MemoryStore: ensureFreeSpace(113646) called with curMem=3863916, maxMem=339585269
> 13/11/19 23:40:40 INFO MemoryStore: Block broadcast_34 stored as values to memory (estimated size 111.0 KB, free 320.1 MB)
> 13/11/19 23:40:40 INFO MemoryStore..
>
> Thanks,
>
> Gary
>
> On Tue, Nov 19, 2013 at 6:22 PM, Gary Malouf wrote:
>>
>> We have a 4 node Spark cluster with 3 gigs of ram available per executor (via the spark.executor.memory setting). When we run a Spark job, we see the following output:
>>
>> Using Scala version 2.9.3 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_21)
>> Initializing interpreter...
>> Creating SparkContext...
>> 13/11/19 23:17:20 INFO Slf4jEventHandler: Slf4jEventHandler started
>> 13/11/19 23:17:20 INFO SparkEnv: Registering BlockManagerMaster
>> 13/11/19 23:17:20 INFO DiskBlockManager: Created local directory at /opt/spark/tmp/spark-local-20131119231720-a023
>> 13/11/19 23:17:20 INFO MemoryStore: MemoryStore started with capacity 323.9 MB.
>> 13/11/19 23:17:20 INFO ConnectionManager: Bound socket to port 11240 with id = ConnectionManagerId(spark-shell-01,11240)
>> 13/11/19 23:17:20 INFO BlockManagerMaster: Trying to register BlockManager
>> 13/11/19 23:17:20 INFO BlockManagerMasterActor$BlockManagerInfo: Registering block manager spark-shell-01:11240 with 323.9 MB RAM
>>
>> Is this right? I feel like much more RAM should be available.
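
As a rough sanity check on the "512m and .66 of that" explanation at the top of this thread, the capacities in the log can be back-calculated into the heap size they would imply. This is only a sketch: the 0.66 fraction is taken from the reply as an assumption, the helper names are made up for illustration, and nothing here calls a Spark API. It also assumes the "MB" in the log means 1024 * 1024 bytes, which matches maxMem=339585269 being reported as 323.9 MB.

```python
# Back-calculate the heap size implied by the MemoryStore capacities in the log.
# CACHE_FRACTION is an assumption taken from the reply (".66 of that for the
# cache"); it is not read from any Spark configuration here.

CACHE_FRACTION = 0.66
MIB = 1024 * 1024


def store_capacity_mib(max_mem_bytes):
    """Convert a maxMem byte count from the log into MiB."""
    return max_mem_bytes / MIB


def implied_heap_mib(capacity_mib, fraction=CACHE_FRACTION):
    """Heap size (MiB) that would produce the given MemoryStore capacity."""
    return capacity_mib / fraction


# maxMem reported by the MemoryStore on spark-shell-01
shell_capacity = store_capacity_mib(339585269)
shell_heap = implied_heap_mib(shell_capacity)

# Capacity reported when dn-01, dn-02 and dn-03 registered
executor_heap = implied_heap_mib(1943.0)

print(f"spark-shell-01: store {shell_capacity:.1f} MB, implied heap {shell_heap:.0f} MB")
print(f"dn-01..dn-03:   store 1943.0 MB, implied heap {executor_heap:.0f} MB")
```

If the 0.66 assumption holds, the 323.9 MB figure implies a heap of roughly 491 MB, which is in the neighborhood of a 512m default (a JVM typically reports somewhat less usable heap than its configured maximum). The 1943.0 MB figures for dn-01 through dn-03 imply roughly 2944 MB, which is closer to a 3 GB setting.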
