Hi Wisely, I have 26gb for driver and the master is running on m3.2xlarge machines.
I see OOM errors on workers and even they are running with 26th of memory. Thanks On Fri, Mar 27, 2015, 11:43 PM Wisely Chen <wiselyc...@appier.com> wrote: > Hi > > In broadcast, spark will collect the whole 3gb object into master node and > broadcast to each slaves. It is very common situation that the master node > don't have enough memory . > > What is your master node settings? > > Wisely Chen > > Ankur Srivastava <ankur.srivast...@gmail.com> 於 2015年3月28日 星期六寫道: > > I have increased the "spark.storage.memoryFraction" to 0.4 but I still >> get OOM errors on Spark Executor nodes >> >> >> 15/03/27 23:19:51 INFO BlockManagerMaster: Updated info of block >> broadcast_5_piece10 >> >> 15/03/27 23:19:51 INFO TorrentBroadcast: Reading broadcast variable 5 >> took 2704 ms >> >> 15/03/27 23:19:52 INFO MemoryStore: ensureFreeSpace(672530208) called >> with curMem=2484698683, maxMem=9631778734 >> >> 15/03/27 23:19:52 INFO MemoryStore: Block broadcast_5 stored as values in >> memory (estimated size 641.4 MB, free 6.0 GB) >> >> 15/03/27 23:34:02 WARN AkkaUtils: Error sending message in 1 attempts >> >> java.util.concurrent.TimeoutException: Futures timed out after [30 >> seconds] >> >> at >> scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219) >> >> at >> scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223) >> >> at >> scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107) >> >> at >> scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53) >> >> at scala.concurrent.Await$.result(package.scala:107) >> >> at >> org.apache.spark.util.AkkaUtils$.askWithReply(AkkaUtils.scala:187) >> >> at >> org.apache.spark.executor.Executor$$anon$1.run(Executor.scala:407) >> >> 15/03/27 23:34:02 ERROR Executor: Exception in task 7.0 in stage 2.0 (TID >> 4007) >> >> java.lang.OutOfMemoryError: GC overhead limit exceeded >> >> at >> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1986) >> >> at >> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915) >> >> at >> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798) >> >> Thanks >> >> Ankur >> >> On Fri, Mar 27, 2015 at 2:52 PM, Ankur Srivastava < >> ankur.srivast...@gmail.com> wrote: >> >>> Hi All, >>> >>> I am running a spark cluster on EC2 instances of type: m3.2xlarge. I >>> have given 26gb of memory with all 8 cores to my executors. I can see that >>> in the logs too: >>> >>> *15/03/27 21:31:06 INFO AppClient$ClientActor: Executor added: >>> app-20150327213106-0000/0 on worker-20150327212934-10.x.y.z-40128 >>> (10.x.y.z:40128) with 8 cores* >>> >>> I am not caching any RDD so I have set "spark.storage.memoryFraction" to >>> 0.2. I can see on SparkUI under executors tab Memory used is 0.0/4.5 GB. >>> >>> I am now confused with these logs? >>> >>> *15/03/27 21:31:08 INFO BlockManagerMasterActor: Registering block >>> manager 10.77.100.196:58407 <http://10.77.100.196:58407> with 4.5 GB RAM, >>> BlockManagerId(4, 10.x.y.z, 58407)* >>> >>> I am broadcasting a large object of 3 gb and after that when I am >>> creating an RDD, I see logs which show this 4.5 GB memory getting full and >>> then I get OOM. >>> >>> How can I make block manager use more memory? >>> >>> Is there any other fine tuning I need to do for broadcasting large >>> objects? >>> >>> And does broadcast variable use cache memory or rest of the heap? >>> >>> >>> Thanks >>> >>> Ankur >>> >> >>