Thank you Imran!!

I was able to solve the issue by setting
"spark.storage.blockManagerSlaveTimeoutMs=300000"

I was seeing some block manager timeouts on the master, so I raised this
setting. It fixed the timeout issue as well as the OOM errors on the
workers. I am not really sure how it fixed the OOM, but it is now working for me.
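
For anyone who finds this thread later, here is a minimal sketch of one way
to pass the setting on the command line. It only builds the "--conf
key=value" arguments that spark-submit accepts; the helper name
to_submit_args is made up for illustration, and the same key=value pair
could instead go in conf/spark-defaults.conf.

```python
def to_submit_args(settings):
    """Turn a dict of Spark properties into spark-submit --conf arguments.

    Hypothetical helper for illustration; it is not part of Spark.
    """
    args = []
    for key, value in sorted(settings.items()):
        # spark-submit takes each property as: --conf key=value
        args.extend(["--conf", f"{key}={value}"])
    return args


# The value from this thread: 300000 ms, i.e. 5 minutes.
settings = {"spark.storage.blockManagerSlaveTimeoutMs": 300000}
submit_args = to_submit_args(settings)
print(" ".join(submit_args))
```

The resulting list can be spliced into whatever spark-submit command line
you already use.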

Thanks
Ankur

On Mon, Apr 13, 2015 at 8:09 PM, Imran Rashid <iras...@cloudera.com> wrote:

> Broadcast variables count towards "spark.storage.memoryFraction", so they
> use the same "pool" of memory as cached RDDs.
>
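
A rough sketch of where the 4.5 GB figure in the log quoted further down
likely comes from. In Spark 1.x the block manager's pool is the JVM's
reported max heap multiplied by "spark.storage.memoryFraction" and by
"spark.storage.safetyFraction" (default 0.9). The 25 GB heap value below is
an assumption back-derived from the log, not a number given in this thread;
a JVM started with 26 GB usually reports somewhat less than that as usable.

```python
def storage_pool_gb(max_heap_gb, memory_fraction=0.6, safety_fraction=0.9):
    """Approximate Spark 1.x storage pool: heap * memoryFraction * safetyFraction."""
    return max_heap_gb * memory_fraction * safety_fraction


# Assumed: about 25 GB of JVM-reported heap for a 26 GB executor.
pool_gb = storage_pool_gb(25.0, memory_fraction=0.2)

# The 3 GB broadcast object has to fit in the same pool.
broadcast_gb = 3.0
remaining_gb = pool_gb - broadcast_gb

print(f"storage pool: {pool_gb:.2f} GB, left after broadcast: {remaining_gb:.2f} GB")
```

Under these assumptions the broadcast alone takes about two thirds of the
pool, so a higher memoryFraction would leave it more room.
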
> That being said, I'm really not sure why you are running into problems; it
> seems like you have plenty of memory available.  Most likely it's got
> nothing to do with broadcast variables or caching -- it's just whatever
> logic you are applying in your transformations that is causing lots of GC
> to occur during the computation.  Hard to say without knowing more details.
>
> You could try increasing the timeout for the failed askWithReply by
> increasing "spark.akka.lookupTimeout" (defaults to 30 seconds), but that
> would most likely be treating a symptom, not the root cause.
>
> On Fri, Mar 27, 2015 at 4:52 PM, Ankur Srivastava <
> ankur.srivast...@gmail.com> wrote:
>
>> Hi All,
>>
>> I am running a Spark cluster on EC2 instances of type m3.2xlarge. I have
>> given 26 GB of memory and all 8 cores to my executors. I can see that in
>> the logs too:
>>
>> *15/03/27 21:31:06 INFO AppClient$ClientActor: Executor added:
>> app-20150327213106-0000/0 on worker-20150327212934-10.x.y.z-40128
>> (10.x.y.z:40128) with 8 cores*
>>
>> I am not caching any RDD, so I have set "spark.storage.memoryFraction" to
>> 0.2. In the Spark UI, under the Executors tab, Memory Used shows 0.0/4.5 GB.
>>
>> I am now confused by these logs:
>>
>> *15/03/27 21:31:08 INFO BlockManagerMasterActor: Registering block
>> manager 10.77.100.196:58407 with 4.5 GB RAM,
>> BlockManagerId(4, 10.x.y.z, 58407)*
>>
>> I am broadcasting a large object of 3 GB. After that, when I create an
>> RDD, I see logs which show this 4.5 GB of memory filling up, and
>> then I get an OOM.
>>
>> How can I make block manager use more memory?
>>
>> Is there any other fine tuning I need to do for broadcasting large
>> objects?
>>
>> And do broadcast variables use the cache memory or the rest of the heap?
>>
>>
>> Thanks
>>
>> Ankur
>>
>
>
