I believe you want to set memoryFraction higher, not lower. These two
older threads seem to describe issues similar to the one you are experiencing:

https://mail-archives.apache.org/mod_mbox/spark-user/201503.mbox/%3CCAHUQ+_ZqaWFs_MJ=+V49bD2paKvjLErPKMEW5duLO1jAo4=d...@mail.gmail.com%3E
https://www.mail-archive.com/user@spark.apache.org/msg44793.html

More info on tuning shuffle behavior:
https://spark.apache.org/docs/1.5.1/configuration.html#shuffle-behavior
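
In case it's useful, here is a minimal sketch of how those settings (and
the driver/executor memory mentioned further down the thread) can be
raised at submit time on 1.5.x. The class name and jar are placeholders,
and the fraction values are only illustrative starting points, not tuned
recommendations:

  spark-submit \
    --driver-memory 4G \
    --executor-memory 5G \
    --conf spark.shuffle.memoryFraction=0.4 \
    --conf spark.storage.memoryFraction=0.6 \
    --class com.example.KafkaToCassandraJob streaming-job.jar

(0.2 and 0.6 are the 1.5.x defaults for the shuffle and storage fractions;
on 1.6+ the unified spark.memory.fraction setting replaces both.)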

On Thu, Jun 16, 2016 at 1:57 PM, Cassa L <lcas...@gmail.com> wrote:

> Hi Dennis,
>
> On Wed, Jun 15, 2016 at 11:39 PM, Dennis Lovely <d...@aegisco.com> wrote:
>
>> You could try tuning spark.shuffle.memoryFraction and
>> spark.storage.memoryFraction (both of which have been deprecated in 1.6),
>> but ultimately you need to find out where you are bottlenecked and address
>> that, as adjusting memoryFraction will only be a stopgap. For reference,
>> spark.shuffle.memoryFraction defaults to 0.2 and spark.storage.memoryFraction
>> to 0.6 in 1.5.x.
>>
>> I have set the above parameters to 0.5. Do they need to be increased?
>
> Thanks.
>
>> On Wed, Jun 15, 2016 at 9:37 PM, Cassa L <lcas...@gmail.com> wrote:
>>
>>> Hi,
>>> I did set --driver-memory 4G. I still run into this issue after 1
>>> hour of data load.
>>>
>>> I also tried version 1.6 in a test environment. I hit this issue much
>>> faster than in the 1.5.1 setup.
>>> LCassa
>>>
>>> On Tue, Jun 14, 2016 at 3:57 PM, Gaurav Bhatnagar <gaura...@gmail.com>
>>> wrote:
>>>
>>>> try setting the option --driver-memory 4G
>>>>
>>>> On Tue, Jun 14, 2016 at 3:52 PM, Ben Slater <ben.sla...@instaclustr.com
>>>> > wrote:
>>>>
>>>>> A high-level shot in the dark, but in our testing we found Spark 1.6 a
>>>>> lot more reliable in low-memory situations (presumably due to
>>>>> https://issues.apache.org/jira/browse/SPARK-10000). If it's an
>>>>> option, it is probably worth a try.
>>>>>
>>>>> Cheers
>>>>> Ben
>>>>>
>>>>> On Wed, 15 Jun 2016 at 08:48 Cassa L <lcas...@gmail.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>> I would appreciate any clue on this. It has become a bottleneck for
>>>>>> our spark job.
>>>>>>
>>>>>> On Mon, Jun 13, 2016 at 2:56 PM, Cassa L <lcas...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I'm using Spark version 1.5.1. I am reading data from Kafka into Spark
>>>>>>> and writing it into Cassandra after processing it. The job starts
>>>>>>> fine and runs well for a while until I start getting the errors
>>>>>>> below. Once these errors appear, the job starts to lag behind and I
>>>>>>> see scheduling and processing delays in the streaming UI.
>>>>>>>
>>>>>>> Worker memory is 6GB and executor-memory is 5GB. I also tried tweaking
>>>>>>> the memoryFraction parameters; nothing works.
>>>>>>>
>>>>>>>
>>>>>>> 16/06/13 21:26:02 INFO MemoryStore: ensureFreeSpace(4044) called with 
>>>>>>> curMem=565394, maxMem=2778495713
>>>>>>> 16/06/13 21:26:02 INFO MemoryStore: Block broadcast_69652_piece0 stored 
>>>>>>> as bytes in memory (estimated size 3.9 KB, free 2.6 GB)
>>>>>>> 16/06/13 21:26:02 INFO TorrentBroadcast: Reading broadcast variable 
>>>>>>> 69652 took 2 ms
>>>>>>> 16/06/13 21:26:02 WARN MemoryStore: Failed to reserve initial memory 
>>>>>>> threshold of 1024.0 KB for computing block broadcast_69652 in memory.
>>>>>>> 16/06/13 21:26:02 WARN MemoryStore: Not enough space to cache 
>>>>>>> broadcast_69652 in memory! (computed 496.0 B so far)
>>>>>>> 16/06/13 21:26:02 INFO MemoryStore: Memory use = 556.1 KB (blocks) + 
>>>>>>> 2.6 GB (scratch space shared across 0 tasks(s)) = 2.6 GB. Storage limit 
>>>>>>> = 2.6 GB.
>>>>>>> 16/06/13 21:26:02 WARN MemoryStore: Persisting block broadcast_69652 to 
>>>>>>> disk instead.
>>>>>>> 16/06/13 21:26:02 INFO BlockManager: Found block rdd_100761_1 locally
>>>>>>> 16/06/13 21:26:02 INFO Executor: Finished task 0.0 in stage 71577.0 
>>>>>>> (TID 452316). 2043 bytes result sent to driver
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> L
>>>>>>>
>>>>>>>
>>>>>> --
>>>>> ————————
>>>>> Ben Slater
>>>>> Chief Product Officer
>>>>> Instaclustr: Cassandra + Spark - Managed | Consulting | Support
>>>>> +61 437 929 798
>>>>>
>>>>
>>>>
>>>
>>
>
