According to:
http://blog.erdemagaoglu.com/post/4605524309/lzo-vs-snappy-vs-lzf-vs-zlib-a-comparison-of

the performance of snappy and lzf was on par with each other.

lzf may have a lower memory requirement, though.
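
For reference, a minimal sketch of how that codec switch can be set (assuming a
SparkConf-based setup; the app name below is just a placeholder, and the same key
can equally be passed with --conf on spark-submit):

import org.apache.spark.{SparkConf, SparkContext}

// Sketch only: switch the shuffle/broadcast compression codec from snappy to lzf.
// Equivalent command line: spark-submit --conf spark.io.compression.codec=lzf ...
val conf = new SparkConf()
  .setAppName("shuffle-codec-example")       // hypothetical app name
  .set("spark.io.compression.codec", "lzf")  // built-in codecs: snappy, lzf, lz4
val sc = new SparkContext(conf)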

On Wed, May 18, 2016 at 7:22 AM, Serega Sheypak <serega.shey...@gmail.com>
wrote:

> Switching from snappy to lzf helped me:
>
> *spark.io.compression.codec=lzf*
>
> Do you know why? :) I can't find an exact explanation...
>
>
>
> 2016-05-18 15:41 GMT+02:00 Ted Yu <yuzhih...@gmail.com>:
>
>> Please increase the number of partitions.
>>
>> Cheers
>>
>> On Wed, May 18, 2016 at 4:17 AM, Serega Sheypak <serega.shey...@gmail.com
>> > wrote:
>>
>>> Hi, please have a look at the log snippet:
>>> 16/05/18 03:27:16 INFO spark.MapOutputTrackerWorker: Doing the fetch;
>>> tracker endpoint =
>>> NettyRpcEndpointRef(spark://mapoutputtrac...@xxx.xxx.xxx.xxx:38128)
>>> 16/05/18 03:27:16 INFO spark.MapOutputTrackerWorker: Got the output
>>> locations
>>> 16/05/18 03:27:16 INFO storage.ShuffleBlockFetcherIterator: Getting 30
>>> non-empty blocks out of 30 blocks
>>> 16/05/18 03:27:16 INFO storage.ShuffleBlockFetcherIterator: Started 30
>>> remote fetches in 3 ms
>>> 16/05/18 03:27:16 INFO spark.MapOutputTrackerWorker: Don't have map
>>> outputs for shuffle 1, fetching them
>>> 16/05/18 03:27:16 INFO spark.MapOutputTrackerWorker: Doing the fetch;
>>> tracker endpoint =
>>> NettyRpcEndpointRef(spark://mapoutputtrac...@xxx.xxx.xxx.xxx:38128)
>>> 16/05/18 03:27:16 INFO spark.MapOutputTrackerWorker: Got the output
>>> locations
>>> 16/05/18 03:27:16 INFO storage.ShuffleBlockFetcherIterator: Getting 1
>>> non-empty blocks out of 1500 blocks
>>> 16/05/18 03:27:16 INFO storage.ShuffleBlockFetcherIterator: Started 1
>>> remote fetches in 1 ms
>>> 16/05/18 03:27:17 ERROR executor.Executor: Managed memory leak detected;
>>> size = 6685476 bytes, TID = 3405
>>> 16/05/18 03:27:17 ERROR executor.Executor: Exception in task 285.0 in
>>> stage 6.0 (TID 3405)
>>>
>>> Is it related to https://issues.apache.org/jira/browse/SPARK-11293 ?
>>>
>>> Is there any recommended workaround?
>>>
>>
>>
>
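
And a minimal sketch of the earlier suggestion to increase the number of partitions
(the input path and the target of 200 partitions are placeholders; tune them to your
data volume):

import org.apache.spark.{SparkConf, SparkContext}

// Sketch only: more shuffle partitions means smaller per-task shuffle blocks,
// which keeps each task's memory footprint down.
val sc = new SparkContext(new SparkConf().setAppName("repartition-example"))
val input = sc.textFile("hdfs:///path/to/input")  // hypothetical input path
val repartitioned = input.repartition(200)        // placeholder partition count
// For key-based shuffles the count can also be passed directly, e.g.
// pairs.reduceByKey(_ + _, numPartitions = 200)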
