OK, it happens only in YARN+cluster mode. With snappy it works in
YARN+client mode.
I started hitting this problem when I switched to cluster mode.
2016-05-18 16:31 GMT+02:00 Ted Yu :
According to:
http://blog.erdemagaoglu.com/post/4605524309/lzo-vs-snappy-vs-lzf-vs-zlib-a-comparison-of
the performance of snappy and lzf was on par.
Maybe lzf has a lower memory requirement.
On Wed, May 18, 2016 at 7:22 AM, Serega Sheypak
wrote:
Switching from snappy to lzf helped me:
*spark.io.compression.codec=lzf*
Do you know why? :) I can't find an exact explanation...
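
For anyone wanting to try the same switch in YARN cluster mode, here is a minimal sketch that assembles a spark-submit command line carrying the codec setting. The `--master`, `--deploy-mode` and `--conf` flags and the `spark.io.compression.codec` property are standard; the helper name `build_spark_submit` and the application file `my_job.py` are hypothetical placeholders, not something from this thread.

```python
import shlex


def build_spark_submit(app, codec="lzf", deploy_mode="cluster"):
    """Return the argv list for a spark-submit call on YARN.

    `app` is the application file to submit (a placeholder here);
    `codec` is passed through as spark.io.compression.codec.
    """
    conf = {"spark.io.compression.codec": codec}
    cmd = ["spark-submit", "--master", "yarn", "--deploy-mode", deploy_mode]
    # Each Spark property goes in as its own --conf key=value pair.
    for key, value in sorted(conf.items()):
        cmd += ["--conf", f"{key}={value}"]
    cmd.append(app)
    return cmd


if __name__ == "__main__":
    # Print a shell-safe version of the command for copy and paste.
    print(" ".join(shlex.quote(part) for part in build_spark_submit("my_job.py")))
```

Passing the property with `--conf` at submit time makes it part of the submitted application's configuration, which may matter when the behaviour differs between client and cluster mode.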
2016-05-18 15:41 GMT+02:00 Ted Yu :
Please increase the number of partitions.
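
As a rough illustration of that advice: more partitions generally means smaller shuffle blocks per task, so each fetch needs less memory. The sketch below estimates a partition count from the data size. The 128 MiB target per partition and the helper name `suggest_num_partitions` are assumptions for illustration only, not a Spark default or API; the result would then be applied with something like `repartition(n)` or the `spark.default.parallelism` / `spark.sql.shuffle.partitions` settings.

```python
def suggest_num_partitions(total_bytes,
                           target_partition_bytes=128 * 1024 * 1024,
                           min_partitions=1):
    """Estimate how many partitions keep each one near the target size.

    The 128 MiB default is an assumed rule of thumb, not a Spark constant.
    """
    if total_bytes < 0:
        raise ValueError("total_bytes must be non-negative")
    if target_partition_bytes <= 0:
        raise ValueError("target_partition_bytes must be positive")
    # Integer ceiling division avoids float rounding on large inputs.
    needed = -(-total_bytes // target_partition_bytes)
    return max(min_partitions, needed)


if __name__ == "__main__":
    ten_gib = 10 * 1024 ** 3
    n = suggest_num_partitions(ten_gib)
    # With PySpark this might be used as: rdd = rdd.repartition(n)
    print(n)
```

If tasks still fail with this count, lowering `target_partition_bytes` raises the partition count further.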
Cheers
On Wed, May 18, 2016 at 4:17 AM, Serega Sheypak
wrote:
Hi, please have a look at this log snippet:
16/05/18 03:27:16 INFO spark.MapOutputTrackerWorker: Doing the fetch; tracker endpoint = NettyRpcEndpointRef(spark://mapoutputtrac...@xxx.xxx.xxx.xxx:38128)
16/05/18 03:27:16 INFO spark.MapOutputTrackerWorker: Got the output locations
16/05/18 03:27:16 INFO