Thanks Saisai. I will try your solution, but I still don't understand why the filesystem should be used when there is plenty of memory available!
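For reference, my rough understanding of the change is sketched below; the mount point /mnt/spark-ramdisk, the 8g size, and the app name are just placeholder values I picked, so please correct me if this is not what you meant:

    // Assuming a tmpfs mount already exists on each worker node, e.g.:
    //   sudo mkdir -p /mnt/spark-ramdisk
    //   sudo mount -t tmpfs -o size=8g tmpfs /mnt/spark-ramdisk
    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .setAppName("shuffle-on-ramdisk")                // placeholder app name
      .set("spark.local.dir", "/mnt/spark-ramdisk")    // shuffle and spill files land here
    val sc = new SparkContext(conf)

I gather that on YARN or standalone clusters the local dirs configured by the cluster manager (e.g. SPARK_LOCAL_DIRS) take precedence over spark.local.dir set in application code, so the setting may need to go into the cluster/worker configuration instead.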
On Mon, Mar 30, 2015 at 11:22 AM, Saisai Shao <sai.sai.s...@gmail.com> wrote:

> Shuffle write will finally spill the data to the file system as a bunch of
> files. If you want to avoid disk writes, you can mount a ramdisk and
> configure "spark.local.dir" to point to this ramdisk. Shuffle output will
> then be written to a memory-based FS and will not introduce disk IO.
>
> Thanks
> Jerry
>
> 2015-03-30 17:15 GMT+08:00 shahab <shahab.mok...@gmail.com>:
>
>> Hi,
>>
>> I was looking at the SparkUI Executors tab, and I noticed that I have
>> 597 MB of "Shuffle Write" while I am using a cached temp table and Spark
>> had 2 GB of free memory (the number under Memory Used is 597 MB / 2.6 GB)?!
>>
>> Shouldn't Shuffle Write be zero, with all the map/reduce tasks done in
>> memory?
>>
>> best,
>>
>> /Shahab