The documentation says that 'spark.shuffle.memoryFraction' is deprecated, but
it doesn't say what to use instead. Any ideas?
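
My best guess from reading the 2.x configuration docs is that the unified
memory manager (Spark 1.6+) replaced it with spark.memory.fraction and
spark.memory.storageFraction, and that the old *.memoryFraction keys are only
read when spark.memory.useLegacyMode=true. Here is a small Python sketch for
flagging the legacy keys in a conf like mine. The mapping is my own reading of
the docs, not an official table, and find_legacy_keys is a made-up helper name:

```python
# Legacy key -> note on what to look at instead.
# Assumption: hand-made mapping based on the Spark 2.x docs, not an official list.
LEGACY_KEYS = {
    "spark.shuffle.memoryFraction":
        "see spark.memory.fraction (old key read only if spark.memory.useLegacyMode=true)",
    "spark.storage.memoryFraction":
        "see spark.memory.storageFraction (old key read only if spark.memory.useLegacyMode=true)",
    "spark.shuffle.spill":
        "ignored since Spark 1.6; shuffle spills to disk when it needs to",
    "spark.reducer.maxMbInFlight": "see spark.reducer.maxSizeInFlight",
    "spark.shuffle.file.buffer.kb": "see spark.shuffle.file.buffer",
}


def find_legacy_keys(conf_text):
    """Return {legacy_key: (value, note)} for legacy keys found in key=value lines."""
    found = {}
    for line in conf_text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that is not key=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        key = key.strip()
        if key in LEGACY_KEYS:
            found[key] = (value.strip(), LEGACY_KEYS[key])
    return found


if __name__ == "__main__":
    sample = """
spark.shuffle.memoryFraction=0.4
spark.storage.memoryFraction=0.3
spark.executor.memory=12g
"""
    for key, (value, note) in find_legacy_keys(sample).items():
        print(f"{key}={value} -> {note}")
```

If that reading is right, raising spark.shuffle.memoryFraction on 2.3 would have
no effect unless legacy mode is switched on, but I would appreciate
confirmation.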

On Wed, Aug 22, 2018 at 9:36 AM, Gourav Sengupta <gourav.sengu...@gmail.com>
wrote:

> Hi,
>
> One nice thing about Spark is that the error message also tells you which
> configuration to tweak. If you are using EMR, check that "spark.local.dir"
> points to the right location on the cluster. If a disk is mounted on all
> the nodes under a common path (you can do that easily in EMR), you can
> point the configuration at that disk location and thereby overcome the
> issue.
>
> On another note, also try to find out why the data is being written to
> disk. Is there too much shuffle? Can you increase the shuffle memory, as
> shown in the error message, using "spark.shuffle.memoryFraction"?
>
> By any chance, have you changed from caching to persistent data frames?
>
>
> Regards,
> Gourav Sengupta
>
>
>
> On Tue, Aug 21, 2018 at 12:04 PM Vitaliy Pisarev <
> vitaliy.pisa...@biocatch.com> wrote:
>
>> The other time when I encountered this I solved it by throwing more
>> resources at it (stronger cluster).
>> I was not able to understand the root cause though. I'll be happy to hear
>> deeper insight as well.
>>
>> On Mon, Aug 20, 2018 at 7:08 PM, Steve Lewis <lordjoe2...@gmail.com>
>> wrote:
>>
>>>
>>> We are trying to run a job that previously ran on Spark 1.3 on a
>>> different cluster. The job was converted to Spark 2.3, and this is a new
>>> cluster.
>>>
>>>     The job dies after completing about a half dozen stages with
>>>
>>> java.io.IOException: No space left on device
>>>
>>>
>>>    It appears that the nodes are using local storage as tmp.
>>>
>>>
>>>     I could use help diagnosing the issue and how to fix it.
>>>
>>>
>>> Here are the spark conf properties
>>>
>>> Spark Conf Properties
>>> spark.driver.extraJavaOptions=-Djava.io.tmpdir=/scratch/home/int/eva/zorzan/sparktmp/
>>> spark.master=spark://10.141.0.34:7077
>>> spark.mesos.executor.memoryOverhead=3128
>>> spark.shuffle.consolidateFiles=true
>>> spark.shuffle.spill=false
>>> spark.app.name=Anonymous
>>> spark.shuffle.manager=sort
>>> spark.storage.memoryFraction=0.3
>>> spark.jars=file:/home/int/eva/zorzan/bin/SparkHydraV2-master/HydraSparkBuilt.jar
>>> spark.ui.killEnabled=true
>>> spark.shuffle.spill.compress=true
>>> spark.shuffle.sort.bypassMergeThreshold=100
>>> com.lordjoe.distributed.marker_property=spark_property_set
>>> spark.executor.memory=12g
>>> spark.mesos.coarse=true
>>> spark.shuffle.memoryFraction=0.4
>>> spark.serializer=org.apache.spark.serializer.KryoSerializer
>>> spark.kryo.registrator=com.lordjoe.distributed.hydra.HydraKryoSerializer
>>> spark.default.parallelism=360
>>> spark.io.compression.codec=lz4
>>> spark.reducer.maxMbInFlight=128
>>> spark.hadoop.validateOutputSpecs=false
>>> spark.submit.deployMode=client
>>> spark.local.dir=/scratch/home/int/eva/zorzan/sparktmp
>>> spark.shuffle.file.buffer.kb=1024
>>>
>>>
>>>
>>> --
>>> Steven M. Lewis PhD
>>> 4221 105th Ave NE
>>> Kirkland, WA 98033
>>> 206-384-1340 (cell)
>>> Skype lordjoe_com
>>>
>>>
>>
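
On the "No space left on device" side, a minimal sketch for checking how full
the filesystem behind the local scratch directory is on a worker, assuming
Python 3 is available there. disk_report is a made-up helper name, and "." is
only a placeholder for whatever directory the workers actually use. As far as I
understand the docs, in standalone mode a SPARK_LOCAL_DIRS environment variable
set on the workers overrides spark.local.dir, so that may be worth checking as
well:

```python
import shutil


def disk_report(path):
    """Return (free_gib, percent_used) for the filesystem that holds `path`."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (in bytes)
    free_gib = usage.free / 2**30
    percent_used = 100.0 * usage.used / usage.total
    return free_gib, percent_used


if __name__ == "__main__":
    # Assumption: replace "." with the scratch directory used on each worker.
    free_gib, percent_used = disk_report(".")
    print(f"free: {free_gib:.1f} GiB, used: {percent_used:.1f}%")
```

Running this on each node before and during the failing stages should show
whether the scratch disk is really what fills up.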
