Hi,

The best part about Spark is that the error message itself tells you which configuration to tweak. If you are using EMR, check that "spark.local.dir" points to the right location on the cluster. If a disk is mounted at a common path on all nodes (easy to do in EMR), you can point the configuration at that disk location and thereby overcome the issue.
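As a rough way to check whether the configured scratch location actually has room, here is a small Python sketch that reports free space for each directory in a comma-separated "spark.local.dir" value. The helper name `free_space_report` and the 10 GiB threshold are made up for illustration and are not Spark settings. It would need to be run on each worker node, since free space is per machine.

```python
import os
import shutil


def free_space_report(local_dirs, min_free_gib=10.0):
    """Return (path, free_gib, ok) for each directory in a comma-separated list.

    local_dirs uses the same comma-separated format as spark.local.dir.
    min_free_gib is an arbitrary illustrative threshold, not a Spark setting.
    """
    report = []
    for path in (p.strip() for p in local_dirs.split(",")):
        if not path:
            continue
        if not os.path.isdir(path):
            # Missing directory: report it as not usable.
            report.append((path, None, False))
            continue
        free_gib = shutil.disk_usage(path).free / 2**30
        report.append((path, free_gib, free_gib >= min_free_gib))
    return report


if __name__ == "__main__":
    # Path taken from the spark.local.dir value quoted in this thread.
    for path, free_gib, ok in free_space_report("/scratch/home/int/eva/zorzan/sparktmp"):
        shown = "missing" if free_gib is None else f"{free_gib:.1f} GiB free"
        print(f"{path}: {shown} ({'ok' if ok else 'check this'})")
```

One thing to keep in mind: according to the Spark configuration docs, on a standalone cluster the SPARK_LOCAL_DIRS environment variable set on the workers overrides "spark.local.dir", so it is worth checking that variable as well.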
On another note, also try to find out why the data is being written to disk. Is there too much shuffle? Can you increase the shuffle memory, as shown in the error message, using "spark.shuffle.memoryFraction"? By any chance, have you switched from caching to persisting data frames?

Regards,
Gourav Sengupta

On Tue, Aug 21, 2018 at 12:04 PM Vitaliy Pisarev <vitaliy.pisa...@biocatch.com> wrote:

> The other time when I encountered this I solved it by throwing more
> resources at it (stronger cluster).
> I was not able to understand the root cause though. I'll be happy to hear
> deeper insight as well.
>
> On Mon, Aug 20, 2018 at 7:08 PM, Steve Lewis <lordjoe2...@gmail.com>
> wrote:
>
>> We are trying to run a job that has previously run on Spark 1.3 on a
>> different cluster. The job was converted to 2.3 spark and this is a new
>> cluster.
>>
>> The job dies after completing about a half dozen stages with
>>
>> java.io.IOException: No space left on device
>>
>> It appears that the nodes are using local storage as tmp.
>>
>> I could use help diagnosing the issue and how to fix it.
>>
>> Here are the spark conf properties
>>
>> Spark Conf Properties
>> spark.driver.extraJavaOptions=-Djava.io.tmpdir=/scratch/home/int/eva/zorzan/sparktmp/
>> spark.master=spark://10.141.0.34:7077
>> spark.mesos.executor.memoryOverhead=3128
>> spark.shuffle.consolidateFiles=true
>> spark.shuffle.spill=false
>> spark.app.name=Anonymous
>> spark.shuffle.manager=sort
>> spark.storage.memoryFraction=0.3
>> spark.jars=file:/home/int/eva/zorzan/bin/SparkHydraV2-master/HydraSparkBuilt.jar
>> spark.ui.killEnabled=true
>> spark.shuffle.spill.compress=true
>> spark.shuffle.sort.bypassMergeThreshold=100
>> com.lordjoe.distributed.marker_property=spark_property_set
>> spark.executor.memory=12g
>> spark.mesos.coarse=true
>> spark.shuffle.memoryFraction=0.4
>> spark.serializer=org.apache.spark.serializer.KryoSerializer
>> spark.kryo.registrator=com.lordjoe.distributed.hydra.HydraKryoSerializer
>> spark.default.parallelism=360
>> spark.io.compression.codec=lz4
>> spark.reducer.maxMbInFlight=128
>> spark.hadoop.validateOutputSpecs=false
>> spark.submit.deployMode=client
>> spark.local.dir=/scratch/home/int/eva/zorzan/sparktmp
>> spark.shuffle.file.buffer.kb=1024
>>
>> --
>> Steven M. Lewis PhD
>> 4221 105th Ave NE
>> <https://maps.google.com/?q=4221+105th+Ave+NE+Kirkland,+WA+98033&entry=gmail&source=g>
>> Kirkland, WA 98033
>> <https://maps.google.com/?q=4221+105th+Ave+NE+Kirkland,+WA+98033&entry=gmail&source=g>
>> 206-384-1340 (cell)
>> Skype lordjoe_com
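
Since the job was carried over from Spark 1.3 to 2.3, it may also be worth checking which of the properties quoted above are still honoured. Below is a minimal Python sketch that scans key=value lines for a handful of Spark 1.x-era names. The function names and the `LEGACY_NOTES` table are my own for illustration, and the notes reflect my reading of the Spark configuration and migration docs, so please verify them against the docs for your exact version.

```python
# Notes are an assumption based on the Spark docs; verify for your version.
LEGACY_NOTES = {
    "spark.shuffle.spill": "ignored since Spark 1.6; shuffle spills when needed",
    "spark.shuffle.memoryFraction": "deprecated since 1.6; read only if spark.memory.useLegacyMode=true",
    "spark.storage.memoryFraction": "deprecated since 1.6; read only if spark.memory.useLegacyMode=true",
    "spark.reducer.maxMbInFlight": "renamed to spark.reducer.maxSizeInFlight",
    "spark.shuffle.file.buffer.kb": "renamed to spark.shuffle.file.buffer",
    "spark.shuffle.consolidateFiles": "applied to the hash shuffle manager, which Spark 2.x no longer has",
}


def parse_properties(text):
    """Parse key=value lines into a dict, splitting on the first '=' only."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def legacy_settings(props):
    """Return (key, value, note) for every property listed in LEGACY_NOTES."""
    return [(k, v, LEGACY_NOTES[k]) for k, v in props.items() if k in LEGACY_NOTES]


if __name__ == "__main__":
    sample = "spark.shuffle.spill=false\nspark.shuffle.memoryFraction=0.4\nspark.executor.memory=12g\n"
    for key, value, note in legacy_settings(parse_properties(sample)):
        print(f"{key}={value}: {note}")
```

If the notes hold for your version, "spark.shuffle.spill=false" would not prevent spilling to local disk on 2.3, which would be consistent with the "No space left on device" error.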