Re: java.io.IOException: No space left on device--regd.

2015-07-06 Thread Akhil Das
While the job is running, look in the directory and see what the root cause is (is it the logs? is it the shuffle? etc.). Here are a few configuration options you can try:
- Disable shuffle spilling: spark.shuffle.spill=false (it might end up in OOM)
- Enable log rotation:
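One way to do the "look in the directory" step is a small script that totals the bytes under each immediate child of a directory, so it is easier to tell whether logs or shuffle files are filling the disk. This is only a sketch: the function name usage_by_child is made up for illustration, and the directory you pass in is whatever scratch or work directory your cluster actually uses. It reports apparent file sizes, which can differ slightly from what du shows.

```python
import os
import sys


def usage_by_child(root):
    """Return (name, bytes) for each immediate child of root, largest first."""
    totals = []
    for entry in os.scandir(root):
        if entry.is_dir(follow_symlinks=False):
            size = 0
            for dirpath, _dirnames, filenames in os.walk(entry.path):
                for name in filenames:
                    try:
                        size += os.lstat(os.path.join(dirpath, name)).st_size
                    except OSError:
                        # Temp files can vanish while the job runs; skip them.
                        pass
        else:
            try:
                size = entry.stat(follow_symlinks=False).st_size
            except OSError:
                continue
        totals.append((entry.name, size))
    return sorted(totals, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Directory to inspect; defaults to the current directory.
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    for name, size in usage_by_child(root):
        print(f"{size / 2**20:10.1f} MiB  {name}")
```

Running it a few times while the stage is in progress shows which subdirectory is growing.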

Re: java.io.IOException: No space left on device--regd.

2015-07-06 Thread Akhil Das
You can also set these in the spark-env.sh file:
    export SPARK_WORKER_DIR=/mnt/spark/
    export SPARK_LOCAL_DIRS=/mnt/spark/
Thanks
Best Regards
On Mon, Jul 6, 2015 at 12:29 PM, Akhil Das ak...@sigmoidanalytics.com wrote:
While the job is running, just look in the directory and see what's the
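Before pointing the worker and scratch directories at a different mount such as /mnt/spark/, it may be worth checking how much free space that filesystem has. The sketch below uses only the Python standard library; free_gib and has_room are hypothetical helper names, and the 500 GiB threshold in the usage line is an arbitrary placeholder, not a figure from this thread.

```python
import shutil
import sys


def free_gib(path):
    """Free space, in GiB, on the filesystem that holds path."""
    return shutil.disk_usage(path).free / 2**30


def has_room(path, needed_gib):
    """True if the filesystem holding path has at least needed_gib free."""
    return free_gib(path) >= needed_gib


if __name__ == "__main__":
    # Candidate scratch directory; defaults to the current directory.
    path = sys.argv[1] if len(sys.argv) > 1 else "."
    print(f"{path}: {free_gib(path):.1f} GiB free")
    # 500 is a placeholder threshold; pick one that fits your shuffle size.
    print("enough room" if has_room(path, 500) else "probably too small")
```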

java.io.IOException: No space left on device--regd.

2015-07-05 Thread Devarajan Srinivasan
Hi, I am trying to run an ETL job on Spark which involves an expensive shuffle operation. Basically I need to perform a self-join on a Spark DataFrame RDD. The job runs fine for around 15 hours and when the stage (which performs the self-join) is about to complete, I get a *java.io.IOException: