this in the spark-ec2 script.
Writing lots of tmp files to the 8GB `/` partition is not a great idea.
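To check whether it really is the root partition filling up, something like this works (a sketch; the /tmp/hadoop-root/s3/ path is the one reported later in this thread and may differ on your nodes):

    # Show free space per mounted filesystem; on spark-ec2 m1.large nodes
    # `/` is the small 8GB root volume and /mnt is the large ephemeral disk
    df -h

    # Measure how much space the S3 buffer files are taking up, if present
    if [ -d /tmp/hadoop-root/s3 ]; then
      du -sh /tmp/hadoop-root/s3/
    fi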
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/No-space-left-on-device-error-when-pulling-data-from-s3-tp5450p5518.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
Hi,
I'm getting a `no space left on device` exception when pulling some 22GB of data
from S3 block storage into the ephemeral HDFS. The cluster is on EC2, launched
with the spark-ec2 script, with 4 m1.large instances.
The code is basically:

    val in = sc.textFile("s3://...")
    in.saveAsTextFile("hdfs://...")
Spark creates 750 input
I wonder why your `/` is full. Try clearing out /tmp, and also make sure that in
spark-env.sh you have put:

    SPARK_JAVA_OPTS+=" -Dspark.local.dir=/mnt/spark"
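As a sketch of the full spark-env.sh stanza (assuming the standard spark-ec2 layout, where /mnt is the large ephemeral volume):

    # spark-env.sh -- keep Spark's shuffle/scratch files off the 8GB root
    # partition by pointing spark.local.dir at the ephemeral disk
    SPARK_JAVA_OPTS+=" -Dspark.local.dir=/mnt/spark"
    export SPARK_JAVA_OPTS   # make the setting visible to the launched JVMs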
Thanks
Best Regards
On Tue, May 6, 2014 at 9:35 PM, Han JU ju.han.fe...@gmail.com wrote:
> Hi,
>
> I've a `no space left on device` exception. After some investigation, I found
> out that there are lots of temp files under /tmp/hadoop-root/s3/. But this is
> strange, since in both conf files, ~/ephemeral-hdfs/conf/core-site.xml and
> ~/spark/conf/core-site.xml, the setting `hadoop.tmp.dir` is set to
> `/mnt/ephemeral-hdfs/`. Why do Spark jobs still