Hi Xiangrui,

Here is the result on the master node:

$ df -i
Filesystem      Inodes  IUsed     IFree IUse% Mounted on
/dev/xvda1      524288 273997    250291   53% /
tmpfs          1917974      1   1917973    1% /dev/shm
/dev/xvdv    524288000     30 524287970    1% /vol

I have reproduced the error while using the MovieLens 10M data set on a
newly created cluster. Thanks for the help.

Chris
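[Editor's note: the same inode counts can also be read programmatically via the
standard library's os.statvfs; a minimal Python sketch, assuming a POSIX system —
the mount points listed are illustrative of a spark-ec2 layout, not prescriptive:]

    import os

    def inode_usage(path):
        """Report used/free inodes for the filesystem containing `path`."""
        st = os.statvfs(path)
        used = st.f_files - st.f_ffree  # total inodes minus free inodes
        pct = 100.0 * used / st.f_files if st.f_files else 0.0
        return used, st.f_ffree, pct

    # Warn early when any Spark scratch directory nears inode exhaustion.
    for mount in ["/", "/mnt/spark", "/mnt2/spark", "/vol"]:
        if os.path.exists(mount):
            used, free, pct = inode_usage(mount)
            print("%-12s used=%d free=%d (%.0f%%)" % (mount, used, free, pct))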
On Wed, Jul 16, 2014 at 12:22 AM, Xiangrui Meng <men...@gmail.com> wrote:
> Hi Chris,
>
> Could you also try `df -i` on the master node? How many
> blocks/partitions did you set?
>
> In the current implementation, ALS doesn't clean up the shuffle data,
> because the operations are chained together. But it shouldn't run out
> of disk space on the MovieLens dataset, which is small. The spark-ec2
> script sets /mnt/spark and /mnt/spark2 as the local.dir by default;
> I would recommend leaving this setting at its default value.
>
> Best,
> Xiangrui
>
> On Wed, Jul 16, 2014 at 12:02 AM, Chris DuBois <chris.dub...@gmail.com> wrote:
> > Thanks for the quick responses!
> >
> > I used your final -Dspark.local.dir suggestion, but I see this during the
> > initialization of the application:
> >
> > 14/07/16 06:56:08 INFO storage.DiskBlockManager: Created local directory at
> > /vol/spark-local-20140716065608-7b2a
> >
> > I would have expected something in /mnt/spark/.
> >
> > Thanks,
> > Chris
> >
> > On Tue, Jul 15, 2014 at 11:44 PM, Chris Gore <cdg...@cdgore.com> wrote:
> >> Hi Chris,
> >>
> >> I've encountered this error when running Spark's ALS methods too. In my
> >> case, it was because I set spark.local.dir improperly, and every time
> >> there was a shuffle, it would spill many GB of data onto the local drive.
> >> What fixed it was setting it to use the /mnt directory, where a network
> >> drive is mounted. For example, setting an environment variable:
> >>
> >> export SPACE=$(mount | grep mnt | awk '{print $3"/spark/"}' | xargs | sed 's/ /,/g')
> >>
> >> Then adding -Dspark.local.dir=$SPACE, or simply
> >> -Dspark.local.dir=/mnt/spark/,/mnt2/spark/, when you run your driver
> >> application.
> >>
> >> Chris
> >>
> >> On Jul 15, 2014, at 11:39 PM, Xiangrui Meng <men...@gmail.com> wrote:
> >>
> >> > Check the number of inodes (df -i). The assembly build may create many
> >> > small files. -Xiangrui
> >> >
> >> > On Tue, Jul 15, 2014 at 11:35 PM, Chris DuBois <chris.dub...@gmail.com> wrote:
> >> >> Hi all,
> >> >>
> >> >> I am encountering the following error:
> >> >>
> >> >> INFO scheduler.TaskSetManager: Loss was due to java.io.IOException: No
> >> >> space left on device [duplicate 4]
> >> >>
> >> >> For each slave, df -h looks roughly like this, which makes the above
> >> >> error surprising:
> >> >>
> >> >> Filesystem            Size  Used Avail Use% Mounted on
> >> >> /dev/xvda1            7.9G  4.4G  3.5G  57% /
> >> >> tmpfs                 7.4G  4.0K  7.4G   1% /dev/shm
> >> >> /dev/xvdb              37G  3.3G   32G  10% /mnt
> >> >> /dev/xvdf              37G  2.0G   34G   6% /mnt2
> >> >> /dev/xvdv             500G   33M  500G   1% /vol
> >> >>
> >> >> I'm on an EC2 cluster (c3.xlarge + 5 x m3) that I launched using the
> >> >> spark-ec2 scripts and a clone of Spark from today. The job I am running
> >> >> closely resembles the collaborative filtering example. This issue
> >> >> happens with the 1M version as well as the 10 million rating version
> >> >> of the MovieLens dataset.
> >> >>
> >> >> I have seen previous questions, but they haven't helped yet. For
> >> >> example, I tried setting the Spark tmp directory to the EBS volume at
> >> >> /vol/, both by editing the Spark conf file (and copy-dir'ing it to the
> >> >> slaves) as well as through the SparkConf. Yet I still get the above
> >> >> error. Here is my current Spark config below.
> >> >> Note that I'm launching via ~/spark/bin/spark-submit:
> >> >>
> >> >> conf = SparkConf()
> >> >> conf.setAppName("RecommendALS") \
> >> >>     .set("spark.local.dir", "/vol/") \
> >> >>     .set("spark.executor.memory", "7g") \
> >> >>     .set("spark.akka.frameSize", "100") \
> >> >>     .setExecutorEnv("SPARK_JAVA_OPTS", " -Dspark.akka.frameSize=100")
> >> >> sc = SparkContext(conf=conf)
> >> >>
> >> >> Thanks for any advice,
> >> >> Chris
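[Editor's note: taken together, the advice in this thread amounts to two changes —
point spark.local.dir at the large ephemeral mounts (as a comma-separated list)
rather than the small root or EBS volume, and pass ALS an explicit blocks count so
each iteration writes fewer, larger shuffle files. Below is a minimal PySpark
sketch of both, assuming Spark 1.0-era APIs on a spark-ec2 cluster; the dataset
path, mount points, and blocks value are illustrative, not prescriptive:]

    from pyspark import SparkConf, SparkContext
    from pyspark.mllib.recommendation import ALS

    conf = (SparkConf()
            .setAppName("RecommendALS")
            # Comma-separated scratch dirs; spark-ec2 mounts the large
            # ephemeral drives at /mnt and /mnt2, so shuffle spills land
            # there instead of the small root volume.
            .set("spark.local.dir", "/mnt/spark,/mnt2/spark")
            .set("spark.executor.memory", "7g"))
    sc = SparkContext(conf=conf)

    # MovieLens 10M ratings as (user, movie, rating) tuples; path is illustrative.
    ratings = (sc.textFile("/vol/ml-10M100K/ratings.dat")
                 .map(lambda line: line.split("::"))
                 .map(lambda f: (int(f[0]), int(f[1]), float(f[2]))))

    # An explicit blocks count controls how ALS partitions the factor
    # matrices, and therefore how many shuffle files each iteration creates.
    model = ALS.train(ratings, rank=10, iterations=10, lambda_=0.01, blocks=20)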