Re: Questions about the files that Spark will produce during its running

2013-10-28 Thread Shangyu Luo
Yes, I broadcast the spark-env.sh file to all worker nodes before I run my program, and then execute bin/stop-all.sh and bin/start-all.sh. I have also checked the size of the /data2 directory on each worker node, and it is also about 800G. Thanks!

2013/10/29 Matei Zaharia
> The error is from a worker node
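The broadcast-and-restart step described above can be sketched as follows. This is a minimal sketch, not the poster's actual commands: it assumes a standalone Spark cluster with worker hostnames listed in conf/slaves, passwordless SSH from the master, and a hypothetical install path of /root/spark (adjust to your layout; newer Spark versions moved stop-all.sh/start-all.sh from bin/ to sbin/).

```shell
#!/bin/sh
# Hypothetical install location -- adjust for your cluster.
SPARK_HOME=/root/spark

# Copy the edited spark-env.sh to every worker listed in conf/slaves.
while read -r worker; do
  rsync -av "$SPARK_HOME/conf/spark-env.sh" \
        "$worker:$SPARK_HOME/conf/spark-env.sh"
done < "$SPARK_HOME/conf/slaves"

# Restart the cluster so the workers pick up the new settings.
"$SPARK_HOME/bin/stop-all.sh"
"$SPARK_HOME/bin/start-all.sh"
```

Restarting matters here: standalone workers read spark-env.sh at daemon startup, so copying the file alone does not change a running worker's configuration.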

Re: Questions about the files that Spark will produce during its running

2013-10-28 Thread Matei Zaharia
The error is from a worker node -- did you check that /data2 is set up properly on the worker nodes too? In general that should be the only directory used.

Matei

On Oct 28, 2013, at 6:52 PM, Shangyu Luo wrote:
> Hello,
> I have some questions about the files that Spark will create and use during its running.
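The configuration Matei is referring to might look like the fragment below. This is a sketch under stated assumptions, not a quote from the thread: it assumes the /data2 volume is mounted at the same path on the master and every worker, and uses the 0.8-era mechanism of passing spark.local.dir through SPARK_JAVA_OPTS in conf/spark-env.sh to direct shuffle and spill files to that disk (later Spark versions added the SPARK_LOCAL_DIRS variable for the same purpose).

```shell
# conf/spark-env.sh (fragment) -- must be identical on all nodes.
# Point Spark's scratch space (shuffle output, spilled data) at /data2.
export SPARK_JAVA_OPTS="-Dspark.local.dir=/data2 $SPARK_JAVA_OPTS"
```

If the directory exists only on the master, tasks on the workers fail with FileNotFoundException when they try to write scratch files, which is consistent with the error being reported from a worker node.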

Questions about the files that Spark will produce during its running

2013-10-28 Thread Shangyu Luo
Hello,

I have some questions about the files that Spark will create and use during its running.

(1) I am running a Python program on Spark with a cluster on EC2. The data comes from the HDFS file system. I have met the following error in the console of the master node:

java.io.FileNotFoundException: