Makes sense. Thank you.
On Sun, Feb 23, 2014 at 9:57 PM, Matei Zaharia <matei.zaha...@gmail.com> wrote:

> Good catch; the Spark cluster on EC2 is configured to use HDFS as its
> default filesystem, so it can't find this file. The quick start was
> written to run on a single machine with an out-of-the-box install. If
> you'd like to upload this file to the HDFS cluster on EC2, use the
> following command:
>
>     ~/ephemeral-hdfs/bin/hadoop fs -put README.md README.md
>
> Matei
>
> On Feb 23, 2014, at 6:33 PM, nicholas.chammas <nicholas.cham...@gmail.com> wrote:
>
> I just deployed Spark 0.9.0 to EC2 using the guide here
> <http://spark.incubator.apache.org/docs/latest/ec2-scripts.html>.
> I then turned to the Quick Start guide here
> <http://spark.incubator.apache.org/docs/latest/quick-start.html>
> and walked through it using the Python shell.
>
> When I do this:
>
>     >>> textFile = sc.textFile("README.md")
>     >>> textFile.count()
>
> I get a long error output right after the count() that includes this:
>
>     org.apache.hadoop.mapred.InvalidInputException: Input path does not exist:
>     hdfs://ec2-my-node-address.compute-1.amazonaws.com:9000/user/root/README.md
>
> So I guess Spark assumed that the file was in HDFS.
>
> To get the file to open and count() to work, I had to do this:
>
>     >>> textFile = sc.textFile("file:///root/spark/README.md")
>     >>> textFile.count()
>
> I get the same results if I use the Scala shell.
>
> Does the quick start guide need to be updated, or did I miss something?
>
> Nick
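
For reference: a bare path like "README.md" is resolved against the
cluster's default filesystem, which the spark-ec2 scripts point at the
ephemeral HDFS, while an explicit scheme such as file:// or hdfs://
overrides that default. A minimal PySpark sketch of the two working
options, assuming a stock spark-ec2 cluster where the shell predefines
sc and /root/spark has been rsynced to every node:

    >>> # Option 1: bare path, resolved against HDFS. Requires uploading
    >>> # the file first: ~/ephemeral-hdfs/bin/hadoop fs -put README.md README.md
    >>> textFile = sc.textFile("README.md")
    >>> textFile.count()

    >>> # Option 2: explicit file:// URI, read from the local filesystem.
    >>> # Works here because spark-ec2 copies /root/spark to all workers.
    >>> textFile = sc.textFile("file:///root/spark/README.md")
    >>> textFile.count()

Note that the file:// form only works because every worker holds the
same local copy; for a file that exists on the driver alone, uploading
it to HDFS as in option 1 is the safer route.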