You do not need to place the file in a local directory on every node. Just use
`hadoop fs -put` to upload it to HDFS. Alternatively, as others have suggested, use S3.
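
A minimal sketch of the upload, assuming you are in the Spark root directory and
running as root (so your HDFS home directory is /user/root); adjust the paths to
your setup:

  # create the target directory in HDFS and upload the sample file
  hadoop fs -mkdir -p /user/root/data/mllib
  hadoop fs -put data/mllib/sample_libsvm_data.txt /user/root/data/mllib/

Spark resolves a relative path like data/mllib/sample_libsvm_data.txt against your
HDFS home directory (hence hdfs://master:9000/user/root/... in the error message),
so once the file is uploaded there the run-example command should find it.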

> On 28 Feb 2017, at 02:18, Yunjie Ji <jyjah...@163.com> wrote:
> 
> After starting DFS, YARN, and Spark, I ran this command from the Spark root
> directory on my master host: 
> `MASTER=yarn ./bin/run-example ml.LogisticRegressionExample
> data/mllib/sample_libsvm_data.txt`
> 
> Actually, I got this command from Spark's README. And here is the source code
> for LogisticRegressionExample on GitHub: 
> https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/ml/LogisticRegressionExample.scala
>   
> 
> Then this error occurs: 
> `Exception in thread "main" org.apache.spark.sql.AnalysisException: Path
> does not exist:
> hdfs://master:9000/user/root/data/mllib/sample_libsvm_data.txt;`
> 
> First, I don't know why the path is `hdfs://master:9000/user/root`. I did set
> the namenode's address to `hdfs://master:9000`, but why did Spark choose the
> directory `/user/root`?
> 
> Then I created the directory and file `/user/root/data/mllib/sample_libsvm_data.txt`
> on every host of the cluster, hoping Spark could find the file. But the same
> error occurred again. 
> 
> 
> 

---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscr...@spark.apache.org
