Hi Amal,

For yarn-cluster mode, please use the --scratch-uri argument and point it to a location on HDFS where you have write access. PIO will copy the necessary files to that HDFS location for your yarn-cluster Spark driver to read.
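For example (the HDFS path and namenode host below are placeholders; use any HDFS location your user can write to). Note that --scratch-uri is a pio argument, so it goes before the --, while the arguments passed through to spark-submit go after it:

    pio train --scratch-uri hdfs://your-namenode:8020/tmp/pio-scratch -- --master yarn-cluster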
Regards,
Donald

On Tuesday, October 18, 2016, amal kumar <[email protected]> wrote:
> Hi,
>
> We are using Spark on YARN. After referring to the URL below, we have been
> able to submit jobs to YARN from a remote machine (i.e. the PredictionIO
> server).
>
> http://theckang.com/2015/remote-spark-jobs-on-yarn/
>
> 1. Copied core-site.xml and yarn-site.xml from the YARN cluster onto the
> remote machine (i.e. the PredictionIO server).
> 2. Set the HADOOP_CONF_DIR environment variable in spark-env.sh (of the
> locally installed copy) on the remote machine so that core-site.xml and
> yarn-site.xml can be located.
>
> Now, when I try to train the model using the command below, I get a new
> error.
>
> pio train -- --master yarn-cluster
>
> Error:
> [ERROR] [CreateWorkflow$] Error reading from file: File
> file:/home/user/PredictionIO/SimilarProductRecommendation/engine.json
> does not exist. Aborting workflow
>
> I also tried to pass the file path, but no luck.
>
> pio train -- --master yarn-cluster --files
> file:/home/user/PredictionIO/SimilarProductRecommendation/engine.json
>
> Thanks,
> Amal
