Hi Amal,
For yarn-cluster mode, please use the --scratch-uri argument and point it
to an HDFS location where you have write access. PIO will copy the
necessary files there so that your yarn-cluster Spark driver can read
them.
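
A minimal sketch of the invocation, assuming hdfs:///user/amal/pio-scratch is a writable HDFS directory (the path is illustrative):

```shell
# PIO stages engine.json and the engine jars under the scratch URI,
# so the driver running inside the YARN cluster can read them
pio train --scratch-uri hdfs:///user/amal/pio-scratch -- --master yarn-cluster
```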
Regards,
Donald
On Tuesday, October 18, 2016, amal kumar wrote:
> Hi,
>
> We are using Spark on YARN, and after referring to the URL below, we have
> been able to submit jobs to YARN from a remote machine (i.e. the
> PredictionIO server).
>
> http://theckang.com/2015/remote-spark-jobs-on-yarn/
>
> 1. Copied core-site.xml and yarn-site.xml from the YARN cluster onto the
> remote machine (i.e. the PredictionIO server).
> 2. Set the HADOOP_CONF_DIR environment variable in spark-env.sh (of the
> locally installed copy) on the remote machine so Spark can locate
> core-site.xml and yarn-site.xml.
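>
> A sketch of step 2 (paths are illustrative; adjust for the local Spark
> install):
>
> ```shell
> # in $SPARK_HOME/conf/spark-env.sh on the remote machine
> export HADOOP_CONF_DIR=/etc/hadoop/conf  # directory containing core-site.xml and yarn-site.xml
> ```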
>
>
> Now, when I try to train the model using the command below, I get a new
> error.
>
> pio train -- --master yarn-cluster
>
>
> Error:
> [ERROR] [CreateWorkflow$] Error reading from file: File
> file:/home/user/PredictionIO/SimilarProductRecommendation/engine.json
> does not exist. Aborting workflow
>
>
> I also tried passing the file path explicitly, but no luck.
>
> pio train -- --master yarn-cluster --files file:/home/user/PredictionIO/SimilarProductRecommendation/engine.json
>
>
> Thanks,
> Amal
>