Hi Malik,

This is a topic I've been investigating as well.

Given how EMR manages its clusters & their runtime, I don't think hacking configs to make the PredictionIO host act like a cluster member will be a simple or sustainable approach.

PredictionIO already operates Spark by building `spark-submit` commands:
https://github.com/apache/predictionio/blob/df406bf92463da4a79c8d84ec0ca439feaa0ec7f/tools/src/main/scala/org/apache/predictionio/tools/Runner.scala#L313

Implementing a new AWS EMR command runner in PredictionIO would likely solve a big part of this problem. With it, `pio train` could switch from the existing, plain `spark-submit` command to the AWS CLI, `aws emr add-steps --steps Args=spark-submit`:
https://docs.aws.amazon.com/cli/latest/reference/emr/add-steps.html

The other part of this challenge is uploading the engine assembly JARs (the job code to run on Spark) to the cluster members or to S3, so that the EMR Spark runtime can access them.

On Mon, Feb 5, 2018 at 5:29 AM, Malik Twain <[email protected]> wrote:

> I'm trying to run pio train with Amazon EMR. I copied core-site.xml and
> yarn-site.xml from EMR to my training machine, and configured
> HADOOP_CONF_DIR in pio-env.sh accordingly.
>
> I'm running pio train as below:
>
> pio train -- --master yarn --deploy-mode cluster
>
> It's failing with the following errors:
>
> 18/02/05 11:56:15 INFO Client:
> client token: N/A
> diagnostics: Application application_1517819705059_0007 failed 2 times
> due to AM Container for appattempt_1517819705059_0007_000002 exited with
> exitCode: 1
> Diagnostics: Exception from container-launch.
>
> And below are the errors from EMR stdout and stderr respectively:
>
> java.io.FileNotFoundException: /root/pio.log (Permission denied)
>
> [ERROR] [CreateWorkflow$] Error reading from file: File
> file:/quickstartapp/MyExample/engine.json does not exist. Aborting workflow.
>
>
> Thank you.
>

--
Mars Hall
415-818-7039
Customer Facing Architect
Salesforce Platform / Heroku
San Francisco, California
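
P.S. To make the command-runner idea above a bit more concrete, here is a minimal Python sketch of how an `aws emr add-steps` invocation wrapping `spark-submit` might be assembled. It only builds the argument list and never calls AWS. The function name, cluster ID, S3 URI and step name are hypothetical placeholders, not anything that exists in PredictionIO today. The `Type=CUSTOM_JAR` / `Jar=command-runner.jar` step form is the one I understand the EMR docs to use for running `spark-submit` as a step, and the main class is an assumption based on the `CreateWorkflow$` in your stderr.

```python
import shlex


def build_emr_add_steps_command(cluster_id, assembly_jar_uri, main_class,
                                app_args=(), step_name="pio-train"):
    """Build (but do not run) an `aws emr add-steps` command line.

    All argument values are supplied by the caller; nothing here is read
    from PredictionIO itself. This is a sketch, not an existing runner.
    """
    # The spark-submit invocation that EMR's command-runner.jar would execute.
    spark_submit = ["spark-submit", "--deploy-mode", "cluster",
                    "--class", main_class, assembly_jar_uri, *app_args]

    # The CLI shorthand syntax separates list items with commas, so this
    # sketch simply rejects values it cannot represent safely.
    for part in spark_submit:
        if "," in part or " " in part:
            raise ValueError("unsupported character in argument: %r" % part)

    step = ("Type=CUSTOM_JAR,Name=%s,ActionOnFailure=CONTINUE,"
            "Jar=command-runner.jar,Args=[%s]"
            % (step_name, ",".join(spark_submit)))

    return ["aws", "emr", "add-steps",
            "--cluster-id", cluster_id,
            "--steps", step]


if __name__ == "__main__":
    cmd = build_emr_add_steps_command(
        cluster_id="j-EXAMPLE",                                # hypothetical
        assembly_jar_uri="s3://example-bucket/engine-assembly.jar",  # hypothetical
        main_class="org.apache.predictionio.workflow.CreateWorkflow",  # assumed
    )
    # Print a shell-quoted form for inspection only.
    print(" ".join(shlex.quote(p) for p in cmd))
```

Note that the assembly JAR is referenced by an S3 URI here, which is why the upload step matters: with `--deploy-mode cluster` the driver runs on the cluster, so local paths from the training machine (like the `engine.json` path in your error) are not visible to it.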
