Hi, I have used --master yarn-client and it is working great. But before this you need to copy the Hadoop, YARN, and HBase configs to the PIO machine and set the Hadoop conf dir path (HADOOP_CONF_DIR) in pio-env.sh; a sketch of what that looks like follows.
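A minimal sketch, assuming you copied the cluster's config files under $PIO_HOME/vendors (the vendors/hadoop path is a placeholder; point it at wherever the copied Hadoop/YARN config files actually live on the PIO machine):

# $PIO_HOME/conf/pio-env.sh
SPARK_HOME=$PIO_HOME/vendors/spark
# Hadoop/YARN configs copied from the cluster
# (core-site.xml, hdfs-site.xml, yarn-site.xml):
HADOOP_CONF_DIR=$PIO_HOME/vendors/hadoop/conf
# HBase config copied from the cluster (hbase-site.xml):
HBASE_CONF_DIR=$PIO_HOME/vendors/hbase/conf

Then, from the engine directory:

pio train --master yarn-client

In yarn-client mode the driver stays on the edge node where the PIO and engine jars live, which may be why this mode works here where yarn-cluster did not.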
Thanks and Regards
Ambuj Sharma
Sunrise may late, But Morning is sure.....
Team ML
Betaout

On Fri, Mar 31, 2017 at 11:01 PM, Malay Tripathi <[email protected]> wrote:

> 2017-03-31 13:28:57,084 INFO org.apache.predictionio.tools.console.Console$ [main] - Using existing engine manifest JSON at /home/da_mcom_milan/PredictionIO/personalized-complementary/manifest.json
>
> 2017-03-31 13:28:58,938 INFO org.apache.predictionio.tools.Runner$ [main] - Submission command: /home/da_mcom_milan/PredictionIO/vendors/spark/bin/spark-submit --master yarn-cluster --class org.apache.predictionio.workflow.CreateWorkflow --jars file:/home/da_mcom_milan/PredictionIO/personalized-complementary/target/scala-2.10/template-scala-parallel-universal-recommendation-assembly-0.5.0-deps.jar,file:/home/da_mcom_milan/PredictionIO/personalized-complementary/target/scala-2.10/template-scala-parallel-universal-recommendation_2.10-0.5.0.jar --files file:/home/da_mcom_milan/PredictionIO/conf/log4j.properties,file:/home/da_mcom_milan/PredictionIO/vendors/hbase/conf/hbase-site.xml --driver-class-path /home/da_mcom_milan/PredictionIO/conf:/home/da_mcom_milan/PredictionIO/vendors/hbase/conf file:/home/da_mcom_milan/PredictionIO/lib/pio-assembly-0.10.0-incubating.jar --engine-id 7mVUx7nKCRXWPHAdk46GQOJRtH6VDnqA --engine-version dc0573e7ddab8588f6ae287d7386c2d6827fec86 --engine-variant file:/home/da_mcom_milan/PredictionIO/personalized-complementary/engine.json --verbosity 0 --json-extractor Both --env PIO_STORAGE_SOURCES_HBASE_TYPE=hbase,PIO_ENV_LOADED=1,PIO_STORAGE_REPOSITORIES_METADATA_NAME=pio_meta,PIO_FS_BASEDIR=/home/da_mcom_milan/.pio_store,PIO_STORAGE_SOURCES_ELASTICSEARCH_HOSTS=mdc2vra176,PIO_STORAGE_SOURCES_HBASE_HOME=/home/da_mcom_milan/PredictionIO/vendors/hbase,PIO_HOME=/home/da_mcom_milan/PredictionIO,PIO_FS_ENGINESDIR=/home/da_mcom_milan/.pio_store/engines,PIO_STORAGE_SOURCES_LOCALFS_PATH=/home/da_mcom_milan/.pio_store/models,PIO_STORAGE_SOURCES_ELASTICSEARCH_TYPE=elasticsearch,PIO_STORAGE_REPOSITORIES_METADATA_SOURCE=ELASTICSEARCH,PIO_STORAGE_REPOSITORIES_MODELDATA_SOURCE=LOCALFS,PIO_STORAGE_REPOSITORIES_EVENTDATA_NAME=pio_event,PIO_STORAGE_SOURCES_ELASTICSEARCH_CLUSTERNAME=pros-prod,PIO_FS_TMPDIR=/home/da_mcom_milan/.pio_store/tmp,PIO_STORAGE_REPOSITORIES_MODELDATA_NAME=pio_model,PIO_STORAGE_REPOSITORIES_EVENTDATA_SOURCE=HBASE,PIO_CONF_DIR=/home/da_mcom_milan/PredictionIO/conf,PIO_STORAGE_SOURCES_ELASTICSEARCH_PORTS=9300,PIO_STORAGE_SOURCES_LOCALFS_TYPE=localfs
>
> 17/03/31 13:29:00 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> 17/03/31 13:29:00 INFO TimelineClientImpl: Timeline service address: http://mdc2vra180.federated.fds:8188/ws/v1/timeline/
> 17/03/31 13:29:00 INFO RMProxy: Connecting to ResourceManager at mdc2vra180.federated.fds/11.126.100.180:8050
> 17/03/31 13:29:00 INFO AHSProxy: Connecting to Application History server at mdc2vra180.federated.fds/11.126.100.180:10200
> 17/03/31 13:29:01 WARN DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
> 17/03/31 13:29:01 INFO Client: Requesting a new application from cluster with 8 NodeManagers
> 17/03/31 13:29:01 INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster (47104 MB per container)
> 17/03/31 13:29:01 INFO Client: Will allocate AM container, with 1408 MB memory including 384 MB overhead
> 17/03/31 13:29:01 INFO Client: Setting up container launch context for our AM
> 17/03/31 13:29:01 INFO Client: Setting up the launch environment for our AM container
> 17/03/31 13:29:01 INFO Client: Using the spark assembly jar on HDFS because you are using HDP, defaultSparkAssembly:hdfs://mdc2vra179.federated.fds:8020/hdp/apps/2.5.3.0-37/spark/spark-hdp-assembly.jar
> 17/03/31 13:29:01 INFO Client: Preparing resources for our AM container
> 17/03/31 13:29:01 INFO Client: Using the spark assembly jar on HDFS because you are using HDP, defaultSparkAssembly:hdfs://mdc2vra179.federated.fds:8020/hdp/apps/2.5.3.0-37/spark/spark-hdp-assembly.jar
> 17/03/31 13:29:01 INFO Client: Source and destination file systems are the same. Not copying hdfs://mdc2vra179.federated.fds:8020/hdp/apps/2.5.3.0-37/spark/spark-hdp-assembly.jar
> 17/03/31 13:29:01 INFO Client: Uploading resource file:/home/da_mcom_milan/PredictionIO/lib/pio-assembly-0.10.0-incubating.jar -> hdfs://mdc2vra179.federated.fds:8020/user/da_mcom_milan/.sparkStaging/application_1489598450058_0028/pio-assembly-0.10.0-incubating.jar
> 17/03/31 13:29:02 INFO Client: Uploading resource file:/home/da_mcom_milan/PredictionIO/personalized-complementary/target/scala-2.10/template-scala-parallel-universal-recommendation-assembly-0.5.0-deps.jar -> hdfs://mdc2vra179.federated.fds:8020/user/da_mcom_milan/.sparkStaging/application_1489598450058_0028/template-scala-parallel-universal-recommendation-assembly-0.5.0-deps.jar
> 17/03/31 13:29:02 INFO Client: Uploading resource file:/home/da_mcom_milan/PredictionIO/personalized-complementary/target/scala-2.10/template-scala-parallel-universal-recommendation_2.10-0.5.0.jar -> hdfs://mdc2vra179.federated.fds:8020/user/da_mcom_milan/.sparkStaging/application_1489598450058_0028/template-scala-parallel-universal-recommendation_2.10-0.5.0.jar
> 17/03/31 13:29:02 INFO Client: Uploading resource file:/home/da_mcom_milan/PredictionIO/conf/log4j.properties -> hdfs://mdc2vra179.federated.fds:8020/user/da_mcom_milan/.sparkStaging/application_1489598450058_0028/log4j.properties
> 17/03/31 13:29:03 INFO Client: Uploading resource file:/home/da_mcom_milan/PredictionIO/vendors/hbase/conf/hbase-site.xml -> hdfs://mdc2vra179.federated.fds:8020/user/da_mcom_milan/.sparkStaging/application_1489598450058_0028/hbase-site.xml
> 17/03/31 13:29:03 INFO Client: Uploading resource file:/tmp/spark-9edc270b-3291-4913-8324-5f9e3ec4810f/__spark_conf__2400158678974980853.zip -> hdfs://mdc2vra179.federated.fds:8020/user/da_mcom_milan/.sparkStaging/application_1489598450058_0028/__spark_conf__2400158678974980853.zip
> 17/03/31 13:29:03 INFO SecurityManager: Changing view acls to: da_mcom_milan
> 17/03/31 13:29:03 INFO SecurityManager: Changing modify acls to: da_mcom_milan
> 17/03/31 13:29:03 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(da_mcom_milan); users with modify permissions: Set(da_mcom_milan)
> 17/03/31 13:29:04 INFO Client: Submitting application 28 to ResourceManager
> 17/03/31 13:29:04 INFO YarnClientImpl: Submitted application application_1489598450058_0028
> 17/03/31 13:29:05 INFO Client: Application report for application_1489598450058_0028 (state: ACCEPTED)
> 17/03/31 13:29:05 INFO Client:
>      client token: N/A
>      diagnostics: AM container is launched, waiting for AM container to Register with RM
>      ApplicationMaster host: N/A
>      ApplicationMaster RPC port: -1
>      queue: default
>      start time: 1490981344043
>      final status: UNDEFINED
>      tracking URL: http://mdc2vra180.federated.fds:8088/proxy/application_1489598450058_0028/
>      user: da_mcom_milan
> 17/03/31 13:29:06 INFO Client: Application report for application_1489598450058_0028 (state: ACCEPTED)
> 17/03/31 13:29:07 INFO Client: Application report for application_1489598450058_0028 (state: ACCEPTED)
> 17/03/31 13:29:08 INFO Client: Application report for application_1489598450058_0028 (state: ACCEPTED)
> 17/03/31 13:29:09 INFO Client: Application report for application_1489598450058_0028 (state: ACCEPTED)
> 17/03/31 13:29:10 INFO Client: Application report for application_1489598450058_0028 (state: ACCEPTED)
> 17/03/31 13:29:11 INFO Client: Application report for application_1489598450058_0028 (state: FAILED)
> 17/03/31 13:29:11 INFO Client:
>      client token: N/A
>      diagnostics: Application application_1489598450058_0028 failed 2 times due to AM Container for appattempt_1489598450058_0028_000002 exited with exitCode: -1000
> For more detailed output, check the application tracking page: http://mdc2vra180.federated.fds:8088/cluster/app/application_1489598450058_0028 Then click on links to logs of each attempt.
> Diagnostics: File does not exist: hdfs://mdc2vra179.federated.fds:8020/user/da_mcom_milan/.sparkStaging/application_1489598450058_0028/template-scala-parallel-universal-recommendation-assembly-0.5.0-deps.jar
> java.io.FileNotFoundException: File does not exist: hdfs://mdc2vra179.federated.fds:8020/user/da_mcom_milan/.sparkStaging/application_1489598450058_0028/template-scala-parallel-universal-recommendation-assembly-0.5.0-deps.jar
>      at org.apache.hadoop.hdfs.DistributedFileSystem$25.doCall(DistributedFileSystem.java:1427)
>      at org.apache.hadoop.hdfs.DistributedFileSystem$25.doCall(DistributedFileSystem.java:1419)
>      at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>      at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1419)
>      at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:253)
>      at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:63)
>      at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:361)
>      at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
>      at java.security.AccessController.doPrivileged(Native Method)
>      at javax.security.auth.Subject.doAs(Subject.java:422)
>      at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
>      at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:358)
>      at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:62)
>      at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>      at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>      at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>      at java.lang.Thread.run(Thread.java:745)
> Failing this attempt. Failing the application.
>      ApplicationMaster host: N/A
>      ApplicationMaster RPC port: -1
>      queue: default
>      start time: 1490981344043
>      final status: FAILED
>      tracking URL: http://mdc2vra180.federated.fds:8088/cluster/app/application_1489598450058_0028
>      user: da_mcom_milan
> Exception in thread "main" org.apache.spark.SparkException: Application application_1489598450058_0028 finished with failed status
>      at org.apache.spark.deploy.yarn.Client.run(Client.scala:1122)
>      at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1169)
>      at org.apache.spark.deploy.yarn.Client.main(Client.scala)
>      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>      at java.lang.reflect.Method.invoke(Method.java:498)
>      at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:738)
>      at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
>      at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
>      at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
>      at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> 17/03/31 13:29:11 INFO ShutdownHookManager: Shutdown hook called
> 17/03/31 13:29:11 INFO ShutdownHookManager: Deleting directory /tmp/spark-9edc270b-3291-4913-8324-5f9e3ec4810f
>
> On Fri, Mar 31, 2017 at 9:22 AM, Donald Szeto <[email protected]> wrote:
>
>> Can you show the relevant parts from pio.log, please? If you don't care about existing log messages, the easiest way would be to delete pio.log from where you run the pio command and start fresh.
>>
>> On Fri, Mar 31, 2017 at 8:46 AM, Malay Tripathi <[email protected]> wrote:
>>
>>> I think it's YARN based, set up through Ambari.
>>>
>>> On Mar 31, 2017, at 6:29 AM, Donald Szeto <[email protected]> wrote:
>>>
>>> Hi Malay,
>>>
>>> Is your Spark cluster a standalone deployment or based on YARN?
>>>
>>> Regards,
>>> Donald
>>>
>>> On Thu, Mar 30, 2017 at 11:48 PM Malay Tripathi <[email protected]> wrote:
>>>
>>>> Hello,
>>>>
>>>> I am running pio train on an edge node of a distributed 8-node Spark cluster and a 3-node HBase cluster.
>>>> When I run "pio train" the job runs, but it runs on local Spark and is not submitted to the cluster.
>>>> If I do "pio train --master spark://localhost:7077" or "pio train --master yarn-cluster" I get the error below:
>>>>
>>>> File does not exist: hdfs://mdc2vra179.federated.fds:8020/user/da_mcom_milan/.sparkStaging/application_1489598450058_0024/template-scala-parallel-universal-recommendation-assembly-0.5.0-deps.jar
>>>> java.io.FileNotFoundException: File does not exist: hdfs://mdc2vra179.federated.fds:8020/user/da_mcom_milan/.sparkStaging/application_1489598450058_0024/template-scala-parallel-universal-recommendation-assembly-0.5.0-deps.jar
>>>> at org.apache.hadoop.hdfs.DistributedFileSystem$25.doCall(DistributedFileSystem.java:1427)
>>>> at org.apache.hadoop.hdfs.DistributedFileSystem$25.doCall(DistributedFileSystem.java:1419)
>>>>
>>>> mdc2vra179 is my HBase cluster node, also running the NameNode. Not sure why Spark is expecting a jar file on the HBase/NameNode host.
>>>>
>>>> $PIO_HOME/conf/pio-env.sh:
>>>> SPARK_HOME=$PIO_HOME/vendors/spark
>>>> HBASE_CONF_DIR=$PIO_HOME/vendors/hbase/conf
>>>> PIO_FS_BASEDIR=$HOME/.pio_store
>>>> PIO_FS_ENGINESDIR=$PIO_FS_BASEDIR/engines
>>>> PIO_FS_TMPDIR=$PIO_FS_BASEDIR/tmp
>>>> PIO_STORAGE_REPOSITORIES_METADATA_NAME=pio_meta
>>>> PIO_STORAGE_REPOSITORIES_METADATA_SOURCE=ELASTICSEARCH
>>>> PIO_STORAGE_REPOSITORIES_EVENTDATA_NAME=pio_event
>>>> PIO_STORAGE_REPOSITORIES_EVENTDATA_SOURCE=HBASE
>>>> PIO_STORAGE_REPOSITORIES_MODELDATA_NAME=pio_model
>>>> PIO_STORAGE_REPOSITORIES_MODELDATA_SOURCE=LOCALFS
>>>> PIO_STORAGE_SOURCES_LOCALFS_TYPE=localfs
>>>> PIO_STORAGE_SOURCES_ELASTICSEARCH_TYPE=elasticsearch
>>>> PIO_STORAGE_SOURCES_ELASTICSEARCH_CLUSTERNAME=pros-prod
>>>> PIO_STORAGE_SOURCES_ELASTICSEARCH_HOSTS=mdc2vra176
>>>> PIO_STORAGE_SOURCES_ELASTICSEARCH_PORTS=9300
>>>> PIO_STORAGE_SOURCES_LOCALFS_PATH=$PIO_FS_BASEDIR/models
>>>> PIO_STORAGE_SOURCES_HBASE_TYPE=hbase
>>>> PIO_STORAGE_SOURCES_HBASE_HOME=$PIO_HOME/vendors/hbase
>>>>
>>>> Thanks,
>>>> Malay
