I noticed the appName differs between the DataSource ("shop _live", with a stray space) and the Algorithm ("shop_live"). The appNames must match.
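As a rough sketch, the matching appNames would look something like this in engine.json (the field layout follows the Universal Recommender template; the app name and event list are taken from your logs, everything else is trimmed out for illustration):

```json
{
  "datasource": {
    "params": {
      "appName": "shop_live",
      "eventNames": ["purchase", "basket-add", "wishlist-add", "view"]
    }
  },
  "algorithms": [
    {
      "name": "ur",
      "params": {
        "appName": "shop_live",
        "eventNames": ["purchase", "basket-add", "wishlist-add", "view"]
      }
    }
  ]
}
```

The point is only that both "appName" values must be byte-for-byte identical; whichever spelling is the real app name, use it in both places.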
Also, the eventNames differ, which should be OK, but it still raises the question: why input events that are not used? Given the meaning of the events, I'd use them all for recommendations, but you may eventually want to create separate shopping-cart and wishlist models, since these will yield "complementary purchases" and "things you may be missing" in the wishlist.

From: Wojciech Kowalski <wojci...@tomandco.co.uk>
Reply: user@predictionio.apache.org
Date: May 23, 2018 at 5:17:06 AM
To: Ambuj Sharma <am...@getamplify.com>, user@predictionio.apache.org
Subject: RE: Problem with training in yarn cluster

Hello again,

After moving HBase from Docker to the Dataproc cluster (probably DNS/hostname resolution issues), there is no more HBase error, but training still stops:

[INFO] [RecommendationEngine$] [ASCII art banner: ActionML]
[INFO] [Engine] Extracting datasource params...
[INFO] [WorkflowUtils$] No 'name' is found. Default empty String will be used.
[INFO] [Engine] Datasource params: (,DataSourceParams(shop _live,List(purchase, basket-add, wishlist-add, view),None,None))
[INFO] [Engine] Extracting preparator params...
[INFO] [Engine] Preparator params: (,Empty)
[INFO] [Engine] Serving params: (,Empty)
[INFO] [log] Logging initialized @10046ms
[INFO] [Server] jetty-9.2.z-SNAPSHOT
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@7a6f5572{/jobs,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@2679cc20{/jobs/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@489e0d2e{/jobs/job,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@720aa19c{/jobs/job/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@724eae6a{/stages,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@1a3e64cf{/stages/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@2271fddb{/stages/stage,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@550be48{/stages/stage/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@2ea7d76{/stages/pool,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@6b9b69f8{/stages/pool/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@46a9ce75{/storage,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@468b9a16{/storage/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@175b4e7c{/storage/rdd,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@27bf31c6{/storage/rdd/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@2f6d8922{/environment,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@35acfdf3{/environment/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@78496d94{/executors,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@26a6525a{/executors/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@65c1fb35{/executors/threadDump,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@3750c11b{/executors/threadDump/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@4462fa8{/static,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@10e699f8{/,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@7a14c082{/api,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@4bfd8ec2{/jobs/job/kill,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@7ef3c37a{/stages/stage/kill,null,AVAILABLE,@Spark}
[INFO] [ServerConnector] Started Spark@6a00b5d1{HTTP/1.1}{0.0.0.0:49349}
[INFO] [Server] Started @10430ms
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@379fcbd1{/metrics/json,null,AVAILABLE,@Spark}
[WARN] [YarnSchedulerBackend$YarnSchedulerEndpoint] Attempted to request executors before the AM has registered!
[INFO] [DataSource]
╔════════════════════════════════════════════════════════════╗
║ Init DataSource                                            ║
║ ══════════════════════════════════════════════════════════ ║
║ App name              shop _live                           ║
║ Event window          None                                 ║
║ Event names           List(purchase, basket-add, wishlist-add, view) ║
║ Min events per user   None                                 ║
╚════════════════════════════════════════════════════════════╝
[INFO] [URAlgorithm]
╔════════════════════════════════════════════════════════════╗
║ Init URAlgorithm                                           ║
║ ══════════════════════════════════════════════════════════ ║
║ App name                    shop_live                      ║
║ ES index name               oburindex                      ║
║ ES type name                items                          ║
║ RecsModel                   all                            ║
║ Event names                 List(purchase, view)           ║
║ ══════════════════════════════════════════════════════════ ║
║ Random seed                 -1931119310                    ║
║ MaxCorrelatorsPerEventType  50                             ║
║ MaxEventsPerEventType       500                            ║
║ BlacklistEvents             List(purchase)                 ║
║ ══════════════════════════════════════════════════════════ ║
║ User bias                   1.0                            ║
║ Item bias                   1.0                            ║
║ Max query events            100                            ║
║ Limit                       20                             ║
║ ══════════════════════════════════════════════════════════ ║
║ Rankings:                                                  ║
║   popular                   Some(popRank)                  ║
╚════════════════════════════════════════════════════════════╝
[INFO] [Engine$] EngineWorkflow.train
[INFO] [Engine$] DataSource: com.actionml.DataSource@4953588a
[INFO] [Engine$] Preparator: com.actionml.Preparator@715d8f93
[INFO] [Engine$] AlgorithmList: List(com.actionml.URAlgorithm@50c15628)
[INFO] [Engine$] Data sanity check is on.
[WARN] [ApplicationMaster] Reporter thread fails 1 time(s) in a row.
[WARN] [ApplicationMaster] Reporter thread fails 2 time(s) in a row.
[WARN] [ApplicationMaster] Reporter thread fails 3 time(s) in a row.
[WARN] [ApplicationMaster] Reporter thread fails 4 time(s) in a row.
[INFO] [ServerConnector] Stopped Spark@6a00b5d1{HTTP/1.1}{0.0.0.0:0}
[INFO] [ContextHandler] Stopped o.s.j.s.ServletContextHandler@7ef3c37a{/stages/stage/kill,null,UNAVAILABLE,@Spark}
[INFO] [ContextHandler] Stopped o.s.j.s.ServletContextHandler@4bfd8ec2{/jobs/job/kill,null,UNAVAILABLE,@Spark}
[INFO] [ContextHandler] Stopped o.s.j.s.ServletContextHandler@7a14c082{/api,null,UNAVAILABLE,@Spark}
[INFO] [ContextHandler] Stopped o.s.j.s.ServletContextHandler@10e699f8{/,null,UNAVAILABLE,@Spark}
[INFO] [ContextHandler] Stopped o.s.j.s.ServletContextHandler@4462fa8{/static,null,UNAVAILABLE,@Spark}
[INFO] [ContextHandler] Stopped o.s.j.s.ServletContextHandler@3750c11b{/executors/threadDump/json,null,UNAVAILABLE,@Spark}
[INFO] [ContextHandler] Stopped o.s.j.s.ServletContextHandler@65c1fb35{/executors/threadDump,null,UNAVAILABLE,@Spark}
[INFO] [ContextHandler] Stopped o.s.j.s.ServletContextHandler@26a6525a{/executors/json,null,UNAVAILABLE,@Spark}
[INFO] [ContextHandler] Stopped o.s.j.s.ServletContextHandler@78496d94{/executors,null,UNAVAILABLE,@Spark}
[INFO] [ContextHandler] Stopped o.s.j.s.ServletContextHandler@35acfdf3{/environment/json,null,UNAVAILABLE,@Spark}
[INFO] [ContextHandler] Stopped o.s.j.s.ServletContextHandler@2f6d8922{/environment,null,UNAVAILABLE,@Spark}
[INFO] [ContextHandler] Stopped o.s.j.s.ServletContextHandler@27bf31c6{/storage/rdd/json,null,UNAVAILABLE,@Spark}
[INFO] [ContextHandler] Stopped o.s.j.s.ServletContextHandler@175b4e7c{/storage/rdd,null,UNAVAILABLE,@Spark}
[INFO] [ContextHandler] Stopped o.s.j.s.ServletContextHandler@468b9a16{/storage/json,null,UNAVAILABLE,@Spark}
[INFO] [ContextHandler] Stopped o.s.j.s.ServletContextHandler@46a9ce75{/storage,null,UNAVAILABLE,@Spark}
[INFO] [ContextHandler] Stopped o.s.j.s.ServletContextHandler@6b9b69f8{/stages/pool/json,null,UNAVAILABLE,@Spark}
[INFO] [ContextHandler] Stopped o.s.j.s.ServletContextHandler@2ea7d76{/stages/pool,null,UNAVAILABLE,@Spark}
[INFO] [ContextHandler] Stopped o.s.j.s.ServletContextHandler@550be48{/stages/stage/json,null,UNAVAILABLE,@Spark}
[INFO] [ContextHandler] Stopped o.s.j.s.ServletContextHandler@2271fddb{/stages/stage,null,UNAVAILABLE,@Spark}
[INFO] [ContextHandler] Stopped o.s.j.s.ServletContextHandler@1a3e64cf{/stages/json,null,UNAVAILABLE,@Spark}
[INFO] [ContextHandler] Stopped o.s.j.s.ServletContextHandler@724eae6a{/stages,null,UNAVAILABLE,@Spark}
[INFO] [ContextHandler] Stopped o.s.j.s.ServletContextHandler@720aa19c{/jobs/job/json,null,UNAVAILABLE,@Spark}
[INFO] [ContextHandler] Stopped o.s.j.s.ServletContextHandler@489e0d2e{/jobs/job,null,UNAVAILABLE,@Spark}
[INFO] [ContextHandler] Stopped o.s.j.s.ServletContextHandler@2679cc20{/jobs/json,null,UNAVAILABLE,@Spark}
[INFO] [ContextHandler] Stopped o.s.j.s.ServletContextHandler@7a6f5572{/jobs,null,UNAVAILABLE,@Spark}
[ERROR] [LiveListenerBus] SparkListenerBus has already stopped! Dropping event SparkListenerStageCompleted(org.apache.spark.scheduler.StageInfo@e1518c9)
[ERROR] [LiveListenerBus] SparkListenerBus has already stopped! Dropping event SparkListenerJobEnd(0,1527077245287,JobFailed(org.apache.spark.SparkException: Job 0 cancelled because SparkContext was shut down))

Also in stderr(?)
this:

[Stage 0:> (0 + 0) / 5]

YARN app info:

User: pio
Name: org.apache.predictionio.workflow.CreateWorkflow
Application Type: SPARK
Application Tags:
Application Priority: 0 (Higher Integer value indicates higher priority)
YarnApplicationState: FINISHED
Queue: default
FinalStatus Reported by AM: FAILED
Started: Wed May 23 12:06:44 +0000 2018
Elapsed: 40sec
Tracking URL: History <http://pio-cluster-m:8088/proxy/application_1526996273517_0030/>
Log Aggregation Status: DISABLED
Diagnostics: Exception was thrown 5 time(s) from Reporter thread.
Unmanaged Application: false
Application Node Label expression: <Not set>
AM container Node Label expression: <DEFAULT_PARTITION>

Thanks,
Wojciech

From: Wojciech Kowalski <wojci...@tomandco.co.uk>
Sent: 23 May 2018 11:26
To: Ambuj Sharma <am...@getamplify.com>; user@predictionio.apache.org
Subject: RE: Problem with training in yarn cluster

Hi,

OK, so the full command is now:

pio train --scratch-uri hdfs://pio-cluster-m/pio -- --executor-memory 4g --driver-memory 4g --deploy-mode cluster --master yarn

The errors stopped after removing --executor-cores 2 --driver-cores 2. I found this error:

Uncaught exception: org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException: Invalid resource request, requested virtual cores < 0, or requested virtual cores > max configured, requestedVirtualCores=4, maxVirtualCores=2

But now I have a problem with HBase :/ I have the HBase host set:

declare -x PIO_STORAGE_SOURCES_HBASE_HOSTS="pio-gc"

[INFO] [Engine$] EngineWorkflow.train
[INFO] [Engine$] DataSource: com.actionml.DataSource@2fdb4e2e
[INFO] [Engine$] Preparator: com.actionml.Preparator@d257dd4
[INFO] [Engine$] AlgorithmList: List(com.actionml.URAlgorithm@400bbb7)
[INFO] [Engine$] Data sanity check is on.
[ERROR] [StorageClient] HBase master is not running (ZooKeeper ensemble: pio-cluster-m). Please make sure that HBase is running properly, and that the configuration is pointing at the correct ZooKeeper ensemble.
[ERROR] [Storage$] Error initializing storage client for source HBASE.
org.apache.hadoop.hbase.MasterNotRunningException: com.google.protobuf.ServiceException: java.net.UnknownHostException: unknown host: hbase-master
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$StubMaker.makeStub(HConnectionManager.java:1645)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$MasterServiceStubMaker.makeStub(HConnectionManager.java:1671)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getKeepAliveMasterService(HConnectionManager.java:1878)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.isMasterRunning(HConnectionManager.java:894)
        at org.apache.hadoop.hbase.client.HBaseAdmin.checkHBaseAvailable(HBaseAdmin.java:2366)
        at org.apache.predictionio.data.storage.hbase.StorageClient.<init>(StorageClient.scala:53)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.apache.predictionio.data.storage.Storage$.getClient(Storage.scala:252)
        at org.apache.predictionio.data.storage.Storage$.org$apache$predictionio$data$storage$Storage$$updateS2CM(Storage.scala:283)
        at org.apache.predictionio.data.storage.Storage$$anonfun$sourcesToClientMeta$1.apply(Storage.scala:244)
        at org.apache.predictionio.data.storage.Storage$$anonfun$sourcesToClientMeta$1.apply(Storage.scala:244)
        at scala.collection.mutable.MapLike$class.getOrElseUpdate(MapLike.scala:194)
        at scala.collection.mutable.AbstractMap.getOrElseUpdate(Map.scala:80)
        at org.apache.predictionio.data.storage.Storage$.sourcesToClientMeta(Storage.scala:244)
        at org.apache.predictionio.data.storage.Storage$.getDataObject(Storage.scala:315)
        at org.apache.predictionio.data.storage.Storage$.getPDataObject(Storage.scala:364)
        at org.apache.predictionio.data.storage.Storage$.getPDataObject(Storage.scala:307)
        at org.apache.predictionio.data.storage.Storage$.getPEvents(Storage.scala:454)
        at org.apache.predictionio.data.store.PEventStore$.eventsDb$lzycompute(PEventStore.scala:37)
        at org.apache.predictionio.data.store.PEventStore$.eventsDb(PEventStore.scala:37)
        at org.apache.predictionio.data.store.PEventStore$.find(PEventStore.scala:73)
        at com.actionml.DataSource.readTraining(DataSource.scala:76)
        at com.actionml.DataSource.readTraining(DataSource.scala:48)
        at org.apache.predictionio.controller.PDataSource.readTrainingBase(PDataSource.scala:40)
        at org.apache.predictionio.controller.Engine$.train(Engine.scala:642)
        at org.apache.predictionio.controller.Engine.train(Engine.scala:176)
        at org.apache.predictionio.workflow.CoreWorkflow$.runTrain(CoreWorkflow.scala:67)
        at org.apache.predictionio.workflow.CreateWorkflow$.main(CreateWorkflow.scala:251)
        at org.apache.predictionio.workflow.CreateWorkflow.main(CreateWorkflow.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:637)
Caused by: com.google.protobuf.ServiceException: java.net.UnknownHostException: unknown host: hbase-master
        at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1678)
        at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1719)
        at org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$BlockingStub.isMasterRunning(MasterProtos.java:42561)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$MasterServiceStubMaker.isMasterRunning(HConnectionManager.java:1682)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$StubMaker.makeStubNoRetries(HConnectionManager.java:1591)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$StubMaker.makeStub(HConnectionManager.java:1617)
        ... 36 more
Caused by: java.net.UnknownHostException: unknown host: hbase-master
        at org.apache.hadoop.hbase.ipc.RpcClient$Connection.<init>(RpcClient.java:385)
        at org.apache.hadoop.hbase.ipc.RpcClient.createConnection(RpcClient.java:351)
        at org.apache.hadoop.hbase.ipc.RpcClient.getConnection(RpcClient.java:1530)
        at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1442)
        at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1661)
        ... 41 more

From: Ambuj Sharma <am...@getamplify.com>
Sent: 23 May 2018 08:59
To: user@predictionio.apache.org
Cc: Wojciech Kowalski <wojci...@tomandco.co.uk>
Subject: Re: Problem with training in yarn cluster

Hi Wojciech,

I also faced many problems while setting up YARN with PredictionIO. This may be a case where YARN is trying to find the pio.log file on the HDFS cluster. You can try "--master yarn --deploy-mode client". You need to pass this configuration with pio train, e.g.:

pio train -- --master yarn --deploy-mode client

Thanks and Regards
Ambuj Sharma
Sunrise may late, But Morning is sure.....
Team ML
Betaout

On Wed, May 23, 2018 at 4:53 AM, Pat Ferrel <p...@occamsmachete.com> wrote:

Actually you might search the archives for “yarn” because I don’t recall how the setup works off hand.
Archives here: https://lists.apache.org/list.html?user@predictionio.apache.org

Also check the Spark YARN requirements, and remember that `pio train … -- various Spark params` allows you to pass arbitrary Spark params on the pio command line exactly as you would to spark-submit. The double dash separates PIO and Spark params.

From: Pat Ferrel <p...@occamsmachete.com>
Reply: user@predictionio.apache.org
Date: May 22, 2018 at 4:07:38 PM
To: user@predictionio.apache.org, Wojciech Kowalski <wojci...@tomandco.co.uk>
Subject: RE: Problem with training in yarn cluster

What is the command line for `pio train …`? Specifically, are you using yarn-cluster mode? This causes the driver code, which is a PIO process, to be executed on an executor. Special setup is required for this.

From: Wojciech Kowalski <wojci...@tomandco.co.uk>
Reply: user@predictionio.apache.org
Date: May 22, 2018 at 2:28:43 PM
To: user@predictionio.apache.org
Subject: RE: Problem with training in yarn cluster

Hello,

Actually I have another error in the logs that is preventing training as well:

[INFO] [RecommendationEngine$] [ASCII art banner: ActionML]
[INFO] [Engine] Extracting datasource params...
[INFO] [WorkflowUtils$] No 'name' is found. Default empty String will be used.
[INFO] [Engine] Datasource params: (,DataSourceParams(shop_live,List(purchase, basket-add, wishlist-add, view),None,None))
[INFO] [Engine] Extracting preparator params...
[INFO] [Engine] Preparator params: (,Empty)
[INFO] [Engine] Extracting serving params...
[INFO] [Engine] Serving params: (,Empty)
[INFO] [log] Logging initialized @6774ms
[INFO] [Server] jetty-9.2.z-SNAPSHOT
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@1798eb08{/jobs,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@47c4c3cd{/jobs/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@3e080dea{/jobs/job,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@c75847b{/jobs/job/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@5ce5ee56{/stages,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@3dde94ac{/stages/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@4347b9a0{/stages/stage,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@63b1bbef{/stages/stage/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@10556e91{/stages/pool,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@5967f3c3{/stages/pool/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@2793dbf6{/storage,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@49936228{/storage/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@7289bc6d{/storage/rdd,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@1496b014{/storage/rdd/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@2de3951b{/environment,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@7f3330ad{/environment/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@40e681f2{/executors,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@61519fea{/executors/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@502b9596{/executors/threadDump,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@367b7166{/executors/threadDump/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@42669f4a{/static,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@2f25f623{/,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@23ae4174{/api,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@4e33e426{/jobs/job/kill,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@38d9ae65{/stages/stage/kill,null,AVAILABLE,@Spark}
[INFO] [ServerConnector] Started Spark@17239b3{HTTP/1.1}{0.0.0.0:47948}
[INFO] [Server] Started @7040ms
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@16cffbe4{/metrics/json,null,AVAILABLE,@Spark}
[WARN] [YarnSchedulerBackend$YarnSchedulerEndpoint] Attempted to request executors before the AM has registered!
[ERROR] [ApplicationMaster] Uncaught exception:

Thanks,
Wojciech

From: Wojciech Kowalski <wojci...@tomandco.co.uk>
Sent: 22 May 2018 23:20
To: user@predictionio.apache.org
Subject: Problem with training in yarn cluster

Hello,

I am trying to set up a distributed cluster with all services separated, but I have a problem while running training:

log4j:ERROR setFile(null,true) call failed.
java.io.FileNotFoundException: /pio/pio.log (No such file or directory)
        at java.io.FileOutputStream.open0(Native Method)
        at java.io.FileOutputStream.open(FileOutputStream.java:270)
        at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
        at java.io.FileOutputStream.<init>(FileOutputStream.java:133)
        at org.apache.log4j.FileAppender.setFile(FileAppender.java:294)
        at org.apache.log4j.FileAppender.activateOptions(FileAppender.java:165)
        at org.apache.log4j.config.PropertySetter.activate(PropertySetter.java:307)
        at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:172)
        at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:104)
        at org.apache.log4j.PropertyConfigurator.parseAppender(PropertyConfigurator.java:842)
        at org.apache.log4j.PropertyConfigurator.parseCategory(PropertyConfigurator.java:768)
        at org.apache.log4j.PropertyConfigurator.configureRootCategory(PropertyConfigurator.java:648)
        at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:514)
        at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:580)
        at org.apache.log4j.helpers.OptionConverter.selectAndConfigure(OptionConverter.java:526)
        at org.apache.log4j.LogManager.<clinit>(LogManager.java:127)
        at org.apache.spark.internal.Logging$class.initializeLogging(Logging.scala:117)
        at org.apache.spark.internal.Logging$class.initializeLogIfNecessary(Logging.scala:102)
        at org.apache.spark.deploy.yarn.ApplicationMaster$.initializeLogIfNecessary(ApplicationMaster.scala:738)
        at org.apache.spark.internal.Logging$class.log(Logging.scala:46)
        at org.apache.spark.deploy.yarn.ApplicationMaster$.log(ApplicationMaster.scala:738)
        at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:753)
        at org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)

Setup:
- HBase
- Hadoop HDFS
- Spark cluster with YARN
- Training in cluster mode

I assume the Spark worker is trying to save the log to /pio/pio.log on the worker machine instead of on the pio host. How can I set the pio log destination to an HDFS path? Or any other advice?

Thanks,
Wojciech
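One possible workaround, sketched under the assumption that the failing appender is the stock log4j 1.x FileAppender pointed at /pio/pio.log: in yarn-cluster mode the driver runs on a YARN container where that path does not exist, so routing the root logger to the console (which YARN captures in the container logs, viewable via the Tracking URL or `yarn logs`) avoids the local-file dependency. Which log4j.properties file PIO actually ships to the container is an assumption here; treat this as illustrative, not the definitive fix:

```properties
# Sketch: send driver/AM logging to stderr instead of the local file
# /pio/pio.log, which is absent on YARN container hosts.
log4j.rootLogger=INFO, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.Target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss} %-5p %c{1} - %m%n
```

YARN then aggregates these container logs itself (when log aggregation is enabled), which is usually simpler than trying to make log4j write directly to an HDFS path.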