Hello again,

After moving HBase from Docker to the Dataproc cluster (probably DNS/hostname 
resolution issues), the HBase error is gone, but training still stops:

[INFO] [RecommendationEngine$] 

               _   _             __  __ _
     /\       | | (_)           |  \/  | |
    /  \   ___| |_ _  ___  _ __ | \  / | |
   / /\ \ / __| __| |/ _ \| '_ \| |\/| | |
  / ____ \ (__| |_| | (_) | | | | |  | | |____
 /_/    \_\___|\__|_|\___/|_| |_|_|  |_|______|


      
[INFO] [Engine] Extracting datasource params...
[INFO] [WorkflowUtils$] No 'name' is found. Default empty String will be used.
[INFO] [Engine] Datasource params: (,DataSourceParams(shop_live,List(purchase, 
basket-add, wishlist-add, view),None,None))
[INFO] [Engine] Extracting preparator params...
[INFO] [Engine] Preparator params: (,Empty)
[INFO] [Engine] Extracting serving params...
[INFO] [Engine] Serving params: (,Empty)
[INFO] [log] Logging initialized @10046ms
[INFO] [Server] jetty-9.2.z-SNAPSHOT
[INFO] [ContextHandler] Started 
o.s.j.s.ServletContextHandler@7a6f5572{/jobs,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started 
o.s.j.s.ServletContextHandler@2679cc20{/jobs/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started 
o.s.j.s.ServletContextHandler@489e0d2e{/jobs/job,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started 
o.s.j.s.ServletContextHandler@720aa19c{/jobs/job/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started 
o.s.j.s.ServletContextHandler@724eae6a{/stages,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started 
o.s.j.s.ServletContextHandler@1a3e64cf{/stages/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started 
o.s.j.s.ServletContextHandler@2271fddb{/stages/stage,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started 
o.s.j.s.ServletContextHandler@550be48{/stages/stage/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started 
o.s.j.s.ServletContextHandler@2ea7d76{/stages/pool,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started 
o.s.j.s.ServletContextHandler@6b9b69f8{/stages/pool/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started 
o.s.j.s.ServletContextHandler@46a9ce75{/storage,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started 
o.s.j.s.ServletContextHandler@468b9a16{/storage/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started 
o.s.j.s.ServletContextHandler@175b4e7c{/storage/rdd,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started 
o.s.j.s.ServletContextHandler@27bf31c6{/storage/rdd/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started 
o.s.j.s.ServletContextHandler@2f6d8922{/environment,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started 
o.s.j.s.ServletContextHandler@35acfdf3{/environment/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started 
o.s.j.s.ServletContextHandler@78496d94{/executors,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started 
o.s.j.s.ServletContextHandler@26a6525a{/executors/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started 
o.s.j.s.ServletContextHandler@65c1fb35{/executors/threadDump,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started 
o.s.j.s.ServletContextHandler@3750c11b{/executors/threadDump/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started 
o.s.j.s.ServletContextHandler@4462fa8{/static,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started 
o.s.j.s.ServletContextHandler@10e699f8{/,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started 
o.s.j.s.ServletContextHandler@7a14c082{/api,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started 
o.s.j.s.ServletContextHandler@4bfd8ec2{/jobs/job/kill,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started 
o.s.j.s.ServletContextHandler@7ef3c37a{/stages/stage/kill,null,AVAILABLE,@Spark}
[INFO] [ServerConnector] Started Spark@6a00b5d1{HTTP/1.1}{0.0.0.0:49349}
[INFO] [Server] Started @10430ms
[INFO] [ContextHandler] Started 
o.s.j.s.ServletContextHandler@379fcbd1{/metrics/json,null,AVAILABLE,@Spark}
[WARN] [YarnSchedulerBackend$YarnSchedulerEndpoint] Attempted to request 
executors before the AM has registered!
[INFO] [DataSource] 
╔════════════════════════════════════════════════════════════╗
║ Init DataSource                                            ║
║ ══════════════════════════════════════════════════════════ ║
║ App name                      shop_live              ║
║ Event window                  None                         ║
║ Event names                   List(purchase, basket-add, wishlist-add, view) ║
║ Min events per user           None                         ║
╚════════════════════════════════════════════════════════════╝

[INFO] [URAlgorithm] 
╔════════════════════════════════════════════════════════════╗
║ Init URAlgorithm                                           ║
║ ══════════════════════════════════════════════════════════ ║
║ App name                      shop_live             ║
║ ES index name                 oburindex                    ║
║ ES type name                  items                        ║
║ RecsModel                     all                          ║
║ Event names                   List(purchase, view)         ║
║ ══════════════════════════════════════════════════════════ ║
║ Random seed                   -1931119310                  ║
║ MaxCorrelatorsPerEventType    50                           ║
║ MaxEventsPerEventType         500                          ║
║ BlacklistEvents               List(purchase)               ║
║ ══════════════════════════════════════════════════════════ ║
║ User bias                     1.0                          ║
║ Item bias                     1.0                          ║
║ Max query events              100                          ║
║ Limit                         20                           ║
║ ══════════════════════════════════════════════════════════ ║
║ Rankings:                                                  ║
║ popular                       Some(popRank)                ║
╚════════════════════════════════════════════════════════════╝

[INFO] [Engine$] EngineWorkflow.train
[INFO] [Engine$] DataSource: com.actionml.DataSource@4953588a
[INFO] [Engine$] Preparator: com.actionml.Preparator@715d8f93
[INFO] [Engine$] AlgorithmList: List(com.actionml.URAlgorithm@50c15628)
[INFO] [Engine$] Data sanity check is on.
[WARN] [ApplicationMaster] Reporter thread fails 1 time(s) in a row.
[WARN] [ApplicationMaster] Reporter thread fails 2 time(s) in a row.
[WARN] [ApplicationMaster] Reporter thread fails 3 time(s) in a row.
[WARN] [ApplicationMaster] Reporter thread fails 4 time(s) in a row.
[INFO] [ServerConnector] Stopped Spark@6a00b5d1{HTTP/1.1}{0.0.0.0:0}
[INFO] [ContextHandler] Stopped 
o.s.j.s.ServletContextHandler@7ef3c37a{/stages/stage/kill,null,UNAVAILABLE,@Spark}
[INFO] [ContextHandler] Stopped 
o.s.j.s.ServletContextHandler@4bfd8ec2{/jobs/job/kill,null,UNAVAILABLE,@Spark}
[INFO] [ContextHandler] Stopped 
o.s.j.s.ServletContextHandler@7a14c082{/api,null,UNAVAILABLE,@Spark}
[INFO] [ContextHandler] Stopped 
o.s.j.s.ServletContextHandler@10e699f8{/,null,UNAVAILABLE,@Spark}
[INFO] [ContextHandler] Stopped 
o.s.j.s.ServletContextHandler@4462fa8{/static,null,UNAVAILABLE,@Spark}
[INFO] [ContextHandler] Stopped 
o.s.j.s.ServletContextHandler@3750c11b{/executors/threadDump/json,null,UNAVAILABLE,@Spark}
[INFO] [ContextHandler] Stopped 
o.s.j.s.ServletContextHandler@65c1fb35{/executors/threadDump,null,UNAVAILABLE,@Spark}
[INFO] [ContextHandler] Stopped 
o.s.j.s.ServletContextHandler@26a6525a{/executors/json,null,UNAVAILABLE,@Spark}
[INFO] [ContextHandler] Stopped 
o.s.j.s.ServletContextHandler@78496d94{/executors,null,UNAVAILABLE,@Spark}
[INFO] [ContextHandler] Stopped 
o.s.j.s.ServletContextHandler@35acfdf3{/environment/json,null,UNAVAILABLE,@Spark}
[INFO] [ContextHandler] Stopped 
o.s.j.s.ServletContextHandler@2f6d8922{/environment,null,UNAVAILABLE,@Spark}
[INFO] [ContextHandler] Stopped 
o.s.j.s.ServletContextHandler@27bf31c6{/storage/rdd/json,null,UNAVAILABLE,@Spark}
[INFO] [ContextHandler] Stopped 
o.s.j.s.ServletContextHandler@175b4e7c{/storage/rdd,null,UNAVAILABLE,@Spark}
[INFO] [ContextHandler] Stopped 
o.s.j.s.ServletContextHandler@468b9a16{/storage/json,null,UNAVAILABLE,@Spark}
[INFO] [ContextHandler] Stopped 
o.s.j.s.ServletContextHandler@46a9ce75{/storage,null,UNAVAILABLE,@Spark}
[INFO] [ContextHandler] Stopped 
o.s.j.s.ServletContextHandler@6b9b69f8{/stages/pool/json,null,UNAVAILABLE,@Spark}
[INFO] [ContextHandler] Stopped 
o.s.j.s.ServletContextHandler@2ea7d76{/stages/pool,null,UNAVAILABLE,@Spark}
[INFO] [ContextHandler] Stopped 
o.s.j.s.ServletContextHandler@550be48{/stages/stage/json,null,UNAVAILABLE,@Spark}
[INFO] [ContextHandler] Stopped 
o.s.j.s.ServletContextHandler@2271fddb{/stages/stage,null,UNAVAILABLE,@Spark}
[INFO] [ContextHandler] Stopped 
o.s.j.s.ServletContextHandler@1a3e64cf{/stages/json,null,UNAVAILABLE,@Spark}
[INFO] [ContextHandler] Stopped 
o.s.j.s.ServletContextHandler@724eae6a{/stages,null,UNAVAILABLE,@Spark}
[INFO] [ContextHandler] Stopped 
o.s.j.s.ServletContextHandler@720aa19c{/jobs/job/json,null,UNAVAILABLE,@Spark}
[INFO] [ContextHandler] Stopped 
o.s.j.s.ServletContextHandler@489e0d2e{/jobs/job,null,UNAVAILABLE,@Spark}
[INFO] [ContextHandler] Stopped 
o.s.j.s.ServletContextHandler@2679cc20{/jobs/json,null,UNAVAILABLE,@Spark}
[INFO] [ContextHandler] Stopped 
o.s.j.s.ServletContextHandler@7a6f5572{/jobs,null,UNAVAILABLE,@Spark}
[ERROR] [LiveListenerBus] SparkListenerBus has already stopped! Dropping event 
SparkListenerStageCompleted(org.apache.spark.scheduler.StageInfo@e1518c9)
[ERROR] [LiveListenerBus] SparkListenerBus has already stopped! Dropping event 
SparkListenerJobEnd(0,1527077245287,JobFailed(org.apache.spark.SparkException: 
Job 0 cancelled because SparkContext was shut down))


Also, in stderr (I believe) there is this:
[Stage 0:>                                                          (0 + 0) / 5]

Yarn app info:
User:
pio
Name:
org.apache.predictionio.workflow.CreateWorkflow
Application Type:
SPARK
Application Tags:

Application Priority:
0 (Higher Integer value indicates higher priority)
YarnApplicationState:
FINISHED
Queue:
default
FinalStatus Reported by AM:
FAILED
Started:
Wed May 23 12:06:44 +0000 2018
Elapsed:
40sec
Tracking URL:
History
Log Aggregation Status:
DISABLED
Diagnostics:
Exception was thrown 5 time(s) from Reporter thread.
Unmanaged Application:
false
Application Node Label expression:
<Not set>
AM container Node Label expression:
<DEFAULT_PARTITION>


Thanks,
Wojciech

From: Wojciech Kowalski
Sent: 23 May 2018 11:26
To: Ambuj Sharma; user@predictionio.apache.org
Subject: RE: Problem with training in yarn cluster

Hi,

Ok, so the full command is now:
pio train --scratch-uri hdfs://pio-cluster-m/pio -- --executor-memory 4g 
--driver-memory 4g --deploy-mode cluster --master yarn

The errors stopped after removing --executor-cores 2 --driver-cores 2.
I found this error: Uncaught exception: 
org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException: Invalid 
resource request, requested virtual cores < 0, or requested virtual cores > max 
configured, requestedVirtualCores=4, maxVirtualCores=2
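That exception means the requested core counts exceeded the cluster's configured maximum (maxVirtualCores=2 above). A hedged sketch of how one might check the limit and stay within it (the config path and core values below are assumptions, not taken from this cluster):

```shell
# Check the scheduler's per-container vcore cap on a cluster node
# (typical Dataproc/Hadoop config location; adjust if yours differs):
grep -A1 'yarn.scheduler.maximum-allocation-vcores' /etc/hadoop/conf/yarn-site.xml

# Then request core counts at or below that cap after the double dash:
pio train --scratch-uri hdfs://pio-cluster-m/pio -- \
  --master yarn --deploy-mode cluster \
  --executor-memory 4g --driver-memory 4g \
  --executor-cores 2 --driver-cores 1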

But now I have a problem with HBase :/

I have the HBase host set: 
declare -x PIO_STORAGE_SOURCES_HBASE_HOSTS="pio-gc"
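For reference, a sketch of the related pio-env.sh entries that normally go together (the host "pio-gc" is from the line above; the port and source type values are the usual defaults and should be checked against your setup):

```shell
# conf/pio-env.sh -- HBase event store source (values are assumptions):
PIO_STORAGE_SOURCES_HBASE_TYPE=hbase
PIO_STORAGE_SOURCES_HBASE_HOSTS=pio-gc
PIO_STORAGE_SOURCES_HBASE_PORTS=2181
```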

[INFO] [Engine$] EngineWorkflow.train
[INFO] [Engine$] DataSource: com.actionml.DataSource@2fdb4e2e
[INFO] [Engine$] Preparator: com.actionml.Preparator@d257dd4
[INFO] [Engine$] AlgorithmList: List(com.actionml.URAlgorithm@400bbb7)
[INFO] [Engine$] Data sanity check is on.
[ERROR] [StorageClient] HBase master is not running (ZooKeeper ensemble: 
pio-cluster-m). Please make sure that HBase is running properly, and that the 
configuration is pointing at the correct ZooKeeper ensemble.
[ERROR] [Storage$] Error initializing storage client for source HBASE.
org.apache.hadoop.hbase.MasterNotRunningException: 
com.google.protobuf.ServiceException: java.net.UnknownHostException: unknown 
host: hbase-master
        at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$StubMaker.makeStub(HConnectionManager.java:1645)
        at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$MasterServiceStubMaker.makeStub(HConnectionManager.java:1671)
        at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getKeepAliveMasterService(HConnectionManager.java:1878)
        at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.isMasterRunning(HConnectionManager.java:894)
        at 
org.apache.hadoop.hbase.client.HBaseAdmin.checkHBaseAvailable(HBaseAdmin.java:2366)
        at 
org.apache.predictionio.data.storage.hbase.StorageClient.<init>(StorageClient.scala:53)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at 
org.apache.predictionio.data.storage.Storage$.getClient(Storage.scala:252)
        at 
org.apache.predictionio.data.storage.Storage$.org$apache$predictionio$data$storage$Storage$$updateS2CM(Storage.scala:283)
        at 
org.apache.predictionio.data.storage.Storage$$anonfun$sourcesToClientMeta$1.apply(Storage.scala:244)
        at 
org.apache.predictionio.data.storage.Storage$$anonfun$sourcesToClientMeta$1.apply(Storage.scala:244)
        at 
scala.collection.mutable.MapLike$class.getOrElseUpdate(MapLike.scala:194)
        at scala.collection.mutable.AbstractMap.getOrElseUpdate(Map.scala:80)
        at 
org.apache.predictionio.data.storage.Storage$.sourcesToClientMeta(Storage.scala:244)
        at 
org.apache.predictionio.data.storage.Storage$.getDataObject(Storage.scala:315)
        at 
org.apache.predictionio.data.storage.Storage$.getPDataObject(Storage.scala:364)
        at 
org.apache.predictionio.data.storage.Storage$.getPDataObject(Storage.scala:307)
        at 
org.apache.predictionio.data.storage.Storage$.getPEvents(Storage.scala:454)
        at 
org.apache.predictionio.data.store.PEventStore$.eventsDb$lzycompute(PEventStore.scala:37)
        at 
org.apache.predictionio.data.store.PEventStore$.eventsDb(PEventStore.scala:37)
        at 
org.apache.predictionio.data.store.PEventStore$.find(PEventStore.scala:73)
        at com.actionml.DataSource.readTraining(DataSource.scala:76)
        at com.actionml.DataSource.readTraining(DataSource.scala:48)
        at 
org.apache.predictionio.controller.PDataSource.readTrainingBase(PDataSource.scala:40)
        at org.apache.predictionio.controller.Engine$.train(Engine.scala:642)
        at org.apache.predictionio.controller.Engine.train(Engine.scala:176)
        at 
org.apache.predictionio.workflow.CoreWorkflow$.runTrain(CoreWorkflow.scala:67)
        at 
org.apache.predictionio.workflow.CreateWorkflow$.main(CreateWorkflow.scala:251)
        at 
org.apache.predictionio.workflow.CreateWorkflow.main(CreateWorkflow.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at 
org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:637)
Caused by: com.google.protobuf.ServiceException: java.net.UnknownHostException: 
unknown host: hbase-master
        at 
org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1678)
        at 
org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1719)
        at 
org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$BlockingStub.isMasterRunning(MasterProtos.java:42561)
        at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$MasterServiceStubMaker.isMasterRunning(HConnectionManager.java:1682)
        at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$StubMaker.makeStubNoRetries(HConnectionManager.java:1591)
        at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$StubMaker.makeStub(HConnectionManager.java:1617)
        ... 36 more
Caused by: java.net.UnknownHostException: unknown host: hbase-master
        at 
org.apache.hadoop.hbase.ipc.RpcClient$Connection.<init>(RpcClient.java:385)
        at 
org.apache.hadoop.hbase.ipc.RpcClient.createConnection(RpcClient.java:351)
        at 
org.apache.hadoop.hbase.ipc.RpcClient.getConnection(RpcClient.java:1530)
        at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1442)
        at 
org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1661)
        ... 41 more
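Note that the client is resolving "hbase-master", which looks like the old Docker hostname, while the log says the ZooKeeper ensemble is pio-cluster-m. A leftover hbase-site.xml on the classpath may still point at the Docker alias. A hedged sketch of the property to check (file location and value are assumptions for this cluster):

```xml
<!-- $HBASE_CONF_DIR/hbase-site.xml: the client should point at the real
     ZooKeeper ensemble, not the old Docker alias "hbase-master" -->
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>pio-cluster-m</value>
</property>
```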



From: Ambuj Sharma
Sent: 23 May 2018 08:59
To: user@predictionio.apache.org
Cc: Wojciech Kowalski
Subject: Re: Problem with training in yarn cluster

Hi Wojciech,
I also faced many problems while setting up YARN with PredictionIO. This may be 
a case where YARN is trying to find the pio.log file on the HDFS cluster. You can 
try "--master yarn --deploy-mode client"; you need to pass this configuration 
with pio train,
e.g., pio train -- --master yarn --deploy-mode client





Thanks and Regards
Ambuj Sharma
Sunrise may late, But Morning is sure.....
Team ML
Betaout

On Wed, May 23, 2018 at 4:53 AM, Pat Ferrel <p...@occamsmachete.com> wrote:
Actually you might search the archives for “yarn” because I don’t recall how 
the setup works off hand.

Archives here: https://lists.apache.org/list.html?user@predictionio.apache.org

Also check the Spark Yarn requirements and remember that `pio train … -- 
various Spark params` allows you to pass arbitrary Spark params exactly as you 
would to spark-submit on the pio command line. The double dash separates PIO 
and Spark params. 
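For example, the split works like this (the memory and executor values here are illustrative, not a recommendation):

```shell
# Everything before "--" is parsed by pio itself; everything after is
# handed to spark-submit unchanged:
pio train -- --master yarn --deploy-mode client \
  --executor-memory 4g --num-executors 2
```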


From: Pat Ferrel <p...@occamsmachete.com>
Reply: user@predictionio.apache.org <user@predictionio.apache.org>
Date: May 22, 2018 at 4:07:38 PM
To: user@predictionio.apache.org <user@predictionio.apache.org>, Wojciech 
Kowalski <wojci...@tomandco.co.uk>

Subject:  RE: Problem with training in yarn cluster 

What is the command line for `pio train …`? Specifically, are you using 
yarn-cluster mode? This causes the driver code, which is a PIO process, to be 
executed on an executor. Special setup is required for this.


From: Wojciech Kowalski <wojci...@tomandco.co.uk>
Reply: user@predictionio.apache.org <user@predictionio.apache.org>
Date: May 22, 2018 at 2:28:43 PM
To: user@predictionio.apache.org <user@predictionio.apache.org>
Subject:  RE: Problem with training in yarn cluster 

Hello,
 
Actually, I have another error in the logs that is preventing training as well:
 
[INFO] [RecommendationEngine$] 
 
               _   _             __  __ _
     /\       | | (_)           |  \/  | |
    /  \   ___| |_ _  ___  _ __ | \  / | |
   / /\ \ / __| __| |/ _ \| '_ \| |\/| | |
  / ____ \ (__| |_| | (_) | | | | |  | | |____
 /_/    \_\___|\__|_|\___/|_| |_|_|  |_|______|
 
 
      
[INFO] [Engine] Extracting datasource params...
[INFO] [WorkflowUtils$] No 'name' is found. Default empty String will be used.
[INFO] [Engine] Datasource params: (,DataSourceParams(shop_live,List(purchase, 
basket-add, wishlist-add, view),None,None))
[INFO] [Engine] Extracting preparator params...
[INFO] [Engine] Preparator params: (,Empty)
[INFO] [Engine] Extracting serving params...
[INFO] [Engine] Serving params: (,Empty)
[INFO] [log] Logging initialized @6774ms
[INFO] [Server] jetty-9.2.z-SNAPSHOT
[INFO] [ContextHandler] Started 
o.s.j.s.ServletContextHandler@1798eb08{/jobs,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started 
o.s.j.s.ServletContextHandler@47c4c3cd{/jobs/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started 
o.s.j.s.ServletContextHandler@3e080dea{/jobs/job,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started 
o.s.j.s.ServletContextHandler@c75847b{/jobs/job/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started 
o.s.j.s.ServletContextHandler@5ce5ee56{/stages,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started 
o.s.j.s.ServletContextHandler@3dde94ac{/stages/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started 
o.s.j.s.ServletContextHandler@4347b9a0{/stages/stage,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started 
o.s.j.s.ServletContextHandler@63b1bbef{/stages/stage/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started 
o.s.j.s.ServletContextHandler@10556e91{/stages/pool,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started 
o.s.j.s.ServletContextHandler@5967f3c3{/stages/pool/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started 
o.s.j.s.ServletContextHandler@2793dbf6{/storage,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started 
o.s.j.s.ServletContextHandler@49936228{/storage/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started 
o.s.j.s.ServletContextHandler@7289bc6d{/storage/rdd,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started 
o.s.j.s.ServletContextHandler@1496b014{/storage/rdd/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started 
o.s.j.s.ServletContextHandler@2de3951b{/environment,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started 
o.s.j.s.ServletContextHandler@7f3330ad{/environment/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started 
o.s.j.s.ServletContextHandler@40e681f2{/executors,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started 
o.s.j.s.ServletContextHandler@61519fea{/executors/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started 
o.s.j.s.ServletContextHandler@502b9596{/executors/threadDump,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started 
o.s.j.s.ServletContextHandler@367b7166{/executors/threadDump/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started 
o.s.j.s.ServletContextHandler@42669f4a{/static,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started 
o.s.j.s.ServletContextHandler@2f25f623{/,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started 
o.s.j.s.ServletContextHandler@23ae4174{/api,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started 
o.s.j.s.ServletContextHandler@4e33e426{/jobs/job/kill,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started 
o.s.j.s.ServletContextHandler@38d9ae65{/stages/stage/kill,null,AVAILABLE,@Spark}
[INFO] [ServerConnector] Started Spark@17239b3{HTTP/1.1}{0.0.0.0:47948}
[INFO] [Server] Started @7040ms
[INFO] [ContextHandler] Started 
o.s.j.s.ServletContextHandler@16cffbe4{/metrics/json,null,AVAILABLE,@Spark}
[WARN] [YarnSchedulerBackend$YarnSchedulerEndpoint] Attempted to request 
executors before the AM has registered!
[ERROR] [ApplicationMaster] Uncaught exception: 
 
Thanks,
Wojciech
 
From: Wojciech Kowalski
Sent: 22 May 2018 23:20
To: user@predictionio.apache.org
Subject: Problem with training in yarn cluster
 
Hello, I am trying to set up a distributed cluster with all services separated, but 
I have a problem while running training:
 
log4j:ERROR setFile(null,true) call failed.
java.io.FileNotFoundException: /pio/pio.log (No such file or directory)
        at java.io.FileOutputStream.open0(Native Method)
        at java.io.FileOutputStream.open(FileOutputStream.java:270)
        at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
        at java.io.FileOutputStream.<init>(FileOutputStream.java:133)
        at org.apache.log4j.FileAppender.setFile(FileAppender.java:294)
        at org.apache.log4j.FileAppender.activateOptions(FileAppender.java:165)
        at 
org.apache.log4j.config.PropertySetter.activate(PropertySetter.java:307)
        at 
org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:172)
        at 
org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:104)
        at 
org.apache.log4j.PropertyConfigurator.parseAppender(PropertyConfigurator.java:842)
        at 
org.apache.log4j.PropertyConfigurator.parseCategory(PropertyConfigurator.java:768)
        at 
org.apache.log4j.PropertyConfigurator.configureRootCategory(PropertyConfigurator.java:648)
        at 
org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:514)
        at 
org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:580)
        at 
org.apache.log4j.helpers.OptionConverter.selectAndConfigure(OptionConverter.java:526)
        at org.apache.log4j.LogManager.<clinit>(LogManager.java:127)
        at 
org.apache.spark.internal.Logging$class.initializeLogging(Logging.scala:117)
        at 
org.apache.spark.internal.Logging$class.initializeLogIfNecessary(Logging.scala:102)
        at 
org.apache.spark.deploy.yarn.ApplicationMaster$.initializeLogIfNecessary(ApplicationMaster.scala:738)
        at org.apache.spark.internal.Logging$class.log(Logging.scala:46)
        at 
org.apache.spark.deploy.yarn.ApplicationMaster$.log(ApplicationMaster.scala:738)
        at 
org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:753)
        at 
org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)
 
 
Setup:
HBase
Hadoop
HDFS
Spark cluster with YARN
 
Training is in cluster mode.
I assume the Spark worker is trying to save the log to /pio/pio.log on the worker 
machine instead of the PIO host. How can I point the PIO log destination at an 
HDFS path?
 
Or any other advice?
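One possible angle: log4j's FileAppender can only write to a local filesystem path, so pointing it at HDFS won't work directly. A common workaround is to log to the console so YARN captures the output in the container logs. A sketch, assuming PIO's log4j config lives at the usual $PIO_HOME/conf/log4j.properties (location and pattern are assumptions):

```properties
# Log to stderr instead of /pio/pio.log so output lands in YARN container logs
log4j.rootLogger=INFO, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.Target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d %-5p %c{1} - %m%n
```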
 
Thanks,
Wojciech
 