Actually, you might search the archives for “yarn” because I don’t recall
how the setup works offhand.

Archives here:
https://lists.apache.org/[email protected]

Also check the Spark-on-YARN requirements, and remember that `pio train … --
<spark params>` lets you pass arbitrary Spark parameters on the pio command
line exactly as you would pass them to spark-submit. The double dash
separates PIO params from Spark params.
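
For example, something like this (a sketch only; the master URL and memory
settings are illustrative, not taken from this thread):

  pio train -- --master yarn --deploy-mode cluster --driver-memory 4g --executor-memory 4g

Everything after the lone `--` is handed to spark-submit unchanged.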


From: Pat Ferrel <[email protected]>
Reply: [email protected]
Date: May 22, 2018 at 4:07:38 PM
To: [email protected], Wojciech Kowalski <[email protected]>
Subject: RE: Problem with training in yarn cluster

What is the command line for `pio train …`? Specifically, are you using
yarn-cluster mode? That causes the driver code, which is a PIO process, to
be executed on the cluster (inside the YARN ApplicationMaster) rather than
on the machine where you invoked `pio`. Special setup is required for this.
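
For comparison, a minimal sketch of the two deploy modes as they would
appear after the double dash (standard spark-submit flags, not taken from
your command):

  # driver runs inside the YARN ApplicationMaster on the cluster
  pio train -- --master yarn --deploy-mode cluster

  # driver runs in the pio process on the machine you invoke it from
  pio train -- --master yarn --deploy-mode client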


From: Wojciech Kowalski <[email protected]>
Reply: [email protected]
Date: May 22, 2018 at 2:28:43 PM
To: [email protected]
Subject: RE: Problem with training in yarn cluster

Hello,

Actually, I have another error in the logs that is preventing training as
well:

[INFO] [RecommendationEngine$]

               _   _             __  __ _
     /\       | | (_)           |  \/  | |
    /  \   ___| |_ _  ___  _ __ | \  / | |
   / /\ \ / __| __| |/ _ \| '_ \| |\/| | |
  / ____ \ (__| |_| | (_) | | | | |  | | |____
 /_/    \_\___|\__|_|\___/|_| |_|_|  |_|______|

[INFO] [Engine] Extracting datasource params...
[INFO] [WorkflowUtils$] No 'name' is found. Default empty String will be used.
[INFO] [Engine] Datasource params: (,DataSourceParams(shop_live,List(purchase, basket-add, wishlist-add, view),None,None))
[INFO] [Engine] Extracting preparator params...
[INFO] [Engine] Preparator params: (,Empty)
[INFO] [Engine] Extracting serving params...
[INFO] [Engine] Serving params: (,Empty)
[INFO] [log] Logging initialized @6774ms
[INFO] [Server] jetty-9.2.z-SNAPSHOT
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@1798eb08{/jobs,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@47c4c3cd{/jobs/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@3e080dea{/jobs/job,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@c75847b{/jobs/job/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@5ce5ee56{/stages,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@3dde94ac{/stages/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@4347b9a0{/stages/stage,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@63b1bbef{/stages/stage/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@10556e91{/stages/pool,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@5967f3c3{/stages/pool/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@2793dbf6{/storage,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@49936228{/storage/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@7289bc6d{/storage/rdd,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@1496b014{/storage/rdd/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@2de3951b{/environment,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@7f3330ad{/environment/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@40e681f2{/executors,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@61519fea{/executors/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@502b9596{/executors/threadDump,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@367b7166{/executors/threadDump/json,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@42669f4a{/static,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@2f25f623{/,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@23ae4174{/api,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@4e33e426{/jobs/job/kill,null,AVAILABLE,@Spark}
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@38d9ae65{/stages/stage/kill,null,AVAILABLE,@Spark}
[INFO] [ServerConnector] Started Spark@17239b3{HTTP/1.1}{0.0.0.0:47948}
[INFO] [Server] Started @7040ms
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@16cffbe4{/metrics/json,null,AVAILABLE,@Spark}
[WARN] [YarnSchedulerBackend$YarnSchedulerEndpoint] Attempted to request executors before the AM has registered!
[ERROR] [ApplicationMaster] Uncaught exception:

Thanks,

Wojciech



From: Wojciech Kowalski <[email protected]>
Sent: 22 May 2018 23:20
To: [email protected]
Subject: Problem with training in yarn cluster



Hello, I am trying to set up a distributed cluster with all services
separated, but I have a problem while running training:



log4j:ERROR setFile(null,true) call failed.
java.io.FileNotFoundException: /pio/pio.log (No such file or directory)
        at java.io.FileOutputStream.open0(Native Method)
        at java.io.FileOutputStream.open(FileOutputStream.java:270)
        at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
        at java.io.FileOutputStream.<init>(FileOutputStream.java:133)
        at org.apache.log4j.FileAppender.setFile(FileAppender.java:294)
        at org.apache.log4j.FileAppender.activateOptions(FileAppender.java:165)
        at org.apache.log4j.config.PropertySetter.activate(PropertySetter.java:307)
        at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:172)
        at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:104)
        at org.apache.log4j.PropertyConfigurator.parseAppender(PropertyConfigurator.java:842)
        at org.apache.log4j.PropertyConfigurator.parseCategory(PropertyConfigurator.java:768)
        at org.apache.log4j.PropertyConfigurator.configureRootCategory(PropertyConfigurator.java:648)
        at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:514)
        at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:580)
        at org.apache.log4j.helpers.OptionConverter.selectAndConfigure(OptionConverter.java:526)
        at org.apache.log4j.LogManager.<clinit>(LogManager.java:127)
        at org.apache.spark.internal.Logging$class.initializeLogging(Logging.scala:117)
        at org.apache.spark.internal.Logging$class.initializeLogIfNecessary(Logging.scala:102)
        at org.apache.spark.deploy.yarn.ApplicationMaster$.initializeLogIfNecessary(ApplicationMaster.scala:738)
        at org.apache.spark.internal.Logging$class.log(Logging.scala:46)
        at org.apache.spark.deploy.yarn.ApplicationMaster$.log(ApplicationMaster.scala:738)
        at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:753)
        at org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)





Setup:
HBase
Hadoop
HDFS
Spark cluster with YARN

Training in cluster mode.

I assume the Spark worker is trying to save the log to /pio/pio.log on the
worker machine instead of on the PIO host. How can I point the PIO log
destination to an HDFS path?
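
If that is right, one workaround I am considering (based on Spark’s
running-on-YARN docs, which suggest shipping a custom log4j.properties via
--files so each container reads it from its working directory; untested,
and the local path below is hypothetical):

  # basename must stay log4j.properties so the YARN container picks it up
  pio train -- --master yarn --deploy-mode cluster --files /path/to/custom/log4j.properties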



Or any other advice?



Thanks,

Wojciech
