To add more detail to this: when I attempt to execute my training job
using the command 'pio train -- --master yarn', I get the exception that
I've included below. Can anyone tell me how to correctly submit the
training job, or what setting I need to change to make this work? I've
made no custom code changes and am simply using PIO 0.12.1 with the
SimilarProduct Recommender.
[ERROR] [SparkContext] Error initializing SparkContext.
[INFO] [ServerConnector] Stopped Spark@1f992a3a{HTTP/1.1}{0.0.0.0:4040}
[WARN] [YarnSchedulerBackend$YarnSchedulerEndpoint] Attempted to request executors before the AM has registered!
[WARN] [MetricsSystem] Stopping a MetricsSystem that is not running
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 1
        at org.apache.spark.deploy.yarn.YarnSparkHadoopUtil$$anonfun$setEnvFromInputString$1.apply(YarnSparkHadoopUtil.scala:154)
        at org.apache.spark.deploy.yarn.YarnSparkHadoopUtil$$anonfun$setEnvFromInputString$1.apply(YarnSparkHadoopUtil.scala:152)
        at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
        at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
        at org.apache.spark.deploy.yarn.YarnSparkHadoopUtil$.setEnvFromInputString(YarnSparkHadoopUtil.scala:152)
        at org.apache.spark.deploy.yarn.Client$$anonfun$setupLaunchEnv$6.apply(Client.scala:819)
        at org.apache.spark.deploy.yarn.Client$$anonfun$setupLaunchEnv$6.apply(Client.scala:817)
        at scala.Option.foreach(Option.scala:257)
        at org.apache.spark.deploy.yarn.Client.setupLaunchEnv(Client.scala:817)
        at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:911)
        at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:172)
        at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:56)
        at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:156)
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:509)
        at org.apache.predictionio.workflow.WorkflowContext$.apply(WorkflowContext.scala:45)
        at org.apache.predictionio.workflow.CoreWorkflow$.runTrain(CoreWorkflow.scala:59)
        at org.apache.predictionio.workflow.CreateWorkflow$.main(CreateWorkflow.scala:251)
        at org.apache.predictionio.workflow.CreateWorkflow.main(CreateWorkflow.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:751)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
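
Digging into the trace a bit: the failing frames run from
Client.setupLaunchEnv (Client.scala:817) into setEnvFromInputString, and
if I'm reading the Spark 2.1 source correctly, that path parses a
comma-separated list of KEY=VALUE pairs taken from the
SPARK_YARN_USER_ENV environment variable; an entry with no '=' makes the
parser index a field that isn't there, which would match the
ArrayIndexOutOfBoundsException: 1 above. Something I plan to check (the
grep locations are guesses at where an Ambari-managed setup might export
the variable):

####
# Every comma-separated entry must look like KEY=VALUE.
echo "$SPARK_YARN_USER_ENV"
grep -R "SPARK_YARN_USER_ENV" /usr/hdp/2.6.2.14-5/spark2/conf \
  /etc/profile.d 2>/dev/null

# If an entry is malformed, fix it to KEY=VALUE form (or unset the
# variable) and retry the YARN submission:
unset SPARK_YARN_USER_ENV
pio train -- --master yarn
####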
On Tue, May 29, 2018 at 12:01 AM, Miller, Clifford <
[email protected]> wrote:
> So updating the version in the RELEASE file to 2.1.1 fixed the version
> detection problem, but I'm still not able to submit Spark jobs unless
> they are strictly local. How are you submitting to the HDP Spark?
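>
> For reference, my understanding is that everything after the bare '--'
> in 'pio train' is passed straight through to spark-submit, so a YARN
> submission would look roughly like the sketch below. The hdp.version
> java options are an assumption based on general Spark-on-HDP guidance,
> not something confirmed in this thread:
>
> ####
> # Everything after '--' goes to spark-submit unchanged.
> pio train -- --master yarn --deploy-mode client \
>   --conf spark.driver.extraJavaOptions=-Dhdp.version=2.6.2.14-5 \
>   --conf spark.yarn.am.extraJavaOptions=-Dhdp.version=2.6.2.14-5
> ####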
>
> Thanks,
>
> --Cliff.
>
>
>
> On Mon, May 28, 2018 at 1:12 AM, suyash kharade <[email protected]>
> wrote:
>
>> Hi Miller,
>> I faced the same issue. The error occurs because the RELEASE file has
>> a '-' in its version string. Put a simple version in the RELEASE file,
>> something like 2.6.
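>>
>> A sketch of that edit, using the paths from this thread (assumption:
>> the vendor version string appears verbatim in RELEASE and is the only
>> thing semver.sh reads from it):
>>
>> ####
>> # Back up the vendor RELEASE file, then strip the HDP build suffix so
>> # semver.sh sees a plain x.y.z version:
>> cd /usr/hdp/2.6.2.14-5/spark2
>> sudo cp RELEASE RELEASE.bak
>> sudo sed -i 's/2\.1\.1\.2\.6\.2\.14-5/2.1.1/' RELEASE
>> ####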
>>
>> On Mon, May 28, 2018 at 4:32 AM, Miller, Clifford <
>> [email protected]> wrote:
>>
>>> I've installed an HDP cluster with HBase and Spark on YARN. As part
>>> of that installation I created some HDP (Ambari) managed clients. I
>>> installed PIO on one of these clients and configured PIO to use the
>>> HDP-installed Hadoop, HBase, and Spark. When I run the command 'pio
>>> eventserver &', I get the following error:
>>>
>>> ####
>>> /home/centos/PredictionIO-0.12.1/bin/semver.sh: line 89: [: 2.2.6.2.14-5: integer expression expected
>>> /home/centos/PredictionIO-0.12.1/bin/semver.sh: line 93: [[: 2.2.6.2.14-5: syntax error: invalid arithmetic operator (error token is ".2.6.2.14-5")
>>> /home/centos/PredictionIO-0.12.1/bin/semver.sh: line 97: [[: 2.2.6.2.14-5: syntax error: invalid arithmetic operator (error token is ".2.6.2.14-5")
>>> You have Apache Spark 2.1.1.2.6.2.14-5 at /usr/hdp/2.6.2.14-5/spark2/ which does not meet the minimum version requirement of 1.3.0. Aborting.
>>> ####
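>>>
>>> (The messages above are what bash's integer test operators emit when
>>> handed the HDP version string; a minimal illustration, on the
>>> assumption that semver.sh compares version fields with tests of this
>>> shape:)
>>>
>>> ####
>>> # bash's -ge accepts only plain integers, so the vendor suffix fails:
>>> $ v="2.2.6.2.14-5"; [ "$v" -ge 1 ] && echo ok
>>> bash: [: 2.2.6.2.14-5: integer expression expected
>>> ####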
>>>
>>> If I then go to /usr/hdp/2.6.2.14-5/spark2/ and replace the RELEASE
>>> file with an empty file, I can then start the Eventserver, which
>>> gives me the following message:
>>>
>>> ####
>>> /usr/hdp/2.6.2.14-5/spark2/ contains an empty RELEASE file. This is a known problem with certain vendors (e.g. Cloudera). Please make sure you are using at least 1.3.0.
>>> [INFO] [Management$] Creating Event Server at 0.0.0.0:7070
>>> [WARN] [DomainSocketFactory] The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
>>> [INFO] [HttpListener] Bound to /0.0.0.0:7070
>>> [INFO] [EventServerActor] Bound received. EventServer is ready.
>>> ####
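>>>
>>> (For reference, an event post to the running Eventserver looks like
>>> the sketch below; ACCESS_KEY is a placeholder for the key created
>>> with 'pio app new', and the 'view' event follows the SimilarProduct
>>> template's quickstart:)
>>>
>>> ####
>>> curl -i -X POST \
>>>   "http://localhost:7070/events.json?accessKey=$ACCESS_KEY" \
>>>   -H "Content-Type: application/json" \
>>>   -d '{
>>>     "event": "view",
>>>     "entityType": "user",
>>>     "entityId": "u1",
>>>     "targetEntityType": "item",
>>>     "targetEntityId": "i1"
>>>   }'
>>> ####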
>>>
>>> I can then send events to the Eventserver. After sending the events
>>> listed in the SimilarProduct Recommender example, however, I am
>>> unable to train using the cluster. If I use 'pio train', it trains
>>> successfully in local mode, but if I attempt to use the command "pio
>>> train -- --master yarn" I get the following:
>>>
>>> #######
>>> Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 1
>>> [... same stack trace as the one quoted at the top of this thread ...]
>>> #######
>>>
>>> What is the correct way to get PIO to use the YARN-based Spark for
>>> training?
>>>
>>> Thanks,
>>>
>>> --Cliff.
>>
>> --
>> Regards,
>> Suyash K