Hi Nitin,

There are a few problems with your workflow’s configuration. First,
oozie.service.SparkConfigurationService.spark.configurations and
oozie.service.WorkflowAppService.system.libpath are Oozie server
configuration properties; they go in oozie-site.xml, not in your workflow.

Second, oozie.service.SparkConfigurationService.spark.configurations should
point to a directory containing a spark-defaults.conf file.  I suggest you
take a look at https://oozie.apache.org/docs/4.2.0/oozie-default.xml for the
expected format and default value.
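
For reference, the oozie-site.xml entry would look something like the sketch
below.  The value is a comma-separated list of AUTHORITY=SPARK_CONF_DIR
pairs, where '*' matches any ResourceManager; the /etc/spark/conf path here
is just an example, so point it at wherever your spark-defaults.conf
actually lives:

    <!-- oozie-site.xml on the Oozie server (example path; adjust for your cluster) -->
    <property>
        <name>oozie.service.SparkConfigurationService.spark.configurations</name>
        <value>*=/etc/spark/conf</value>
    </property>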

Also, oozie.service.WorkflowAppService.system.libpath should not point to a
subdirectory of the sharelib.  You should probably leave it at its default
value of /user/${user.name}/share/lib.  When you use the Spark action,
Oozie knows to include the spark subdirectory automatically.
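
In other words, if you do set it explicitly, it should match the default
from oozie-default.xml:

    <property>
        <name>oozie.service.WorkflowAppService.system.libpath</name>
        <value>/user/${user.name}/share/lib</value>
    </property>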

Finally, oozie.use.system.libpath=true is a job-level property.  It should
be set in your job.properties, not in the workflow’s <configuration>
element.
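
For example, a minimal job.properties might look like this (the NameNode
host is taken from your workflow; the ResourceManager port and the
application path are placeholders for whatever your cluster actually uses):

    # job.properties (example values; adjust for your cluster)
    nameNode=hdfs://node1.analytics.subex:8020
    jobTracker=node1.analytics.subex:8050
    oozie.use.system.libpath=true
    oozie.wf.application.path=${nameNode}/user/${user.name}/apps/sparkjob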

- Robert

On Tue, Oct 6, 2015 at 4:22 PM, Laurent H <[email protected]> wrote:

> The master in your XML can only be yarn-cluster or local[x], where x is
> the number of executors, no?
>
> --
> Laurent HATIER - Big Data & Business Intelligence Consultant at CapGemini
> fr.linkedin.com/pub/laurent-hatier/25/36b/a86/
>
> 2015-10-06 9:25 GMT+02:00 Nitin Kumar <[email protected]>:
>
> > Hi,
> >
> > I am running a 3-node cluster (HDP 2.3, installed using Ambari 2.1.1).
> > I have been trying to run a spark job that runs a word count program
> > using the spark action.
> >
> > The program runs fine when master is set to local but runs into errors
> > when set to yarn-cluster or yarn-client.
> >
> > My workflow is as follows
> >
> >
> > <?xml version="1.0" encoding="UTF-8"?>
> > <workflow-app xmlns='uri:oozie:workflow:0.4' name='sparkjob'>
> >     <start to='spark-process' />
> >     <action name='spark-process'>
> >         <spark xmlns='uri:oozie:spark-action:0.1'>
> >         <job-tracker>${jobTracker}</job-tracker>
> >         <name-node>${nameNode}</name-node>
> >         <configuration>
> >             <property>
> >                 <name>oozie.service.SparkConfigurationService.spark.configurations</name>
> >                 <value>spark.eventLog.dir=hdfs://node1.analytics.subex:8020/user/spark/applicationHistory,spark.yarn.historyServer.address=http://node1.analytics.subex:18088,spark.eventLog.enabled=true</value>
> >             </property>
> >             <!--property>
> >                 <name>oozie.use.system.libpath</name>
> >                 <value>true</value>
> >             </property>
> >             <property>
> >                 <name>oozie.service.WorkflowAppService.system.libpath</name>
> >                 <value>/user/oozie/share/lib/lib_20150831190253/spark</value>
> >             </property-->
> >         </configuration>
> >         <master>yarn-client</master>
> >         <mode>client</mode>
> >         <name>Word Count</name>
> >         <jar>/usr/hdp/current/spark-client/AnalyticsJar/wordcount.py</jar>
> >         <spark-opts>--executor-memory 1G --driver-memory 1G
> > --executor-cores 4 --num-executors 2 --jars
> > /usr/hdp/current/spark-client/lib/spark-assembly-1.3.1.2.3.0.0-2557-hadoop2.7.1.2.3.0.0-2557.jar</spark-opts>
> >         </spark>
> >         <ok to='end'/>
> >         <error to='spark-fail'/>
> >     </action>
> >     <kill name='spark-fail'>
> >         <message>Spark job failed, error
> > message[${wf:errorMessage(wf:lastErrorNode())}]</message>
> >     </kill>
> >
> >     <end name='end' />
> > </workflow-app>
> >
> >
> > I get the following error:
> >
> > Traceback (most recent call last):
> >   File "/usr/hdp/current/spark-client/AnalyticsJar/wordcount.py", line 26, in <module>
> >     sc = SparkContext(conf=conf)
> >   File "/hadoop/yarn/local/filecache/251/spark-core_2.10-1.1.0.jar/pyspark/context.py", line 107, in __init__
> >   File "/hadoop/yarn/local/filecache/251/spark-core_2.10-1.1.0.jar/pyspark/context.py", line 155, in _do_init
> >   File "/hadoop/yarn/local/filecache/251/spark-core_2.10-1.1.0.jar/pyspark/context.py", line 201, in _initialize_context
> >   File "/hadoop/yarn/local/filecache/251/spark-core_2.10-1.1.0.jar/py4j/java_gateway.py", line 701, in __call__
> >   File "/hadoop/yarn/local/filecache/251/spark-core_2.10-1.1.0.jar/py4j/protocol.py", line 300, in get_return_value
> > py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
> > : org.apache.spark.SparkException: YARN mode not available ?
> >         at org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:1586)
> >         at org.apache.spark.SparkContext.<init>(SparkContext.scala:310)
> >         at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:53)
> >         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> >         at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> >         at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> >         at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
> >         at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:234)
> >         at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
> >         at py4j.Gateway.invoke(Gateway.java:214)
> >         at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:79)
> >         at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:68)
> >         at py4j.GatewayConnection.run(GatewayConnection.java:207)
> >         at java.lang.Thread.run(Thread.java:745)
> > Caused by: java.lang.ClassNotFoundException: org.apache.spark.scheduler.cluster.YarnClientClusterScheduler
> >         at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
> >         at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
> >         at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
> >         at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> >         at java.lang.Class.forName0(Native Method)
> >         at java.lang.Class.forName(Class.java:264)
> >         at org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:1580)
> >         ... 13 more
> >
> >
> >
> > The steps I have taken
> >
> > 1. Copied the jars in spark-client/lib directory to
> > /user/oozie/share/lib/spark followed by a restart of the spark service
> > 2. Passed the assembly jar within <spark-opts></spark-opts> (see
> > workflow)
> > 3. Tried setting oozie.service.WorkflowAppService.system.libpath to the
> > jars in the share lib directory
> >
> >
> > It seems that spark is not getting the right jars for deploying the job
> > in yarn even though I have tried to make the jars available to the
> > workflow. While scanning through the detailed logs, I have also noticed
> > that the assembly jar is present in the yarn application folder and also
> > present in oozie classpath.
> >
> > Is there some configuration that I'm missing? Would appreciate any help.
> >
> >
> > Regards,
> > Nitin
> >
>
