[ https://issues.apache.org/jira/browse/OOZIE-3053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344794#comment-16344794 ]

Peter Cseh commented on OOZIE-3053:
-----------------------------------

Hey!
I'd recommend using the blacklist feature implemented in OOZIE-2811 to remove
the classpath properties from the Spark action, so that it ends up with a
proper classpath.
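
For reference, a minimal oozie-site.xml sketch of that approach might look like
the following. The property name is taken from OOZIE-2811 and the values from the
spark-defaults.conf quoted below; please verify both against your Oozie release:
{code:xml}
<!-- Keep loading /etc/spark/conf via SparkConfigurationService, but skip
     the two properties that blank out the Hadoop classpath.
     Property name per OOZIE-2811; check it against your Oozie version. -->
<property>
  <name>oozie.service.SparkConfigurationService.spark.configurations.blacklist</name>
  <value>spark.hadoop.mapreduce.application.classpath,spark.hadoop.yarn.application.classpath</value>
</property>
{code}
With those properties filtered out, the launcher falls back to the cluster's
normal YARN/MR application classpath instead of an empty one.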

> Oozie does not honor classpath.txt when running Spark action
> ------------------------------------------------------------
>
>                 Key: OOZIE-3053
>                 URL: https://issues.apache.org/jira/browse/OOZIE-3053
>             Project: Oozie
>          Issue Type: Bug
>          Components: action
>    Affects Versions: 4.2.0, 4.3.0, 5.0.0
>         Environment: # Cloudera Quick Start VM 5.12.1
> # [CDH Oozie 
> 4.1|http://archive.cloudera.com/cdh5/cdh/5/oozie-4.1.0-cdh5.12.1.releasenotes.html]
>  - actually it includes all 4.2, 4.3, 5.0 patches applied
> # Spark 1.6, Spark 2.2
> # /etc/oozie/oozie-site.xml
> ... added
> {code:xml}
> <property>
>   <name>oozie.service.SparkConfigurationService.spark.configurations</name>
>   <value>*=/etc/spark/conf</value>
> </property>
> {code}
> # /etc/spark/conf/spark-defaults.conf
> ... added
> {code}
> spark.hadoop.mapreduce.application.classpath=
> spark.hadoop.yarn.application.classpath=
> {code}
> # workflow.xml
> {code:xml}
> <workflow-app name="My Workflow" xmlns="uri:oozie:workflow:0.5">
>     <start to="spark-0ff5"/>
>     <kill name="Kill">
>         <message>Action failed, error 
> message[${wf:errorMessage(wf:lastErrorNode())}]</message>
>     </kill>
>     <action name="spark-0ff5">
>         <spark xmlns="uri:oozie:spark-action:0.2">
>             <job-tracker>${jobTracker}</job-tracker>
>             <name-node>${nameNode}</name-node>
>             <master>yarn</master>
>             <mode>cluster</mode>
>             <name>MySpark</name>
>               <class>org.apache.oozie.example.SparkFileCopy</class>
>             <jar>oozie-examples-4.3.0.jar</jar>
>               <arg>/user/cloudera/spark-oozie/input</arg>
>               <arg>/user/cloudera/spark-oozie/output</arg>
>             
> <file>/user/cloudera/spark-oozie-examples/oozie-examples-4.3.0.jar#oozie-examples-4.3.0.jar</file>
>         </spark>
>         <ok to="End"/>
>         <error to="Kill"/>
>     </action>
>     <end name="End"/>
> </workflow-app>
> {code}
>            Reporter: Sergey Zhemzhitsky
>            Priority: Major
>              Labels: oozie, spark, spark2.2
>         Attachments: spark-empty-cp.stderr, spark-empty-cp.stdout, 
> spark-nonempty-cp.stderr, spark-nonempty-cp.stdout
>
>
> Currently CDH distribution of Hadoop configures Spark2 with empty 
> *spark.hadoop.mapreduce.application.classpath*, 
> *spark.hadoop.yarn.application.classpath* properties in 
> */etc/spark2/conf/spark-defaults.properties*
> {code}
> spark.hadoop.mapreduce.application.classpath=
> spark.hadoop.yarn.application.classpath=
> {code}
> The motivation for such a configuration is described in Spark2 Parcel 
> installation scripts 
> (_SPARK2\_ON\_YARN\-2.2.0.cloudera1.jar!/scripts/common.sh_)
> {code}
> # Override the YARN / MR classpath configs since we already include them when 
> generating
> # SPARK_DIST_CLASSPATH. This avoids having the same paths added to the 
> classpath a second
> # time and wasting file descriptors.
> replace_spark_conf "spark.hadoop.mapreduce.application.classpath" "" 
> "$SPARK_DEFAULTS"
> replace_spark_conf "spark.hadoop.yarn.application.classpath" "" 
> "$SPARK_DEFAULTS"
> {code}
> So when configuring Oozie to run with spark-defaults.properties from Spark2, 
> Oozie's actions usually throw an exception
> {code}
> Exception in thread "main" java.lang.NoClassDefFoundError: 
> org/apache/hadoop/conf/Configuration
>       at java.lang.Class.getDeclaredMethods0(Native Method)
>       at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
>       at java.lang.Class.privateGetMethodRecursive(Class.java:3048)
>       at java.lang.Class.getMethod0(Class.java:3018)
>       at java.lang.Class.getMethod(Class.java:1784)
>       at 
> sun.launcher.LauncherHelper.validateMainClass(LauncherHelper.java:544)
>       at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:526)
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.hadoop.conf.Configuration
>       at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>       at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>       at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
>       at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> {code}
> The same exception is also thrown in case of Spark 1.6 if there are empty 
> *spark.hadoop.mapreduce.application.classpath*, 
> *spark.hadoop.yarn.application.classpath* in 
> */etc/spark/conf/spark-defaults.properties*
> Probably OOZIE-2547 is the cause of this issue.
> Oozie Launcher logs with empty 
> *spark.hadoop.mapreduce.application.classpath*, 
> *spark.hadoop.yarn.application.classpath* which cause the job to fail are in: 
> [^spark-empty-cp.stderr] and [^spark-empty-cp.stdout]
> Oozie Launcher logs without *spark.hadoop.mapreduce.application.classpath*, 
> *spark.hadoop.yarn.application.classpath* are in: [^spark-nonempty-cp.stderr] 
> and [^spark-nonempty-cp.stdout]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
