[https://issues.apache.org/jira/browse/OOZIE-3159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16326347#comment-16326347]
Attila Sasvari commented on OOZIE-3159:
---------------------------------------
[~satishsaley], [~andras.piros], [~gezapeti] can you take a look at the
attached patch?
Testing done:
- {{mvn clean install assembly:single -DskipTests -Denforcer.skip=true -Dcheckstyle.skip=true -Dfindbugs.skip=true -DtargetJavaVersion=1.8 -DjavaVersion=1.8 -Dhadoop.version=2.6.0 -Puber}}
- configured Oozie so that it talks to a pseudo-distributed Hadoop 2.6 cluster.
- modified {{examples/apps/spark/workflow.xml}} so that it also includes {{<mode>}}:
{code:xml}
<workflow-app xmlns='uri:oozie:workflow:0.5' name='SparkFileCopy'>
    <start to='spark-node' />
    <action name='spark-node'>
        <spark xmlns="uri:oozie:spark-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <prepare>
                <delete path="${nameNode}/user/${wf:user()}/${examplesRoot}/output-data/spark"/>
            </prepare>
            <master>${master}</master>
            <mode>${mode}</mode>
            <name>Spark-FileCopy</name>
            <class>org.apache.oozie.example.SparkFileCopy</class>
            <jar>${nameNode}/user/${wf:user()}/${examplesRoot}/apps/spark/lib/oozie-examples.jar</jar>
            <arg>${nameNode}/user/${wf:user()}/${examplesRoot}/input-data/text/data.txt</arg>
            <arg>${nameNode}/user/${wf:user()}/${examplesRoot}/output-data/spark</arg>
        </spark>
        <ok to="end" />
        <error to="fail" />
    </action>
    <kill name="fail">
        <message>Workflow failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name='end' />
</workflow-app>
{code}
- CLUSTER mode with YARN: {{bin/oozie job -oozie http://localhost:11000/oozie -config examples/apps/pyspark/job.properties -run -DnameNode=hdfs://localhost:9000 -DjobTracker=localhost:8032 -Dmaster=yarn -Dmode=cluster}}, workflow succeeded (a {{job.properties}} sketch for these runs follows the note below)
- CLIENT mode with YARN: {{bin/oozie job -oozie http://localhost:11000/oozie -config examples/apps/pyspark/job.properties -run -DnameNode=hdfs://localhost:9000 -DjobTracker=localhost:8032 -Dmaster=yarn -Dmode=client}}, workflow succeeded
- executed all example workflows except the Hive-related ones; pyspark failed at first because I had not uploaded the required dependencies to the Spark sharelib. After I set it up correctly (roughly as sketched below), the workflow succeeded.
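For the pyspark case, setting it up amounted to uploading the Python libraries next to the Spark sharelib jars; roughly as follows (the {{lib_<timestamp>}} directory and the zip file names are placeholders for the actual sharelib timestamp and versions, and the default sharelib location is assumed):
{code}
# upload the pyspark Python libraries into the Spark sharelib (placeholders below)
hdfs dfs -put $SPARK_HOME/python/lib/pyspark.zip /user/oozie/share/lib/lib_<timestamp>/spark/
hdfs dfs -put $SPARK_HOME/python/lib/py4j-<version>-src.zip /user/oozie/share/lib/lib_<timestamp>/spark/
{code}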
Note: if you do not specify {{mode}} in the Spark workflow when {{master}} is set to {{yarn}}, the executor will fail.
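For reference, a minimal {{job.properties}} along the lines these runs assume (host/port values are taken from the commands above; {{master}} and {{mode}} were overridden with {{-D}} flags, and the application path is an assumption based on the standard examples layout):
{code}
nameNode=hdfs://localhost:9000
jobTracker=localhost:8032
master=yarn
mode=cluster
examplesRoot=examples
oozie.use.system.libpath=true
oozie.wf.application.path=${nameNode}/user/${user.name}/${examplesRoot}/apps/spark
{code}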
> Spark Action fails because of absence of hadoop mapreduce jar(s)
> ----------------------------------------------------------------
>
> Key: OOZIE-3159
> URL: https://issues.apache.org/jira/browse/OOZIE-3159
> Project: Oozie
> Issue Type: Bug
> Reporter: Satish Subhashrao Saley
> Assignee: Attila Sasvari
> Priority: Blocker
> Fix For: 5.0.0b1
>
> Attachments: OOZIE-3159-001.patch
>
>
> OOZIE-2869 removed the MapReduce dependencies that used to be added to the
> Spark action. The Spark action uses
> org.apache.hadoop.filecache.DistributedCache, which is no longer on the
> Spark action's classpath, causing the action to fail.
> {code}
> java.lang.reflect.InvocationTargetException
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.apache.oozie.action.hadoop.LauncherAM.runActionMain(LauncherAM.java:412)
>     at org.apache.oozie.action.hadoop.LauncherAM.access$300(LauncherAM.java:56)
>     at org.apache.oozie.action.hadoop.LauncherAM$2.run(LauncherAM.java:225)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>     at org.apache.oozie.action.hadoop.LauncherAM.run(LauncherAM.java:219)
>     at org.apache.oozie.action.hadoop.LauncherAM$1.run(LauncherAM.java:155)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>     at org.apache.oozie.action.hadoop.LauncherAM.main(LauncherAM.java:142)
> Caused by: java.lang.NoClassDefFoundError: org/apache/hadoop/filecache/DistributedCache
>     at org.apache.oozie.action.hadoop.SparkArgsExtractor.extract(SparkArgsExtractor.java:309)
>     at org.apache.oozie.action.hadoop.SparkMain.run(SparkMain.java:74)
>     at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:101)
>     at org.apache.oozie.action.hadoop.SparkMain.main(SparkMain.java:60)
>     ... 16 more
> Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.filecache.DistributedCache
>     at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>     ... 20 more
> Failing Oozie Launcher, org/apache/hadoop/filecache/DistributedCache
> java.lang.NoClassDefFoundError: org/apache/hadoop/filecache/DistributedCache
>     at org.apache.oozie.action.hadoop.SparkArgsExtractor.extract(SparkArgsExtractor.java:309)
>     at org.apache.oozie.action.hadoop.SparkMain.run(SparkMain.java:74)
>     at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:101)
>     at org.apache.oozie.action.hadoop.SparkMain.main(SparkMain.java:60)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.apache.oozie.action.hadoop.LauncherAM.runActionMain(LauncherAM.java:412)
>     at org.apache.oozie.action.hadoop.LauncherAM.access$300(LauncherAM.java:56)
>     at org.apache.oozie.action.hadoop.LauncherAM$2.run(LauncherAM.java:225)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>     at org.apache.oozie.action.hadoop.LauncherAM.run(LauncherAM.java:219)
>     at org.apache.oozie.action.hadoop.LauncherAM$1.run(LauncherAM.java:155)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>     at org.apache.oozie.action.hadoop.LauncherAM.main(LauncherAM.java:142)
> Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.filecache.DistributedCache
>     at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>     ... 20 more
> Oozie Launcher, uploading action data to HDFS sequence file: hdfs://localhost:8020/user/saley/oozie-sale/0000009-180112124633268-oozie-sale-W/spark-node--spark/action-data.seq
> {code}
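> The failure can be reproduced outside Oozie as well. Below is a minimal probe (a hypothetical class, not the actual SparkArgsExtractor code) that hits the same NoClassDefFoundError when the MapReduce client jars are absent, because on Hadoop 2 org.apache.hadoop.filecache.DistributedCache ships in hadoop-mapreduce-client-core rather than hadoop-common:
> {code:java}
> import org.apache.hadoop.conf.Configuration;
> // On Hadoop 2 this class comes from hadoop-mapreduce-client-core,
> // not hadoop-common, so it disappears along with the MapReduce jars.
> import org.apache.hadoop.filecache.DistributedCache;
>
> public class DistributedCacheProbe {
>     public static void main(String[] args) throws Exception {
>         // Merely loading DistributedCache throws NoClassDefFoundError
>         // when the MapReduce client jars are missing from the classpath.
>         java.net.URI[] cached = DistributedCache.getCacheFiles(new Configuration());
>         System.out.println(cached == null ? "no cache files" : cached.length + " cache files");
>     }
> }
> {code}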
> I enabled adding the MapReduce jars by setting
> {{oozie.launcher.oozie.action.mapreduce.needed.for}} to {{true}} (a one-line
> sketch of this setting follows the stack trace below). The launcher job was
> then able to kick off the child job, but the child job failed with
> {code}
> 2018-01-12 15:00:13,301 [Driver] ERROR org.apache.spark.deploy.yarn.ApplicationMaster - User class threw exception: java.lang.SecurityException: class "javax.servlet.FilterRegistration"'s signer information does not match signer information of other classes in the same package
> java.lang.SecurityException: class "javax.servlet.FilterRegistration"'s signer information does not match signer information of other classes in the same package
>     at java.lang.ClassLoader.checkCerts(ClassLoader.java:898)
>     at java.lang.ClassLoader.preDefineClass(ClassLoader.java:668)
>     at java.lang.ClassLoader.defineClass(ClassLoader.java:761)
>     at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
>     at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
>     at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
>     at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
>     at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>     at org.spark-project.jetty.servlet.ServletContextHandler.<init>(ServletContextHandler.java:136)
>     at org.spark-project.jetty.servlet.ServletContextHandler.<init>(ServletContextHandler.java:129)
>     at org.spark-project.jetty.servlet.ServletContextHandler.<init>(ServletContextHandler.java:98)
>     at org.apache.spark.ui.JettyUtils$.createServletHandler(JettyUtils.scala:126)
>     at org.apache.spark.ui.JettyUtils$.createServletHandler(JettyUtils.scala:113)
>     at org.apache.spark.ui.WebUI.attachPage(WebUI.scala:78)
>     at org.apache.spark.ui.WebUI$$anonfun$attachTab$1.apply(WebUI.scala:62)
>     at org.apache.spark.ui.WebUI$$anonfun$attachTab$1.apply(WebUI.scala:62)
>     at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>     at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>     at org.apache.spark.ui.WebUI.attachTab(WebUI.scala:62)
>     at org.apache.spark.ui.SparkUI.initialize(SparkUI.scala:63)
>     at org.apache.spark.ui.SparkUI.<init>(SparkUI.scala:76)
>     at org.apache.spark.ui.SparkUI$.create(SparkUI.scala:195)
>     at org.apache.spark.ui.SparkUI$.createLiveUI(SparkUI.scala:146)
>     at org.apache.spark.SparkContext.<init>(SparkContext.scala:473)
>     at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:59)
>     at org.apache.oozie.example.SparkFileCopy.main(SparkFileCopy.java:35)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:542)
> 2018-01-12 15:00:13,303 [Driver] INFO org.apache.spark.deploy.yarn.ApplicationMaster - Final app status: FAILED, exitCode: 15, (reason: User class threw exception: java.lang.SecurityException: class "javax.servlet.FilterRegistration"'s signer information does not match signer information of other classes in the same package)
> {code}
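> For reference, one way to apply the property mentioned above (whether it belongs in {{job.properties}} or in the action's {{<configuration>}} block is an assumption on my part; shown here as a {{job.properties}} line):
> {code}
> # re-enable shipping the MapReduce jars with the launcher
> oozie.launcher.oozie.action.mapreduce.needed.for=true
> {code}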
> I looked into this exception; it is caused by servlet-api-2.5.jar, which got
> pulled in by hadoop-common in the Spark sharelib. We need to revisit the
> reason for adding hadoop-common as a dependency.
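> One way to confirm that the offending jar is sitting in the Spark sharelib (the path assumes the default sharelib location under {{/user/oozie/share/lib}}; adjust it to your installation):
> {code}
> # list the Spark sharelib and look for the conflicting servlet API jar
> hdfs dfs -ls /user/oozie/share/lib/lib_*/spark | grep servlet
> {code}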