[ https://issues.apache.org/jira/browse/OOZIE-2277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14687396#comment-14687396 ]

Robert Kanter commented on OOZIE-2277:
--------------------------------------

When I tried to use the {{--jars}} option, it didn't seem to make a difference. With or without it, I get this exception:
{noformat}
Exception in thread "main" java.lang.NoClassDefFoundError: com/fasterxml/jackson/databind/Module
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:270)
        at org.apache.spark.util.Utils$.classForName(Utils.scala:191)
        at org.apache.spark.metrics.MetricsSystem$$anonfun$registerSinks$1.apply(MetricsSystem.scala:187)
        at org.apache.spark.metrics.MetricsSystem$$anonfun$registerSinks$1.apply(MetricsSystem.scala:183)
        at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
        at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
        at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226)
        at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39)
        at scala.collection.mutable.HashMap.foreach(HashMap.scala:98)
        at org.apache.spark.metrics.MetricsSystem.registerSinks(MetricsSystem.scala:183)
        at org.apache.spark.metrics.MetricsSystem.start(MetricsSystem.scala:100)
        at org.apache.spark.SparkEnv$.create(SparkEnv.scala:373)
        at org.apache.spark.SparkEnv$.createExecutorEnv(SparkEnv.scala:216)
        at org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$run$1.apply$mcV$sp(CoarseGrainedExecutorBackend.scala:183)
        at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:67)
        at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:66)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
        at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:66)
        at org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:149)
        at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:250)
        at org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)
Caused by: java.lang.ClassNotFoundException: com.fasterxml.jackson.databind.Module
        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
        ... 24 more
{noformat}
I was talking with [~hshreedharan] and [~vanzin] about this issue, and this was the only thing that seemed to work.
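
To sketch what that fix amounts to (illustrative only; the class and method names below are mine, not from the actual patch): the trace above dies in MetricsSystem during executor startup, apparently before anything shipped via {{--jars}} has been added to a classloader, so the localized sharelib jars need to be on the driver and executor system classpath, e.g. via {{spark.driver.extraClassPath}} and {{spark.executor.extraClassPath}}:
{noformat}
// Illustrative sketch, not the OOZIE-2277 patch: build --conf arguments that
// put every jar localized into the launcher's working directory (which is
// where the selected sharelib jars land) onto the driver and executor
// system classpath, so the classes are visible as soon as the JVM starts.
import java.io.File;
import java.util.ArrayList;
import java.util.List;

public class SharelibClasspathSketch {

    public static List<String> sharelibClasspathArgs() {
        StringBuilder cp = new StringBuilder();
        File[] localized = new File(".").listFiles();
        if (localized != null) {
            for (File file : localized) {
                if (file.getName().endsWith(".jar")) {
                    if (cp.length() > 0) {
                        cp.append(File.pathSeparator);
                    }
                    cp.append(file.getName());
                }
            }
        }
        List<String> sparkArgs = new ArrayList<String>();
        sparkArgs.add("--conf");
        sparkArgs.add("spark.executor.extraClassPath=" + cp);
        sparkArgs.add("--conf");
        sparkArgs.add("spark.driver.extraClassPath=" + cp);
        return sparkArgs;
    }
}
{noformat}
On YARN the jars still have to be shipped to the executor containers (e.g. by also listing them under {{--jars}}); {{extraClassPath}} only controls where the executor JVM looks, so the two together make the sharelib visible from the moment the executor starts.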

> Honor oozie.action.sharelib.for.spark in Spark jobs
> ---------------------------------------------------
>
>                 Key: OOZIE-2277
>                 URL: https://issues.apache.org/jira/browse/OOZIE-2277
>             Project: Oozie
>          Issue Type: Improvement
>            Reporter: Ryan Brush
>            Assignee: Robert Kanter
>            Priority: Minor
>         Attachments: OOZIE-2277.001.patch
>
>
> Shared libraries specified by oozie.action.sharelib.for.spark are not visible in the Spark job itself. For instance, setting oozie.action.sharelib.for.spark to "spark,hcat" will not make the hcat jars usable in the Spark job. This is inconsistent with other actions (such as Java and MapReduce actions).
> Since the Spark action just calls SparkSubmit, it looks like we would need to explicitly pass the jars for the specified sharelibs into the SparkSubmit call so they are available to the Spark job itself.
> One option: we can just pass the HDFS URLs to that command via the --jars parameter. This is actually what I've done to work around the issue; it makes for a long SparkSubmit command, but it works.
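
For reference, the {{--jars}} workaround described in the quoted report might look like this in the Spark action's {{<spark-opts>}} element (the namenode address and jar paths below are placeholders, not taken from the issue):
{noformat}
<spark-opts>
  --jars hdfs://namenode:8020/user/oozie/share/lib/hcatalog/hive-hcatalog-core.jar,hdfs://namenode:8020/user/oozie/share/lib/hcatalog/hive-metastore.jar
</spark-opts>
{noformat}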



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
