[
https://issues.apache.org/jira/browse/OOZIE-2277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Robert Kanter updated OOZIE-2277:
---------------------------------
Attachment: OOZIE-2277.001.patch
I couldn't get {{--jars}} to work. I talked to some Spark guys and they said
to use the {{SPARK_DIST_CLASSPATH}} env var. Unfortunately, we can't easily do
that because of how the Launcher Job works. But it turns out we can use
{{spark.executor.extraClassPath}} and {{spark.driver.extraClassPath}} to add
the jars.
The patch sets {{spark.executor.extraClassPath}} and
{{spark.driver.extraClassPath}} to the launcher job's classpath (or appends to
them if they are already defined). This classpath contains all of the localized
jars that the user added via the sharelib, lib/ dir, etc., plus anything Oozie
or Hadoop added (basically, the classpath that the other actions normally get).
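The append-or-set behavior described above could be sketched roughly as follows. This is a hypothetical illustration, not the actual patch code: the two property names are real Spark configuration keys, but the class, method, and jar names are made up.

```java
import java.io.File;
import java.util.Properties;

// Hypothetical sketch of setting (or appending to) the Spark extraClassPath
// properties from the launcher job's classpath. Names other than the two
// Spark property keys are illustrative.
public class ExtraClassPathSketch {
    static void addToClasspath(Properties conf, String prop, String launcherCp) {
        String existing = conf.getProperty(prop);
        if (existing == null || existing.isEmpty()) {
            // Property not set by the user: set it to the launcher classpath.
            conf.setProperty(prop, launcherCp);
        } else {
            // Property already defined: append rather than overwrite.
            conf.setProperty(prop, existing + File.pathSeparator + launcherCp);
        }
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        // Simulate a user who already set the driver property.
        conf.setProperty("spark.driver.extraClassPath", "user.jar");
        String launcherCp = "sharelib-spark.jar" + File.pathSeparator + "hcatalog-core.jar";
        addToClasspath(conf, "spark.executor.extraClassPath", launcherCp);
        addToClasspath(conf, "spark.driver.extraClassPath", launcherCp);
        System.out.println(conf.getProperty("spark.executor.extraClassPath"));
        System.out.println(conf.getProperty("spark.driver.extraClassPath"));
    }
}
```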
It also fixes
> Honor oozie.action.sharelib.for.spark in Spark jobs
> ---------------------------------------------------
>
> Key: OOZIE-2277
> URL: https://issues.apache.org/jira/browse/OOZIE-2277
> Project: Oozie
> Issue Type: Improvement
> Reporter: Ryan Brush
> Assignee: Robert Kanter
> Priority: Minor
> Attachments: OOZIE-2277.001.patch
>
>
> Shared libraries specified by oozie.action.sharelib.for.spark are not visible
> in the Spark job itself. For instance, setting
> oozie.action.sharelib.for.spark to "spark,hcat" will not make the hcat jars
> usable in the Spark job. This is inconsistent with other actions (such as
> Java and MapReduce actions).
> Since the Spark action just calls SparkSubmit, it looks like we would need to
> explicitly pass the jars for the specified sharelibs into the SparkSubmit
> operation so they are available to the Spark operation itself.
> One option: we can just pass the HDFS URLs to that command via the --jars
> parameter. This is actually what I've done to work around this issue; it
> makes for a long SparkSubmit command but works.
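The workaround in the description above (passing HDFS URLs via {{--jars}}) would look roughly like this in a Spark action. This is an illustrative sketch only: the action name, jar names, and HDFS paths are hypothetical, and the actual sharelib paths on a given cluster will differ.

```xml
<!-- Hypothetical workflow snippet showing the --jars workaround described
     above; all names and paths are illustrative. -->
<action name="spark-node">
    <spark xmlns="uri:oozie:spark-action:0.1">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <name>MySparkJob</name>
        <class>com.example.MyMain</class>
        <jar>${nameNode}/apps/myapp/myapp.jar</jar>
        <spark-opts>--jars hdfs://nn:8020/user/oozie/share/lib/hcatalog/hcatalog-core.jar,hdfs://nn:8020/user/oozie/share/lib/hcatalog/hive-metastore.jar</spark-opts>
    </spark>
    <ok to="end"/>
    <error to="fail"/>
</action>
```

As the description notes, listing each sharelib jar this way makes for a long SparkSubmit command, which is why the patch instead propagates the launcher classpath through the Spark extraClassPath properties.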
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)