[
https://issues.apache.org/jira/browse/OOZIE-3228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16449314#comment-16449314
]
Andras Piros commented on OOZIE-3228:
-------------------------------------
[~Tang Yan] the mismatch arises because Oozie always fills in
{{spark.executor.extraClassPath}} and {{spark.driver.extraClassPath}}:
* if there are no user entries inside {{workflow.xml}}, Oozie sets these to
{{PWD/*}} so that all YARN NodeManager container-localized files (user files,
archives, and sharelib JARs) are loaded correctly by Spark. This is the use
case you seem to be hitting
* if there are user entries inside {{workflow.xml}}, those are prepended to
{{PWD/*}} to ensure that:
** user classpath entries come first
** YARN NodeManager container-localized files (user files, archives, and
sharelib JARs) are still loaded correctly by Spark
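For illustration (the paths here are just examples), if {{workflow.xml}} set
{{spark.executor.extraClassPath=/etc/hbase/conf}}, the merged value Oozie
would pass to Spark looks roughly like:
{noformat}
--conf spark.executor.extraClassPath=/etc/hbase/conf:$PWD/*
{noformat}
i.e. the user entry first, then the container-localized files.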
Please see
[*{{TestSparkArgsExtractor.java}}*|https://github.com/apache/oozie/blob/master/sharelib/spark/src/test/java/org/apache/oozie/action/hadoop/TestSparkArgsExtractor.java]
for details.
Oozie neither uses nor can read a Spark configuration file inside the
{{/etc/spark/conf}} folder. The only correct way to get those entries onto
your classpath is to include them inside {{workflow.xml}} as well:
{noformat}
--conf spark.executor.extraClassPath=/etc/hbase/conf:/etc/hive/conf
--conf spark.driver.extraClassPath=/etc/hbase/conf:/etc/hive/conf
{noformat}
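In a Spark action that means putting those {{--conf}} flags into the
{{spark-opts}} element. A minimal sketch (the action name, class, JAR path,
and job name below are illustrative, not from your workflow):
{code:xml}
<action name="spark-node">
    <spark xmlns="uri:oozie:spark-action:0.2">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <master>yarn-cluster</master>
        <name>MySparkJob</name>
        <class>com.example.MySparkJob</class>
        <jar>${nameNode}/apps/my-spark-job.jar</jar>
        <spark-opts>--conf spark.executor.extraClassPath=/etc/hbase/conf:/etc/hive/conf --conf spark.driver.extraClassPath=/etc/hbase/conf:/etc/hive/conf</spark-opts>
    </spark>
    <ok to="end"/>
    <error to="fail"/>
</action>
{code}
With entries like these present, Oozie appends {{PWD/*}} after them instead of
replacing them.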
> Oozie Spark Action - the spark job can't load the properties in
> spark-defaults.conf.
> ------------------------------------------------------------------------------------
>
> Key: OOZIE-3228
> URL: https://issues.apache.org/jira/browse/OOZIE-3228
> Project: Oozie
> Issue Type: Bug
> Components: action
> Affects Versions: 4.3.1
> Reporter: Tang Yan
> Priority: Major
>
> When I create an Oozie workflow to launch a Spark action, the Spark job can't
> load the configured properties in spark-defaults.conf. I've configured each
> NodeManager as the Spark gateway role, so spark-defaults.conf is
> generated in /etc/spark/conf/ on each worker node.
> I've set some configuration in spark-defaults.conf:
> spark.executor.extraClassPath=/etc/hbase/conf:/etc/hive/conf
> spark.driver.extraClassPath=/etc/hbase/conf:/etc/hive/conf
> But in the Oozie Spark job, they're not loaded automatically; instead I see:
> --conf spark.executor.extraClassPath=$PWD/*
> --conf spark.driver.extraClassPath=$PWD/*
>
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)