[
https://issues.apache.org/jira/browse/OOZIE-2812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16162682#comment-16162682
]
Sergey Zhemzhitsky commented on OOZIE-2812:
-------------------------------------------
Hello guys,
I'm wondering whether the
*oozie.service.SparkConfigurationService.spark.configurations* configuration
option is really necessary.
Here are my two cents on this not-so-obvious option:
# Spark jobs are just YARN applications, like Flink applications, Java
applications, and anything else that can run on YARN. There may be multiple
Spark jobs that use completely different Spark versions (for example, custom
patched ones), so there is no need for an indirect mapping between a YARN
resource manager and a single Spark configuration (see the sketch below).
For map-reduce the similar
*oozie.service.HadoopAccessorService.hadoop.configurations* option does make
sense, because it is hardly possible to run multiple map-reduce
implementations on top of a single YARN resource manager.
# The *oozie.service.SparkConfigurationService.spark.configurations* option
reads *spark-defaults.properties* from the configured location (i.e.
*/etc/spark/conf* by default) into java.util.Properties and then simply
appends them to *spark-opts*, as if they had been specified via *\-\-conf*
command line options. But when using *spark\-submit*, *\-\-conf* command line
options normally take precedence over properties specified in
*spark\-defaults.properties*, so there is a chance that options provided by
the user via *\-\-conf* will be overridden by properties coming from
*oozie.service.SparkConfigurationService.spark.configurations*.
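A hypothetical example of that inversion (the memory values are made up):
{code}
<!-- workflow.xml: the user explicitly asks for 4g executors -->
<spark-opts>--conf spark.executor.memory=4g</spark-opts>

<!-- If the spark-defaults.properties read by SparkConfigurationService
     contains spark.executor.memory=2g, oozie appends another
     "--conf spark.executor.memory=2g" after the user's options; since the
     last occurrence of a --conf key wins, the user's 4g is silently lost. -->
{code}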
# The latest implementation of the spark action already supports the
*<file ...>* element and *spark\-defaults.properties* in the current working
directory, and these two possibilities give more flexibility than
*oozie.service.SparkConfigurationService.spark.configurations*, because the
user can
## add *spark\-defaults.properties* to the workflow application and oozie will
pick it up
## add
{code}
<file>local:/etc/spark/conf/spark-defaults.properties</file>
{code}
and oozie's spark action will automatically pick this spark-defaults up via
the *\-\-properties\-file* option, preserving the usual precedence of
properties (from lowest to highest):
** provided by means of the *spark\-defaults.properties* file,
** provided by means of the *\-\-properties\-file* command line option,
** provided by means of *\-\-conf* command line options.
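For completeness, here is a minimal sketch of a spark action using the
*<file>* approach (the action name, class, jar, and schema version are
illustrative placeholders):
{code}
<action name="spark-node">
    <spark xmlns="uri:oozie:spark-action:0.1">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <master>yarn-cluster</master>
        <name>my-spark-job</name>
        <class>com.example.MySparkJob</class>
        <jar>my-spark-job.jar</jar>
        <!-- user-supplied options; these should win over the properties file -->
        <spark-opts>--conf spark.executor.memory=4g</spark-opts>
        <!-- node-local defaults, passed to spark-submit via --properties-file -->
        <file>local:/etc/spark/conf/spark-defaults.properties</file>
    </spark>
    <ok to="end"/>
    <error to="fail"/>
</action>
{code}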
> SparkConfigurationService should support loading configurations from multiple
> Spark versions
> --------------------------------------------------------------------------------------------
>
> Key: OOZIE-2812
> URL: https://issues.apache.org/jira/browse/OOZIE-2812
> Project: Oozie
> Issue Type: Improvement
> Reporter: Peter Cseh
> Assignee: Peter Cseh
> Attachments: OOZIE-2812.001.patch, OOZIE-2812.002.patch,
> OOZIE-2812.003.patch, OOZIE-2812.004.patch
>
>
> Right now SparkConfigurationService serves one Spark configuration set by
> {{oozie.service.SparkConfigurationService.spark.configurations}}.
> We could improve this to support more versions depending on the name of the
> sharelib.
> E.g. the property could change to
> oozie.service.SparkConfigurationService.<sharelib_name>.configurations
> This would be backward compatible, as the name for the default Spark sharelib
> is spark, while it would be possible to add a sharelib named spark2 or
> spark2.1 and define their configuration via
> oozie.service.SparkConfigurationService.spark2.configurations and
> oozie.service.SparkConfigurationService.spark2.1.configurations.