[ 
https://issues.apache.org/jira/browse/OOZIE-2812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16162682#comment-16162682
 ] 

Sergey Zhemzhitsky commented on OOZIE-2812:
-------------------------------------------

Hello guys,

I'm wondering whether the
*oozie.service.SparkConfigurationService.spark.configurations* configuration
option is really necessary.
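For context, this option maps a YARN ResourceManager authority to a Spark
configuration directory in *oozie-site.xml*. A minimal sketch of the setting
being discussed (the wildcard mapping form follows the Oozie documentation;
the directory path is just an example):
{code}
<property>
    <name>oozie.service.SparkConfigurationService.spark.configurations</name>
    <!-- comma-separated AUTHORITY=SPARK_CONF_DIR pairs;
         '*' matches any ResourceManager -->
    <value>*=/etc/spark/conf</value>
</property>
{code}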
Here are my two cents on this not-so-obvious option:

# Spark jobs are just YARN applications, in the same way as Flink applications
or any other Java applications that can run on YARN. There may be multiple
Spark jobs that use completely different Spark versions (for example, custom
patched ones), so there is no need for an indirect mapping between the YARN
resource manager and a single Spark configuration. For map-reduce the similar
*oozie.service.HadoopAccessorService.hadoop.configurations* option does make
sense, because it is currently hardly possible to run multiple map-reduce
implementations on top of a single YARN resource manager.
# The *oozie.service.SparkConfigurationService.spark.configurations* option
reads *spark-defaults.properties* from the default location (i.e.
*/etc/spark/conf*) into java.util.Properties and then simply appends these
properties to *spark-opts*, as if they had been specified via *\-\-conf*
command line options. But when using *spark\-submit*, *\-\-conf* command
line options usually take precedence over properties specified in
*spark\-defaults.properties*, so there is a chance that options provided by
the user via *\-\-conf* will be overridden by the properties coming from
*oozie.service.SparkConfigurationService.spark.configurations* (see the
sketch after this list).
# The latest implementation of the spark action already supports the
*<file ...>* element as well as *spark\-defaults.properties* in the current
working directory, and these two possibilities give more flexibility than
*oozie.service.SparkConfigurationService.spark.configurations*, because the
user can
## add *spark\-defaults.properties* to the workflow application and Oozie
will pick it up
## add 
{code}
<file>local:/etc/spark/conf/spark-defaults.properties</file>
{code}
and Oozie's spark action will automatically pick up this spark-defaults file
via the *\-\-properties\-file* option, preserving the usual precedence of
properties (from lowest to highest; see the sketch after this list):
** provided via the *spark\-defaults.properties* file,
** provided via the *\-\-properties\-file* command line option,
** provided via the *\-\-conf* command line options.
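To illustrate points 2 and 3, here is a minimal sketch of a spark action that
ships a properties file via *<file>* while also passing an explicit *\-\-conf*
(the application name, jar path and memory value are made-up examples, not
taken from this issue):
{code}
<spark xmlns="uri:oozie:spark-action:0.2">
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>
    <master>yarn-cluster</master>
    <name>MySparkApp</name>
    <jar>${nameNode}/apps/my-spark-app.jar</jar>
    <!-- user-provided value: with plain spark-submit semantics a conf
         option given on the command line should win over any value
         coming from a properties file -->
    <spark-opts>--conf spark.executor.memory=4g</spark-opts>
    <!-- shipped to the container's working directory and passed to
         spark-submit as its properties file, keeping normal precedence -->
    <file>local:/etc/spark/conf/spark-defaults.properties</file>
</spark>
{code}
By contrast, properties injected through
*oozie.service.SparkConfigurationService.spark.configurations* are appended to
*spark-opts* as additional *\-\-conf* options, which is exactly where the
precedence inversion described in point 2 can come from.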

> SparkConfigurationService should support loading configurations from multiple 
> Spark versions
> --------------------------------------------------------------------------------------------
>
>                 Key: OOZIE-2812
>                 URL: https://issues.apache.org/jira/browse/OOZIE-2812
>             Project: Oozie
>          Issue Type: Improvement
>            Reporter: Peter Cseh
>            Assignee: Peter Cseh
>         Attachments: OOZIE-2812.001.patch, OOZIE-2812.002.patch, 
> OOZIE-2812.003.patch, OOZIE-2812.004.patch
>
>
> Right now SparkConfigurationService serves one Spark configuration set by
> {{oozie.service.SparkConfigurationService.spark.configurations}}
> We could improve this to support more versions depending on the name of the 
> sharelib.
> E.g. the property could change to
> oozie.service.SparkConfigurationService.<sharelib_name>.configurations
> This would be backward compatible as the name for the default Spark sharelib 
> is spark while it would be possible to add a sharelib named spark2 or 
> spark2.1 and define their configuration via 
> oozie.service.SparkConfigurationService.spark2.configurations and
> oozie.service.SparkConfigurationService.spark2.1.configurations.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
