[
https://issues.apache.org/jira/browse/SPARK-30985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Prashant Sharma reassigned SPARK-30985:
---------------------------------------
Assignee: (was: Prashant Sharma)
> Propagate SPARK_CONF_DIR files to driver and exec pods.
> -------------------------------------------------------
>
> Key: SPARK-30985
> URL: https://issues.apache.org/jira/browse/SPARK-30985
> Project: Spark
> Issue Type: Improvement
> Components: Kubernetes
> Affects Versions: 3.0.0
> Reporter: Prashant Sharma
> Priority: Major
>
> SPARK_CONF_DIR hosts configuration files such as:
> 1) spark-defaults.conf - containing all the spark properties.
> 2) log4j.properties - Logger configuration.
> 3) spark-env.sh - Environment variables to be set up at driver and executor.
> 4) core-site.xml - Hadoop related configuration.
> 5) fairscheduler.xml - Spark's fair scheduling policy at the job level.
> 6) metrics.properties - Spark metrics.
> 7) Any user-specific configuration file, e.g. for a library or framework.
> Traditionally, SPARK_CONF_DIR has been the home of all user-specific
> configuration files. The default behaviour in YARN or standalone mode is
> that users copy these files to the worker nodes themselves, as required.
> In other words, they are not auto-copied.
> But in the case of Spark on Kubernetes, we use Spark images, and generally
> these images are approved or undergo some kind of standardisation. The user
> cannot simply copy these files to the SPARK_CONF_DIR of the running executor
> and driver pods.
> So, at the moment, we have special casing for providing each configuration.
> For any other user-specific configuration file the process is more complex,
> e.g. one can start with a custom Spark image that has the configuration
> files pre-installed.
> Examples of special casing are:
> 1. Hadoop configuration in spark.kubernetes.hadoop.configMapName
> 2. Spark-env.sh as in spark.kubernetes.driverEnv.[EnvironmentVariableName]
> 3. Log4j.properties as in https://github.com/apache/spark/pull/26193
> ... And for files where no such special casing exists, users are simply out
> of luck.
> So this feature will let user-specific configuration files be mounted in
> the driver and executor pods' SPARK_CONF_DIR.
> At the moment it is not clear whether there is a need to let the user
> specify which config files to propagate, and whether to the driver and/or
> the executors. But if there is a case where that would be helpful, we can
> increase the scope of this work or create another JIRA issue to track it.
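
To make the proposal above concrete, here is a minimal sketch of the general idea: gather the text files found in SPARK_CONF_DIR and wrap them in a Kubernetes ConfigMap manifest that could then be mounted into the driver and executor pods. This is not Spark's implementation. The function names, the ConfigMap name "spark-conf-sketch" and the choice to skip binary files and sub-directories are all assumptions made for illustration.

```python
import os

# Kubernetes limits the data stored in a ConfigMap to roughly 1 MiB.
MAX_CONFIG_MAP_BYTES = 1024 * 1024


def collect_conf_files(conf_dir):
    """Return {file name: text} for regular UTF-8 files directly in conf_dir.

    Hypothetical helper: sub-directories and binary files are skipped.
    """
    files = {}
    for name in sorted(os.listdir(conf_dir)):
        path = os.path.join(conf_dir, name)
        if not os.path.isfile(path):
            continue  # skip sub-directories and other non-regular entries
        try:
            with open(path, encoding="utf-8") as handle:
                files[name] = handle.read()
        except UnicodeDecodeError:
            continue  # skip files that are not valid UTF-8 text
    return files


def build_config_map(name, files):
    """Build a ConfigMap manifest (as a plain dict) holding the given files."""
    total = sum(len(text.encode("utf-8")) for text in files.values())
    if total > MAX_CONFIG_MAP_BYTES:
        raise ValueError(
            f"{total} bytes of configuration exceed the ConfigMap size limit"
        )
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": name},
        "data": dict(files),
    }


if __name__ == "__main__":
    # Falls back to ./conf when SPARK_CONF_DIR is not set (an assumption).
    conf_dir = os.environ.get("SPARK_CONF_DIR", "conf")
    if os.path.isdir(conf_dir):
        manifest = build_config_map(
            "spark-conf-sketch", collect_conf_files(conf_dir)
        )
        print(sorted(manifest["data"]))
```

A single ConfigMap keeps the sketch simple, but the size limit means a large SPARK_CONF_DIR would need filtering or splitting, which ties in with the open question of letting the user choose which files to propagate.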