Matt Cheah created SPARK-24655:
----------------------------------
Summary: [K8S] Custom Docker Image Expectations and Documentation
Key: SPARK-24655
URL: https://issues.apache.org/jira/browse/SPARK-24655
Project: Spark
Issue Type: Improvement
Components: Kubernetes
Affects Versions: 2.3.1
Reporter: Matt Cheah
A common use case we want to support with Kubernetes is the use of custom
Docker images. Some examples include:
* A user builds an application using Gradle or Maven, using Spark as a
compile-time dependency. The application's jars (both the custom-written jars
and the dependencies) need to be packaged in a Docker image that can be run via
spark-submit.
* A user builds a PySpark or R application and wants to include custom
dependencies.
* A user wants to switch the base image from Alpine to CentOS while using
either built-in or custom jars.
We currently do not document how these custom Docker images are supposed to be
built, nor do we guarantee that these images remain compatible across
spark-submit versions. To illustrate how this can break down, suppose, for
example, that we decide to change the names of the environment variables that
carry the driver/executor extra JVM options specified by
{{spark.[driver|executor].extraJavaOptions}}. If we change the environment
variable that spark-submit provides, then users must update their custom
Dockerfiles and build new images.
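As a rough illustration of that coupling, here is a minimal Python sketch of what a custom image's entrypoint logic might do. The variable names {{EXAMPLE_DRIVER_JAVA_OPTS}} and {{EXAMPLE_DRIVER_EXTRA_JAVA_OPTIONS}}, the classpath, and the main class are all hypothetical and are not the names Spark actually uses; the point is only that an entrypoint written against one name silently loses the JVM options once spark-submit starts setting a different one.

```python
import shlex

# Hypothetical variable names, for illustration only. They are NOT the
# names that spark-submit actually sets.
OLD_VAR = "EXAMPLE_DRIVER_JAVA_OPTS"
NEW_VAR = "EXAMPLE_DRIVER_EXTRA_JAVA_OPTIONS"


def build_jvm_command(env, var_name=OLD_VAR):
    """Build a driver command the way a custom entrypoint might.

    The entrypoint is hard-wired to one variable name (var_name); if the
    variable is absent, no extra JVM options are passed at all.
    """
    java_opts = shlex.split(env.get(var_name, ""))
    # Classpath and main class are placeholders, not Spark's real layout.
    return ["java", *java_opts, "-cp", "/opt/app/jars/*", "com.example.Main"]


# Environment as an older spark-submit might provide it.
old_env = {OLD_VAR: "-Xmx2g -Dfoo=bar"}
# Environment after a (hypothetical) rename in a newer spark-submit.
new_env = {NEW_VAR: "-Xmx2g -Dfoo=bar"}

old_cmd = build_jvm_command(old_env)
new_cmd = build_jvm_command(new_env)

print(old_cmd)  # options are present
print(new_cmd)  # options are silently dropped; the image must be rebuilt
```

Nothing fails loudly in the second case: the driver simply starts without the requested options, which is why an undocumented contract is hard for users to debug.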
Rather than jumping straight to an implementation, though, it's worth taking
a step back and considering these matters from the end user's perspective.
To that end, this ticket will serve as a forum for answering at least the
following questions, along with any others that pertain to the matter:
# What steps would a user need to take to build a custom Docker image, given
their desire to customize the dependencies and the content (OS or otherwise)
of said image?
# How can we ensure the user does not need to rebuild the image if only the
spark-submit version changes?
The end deliverable for this ticket is a design document. After that, we'll
create sub-issues for the technical implementation and documentation of the
contract.