Rob Vesse created SPARK-25262:
---------------------------------
Summary: Make Spark local dir volumes configurable with Spark on
Kubernetes
Key: SPARK-25262
URL: https://issues.apache.org/jira/browse/SPARK-25262
Project: Spark
Issue Type: Improvement
Components: Kubernetes
Affects Versions: 2.3.1, 2.3.0
Reporter: Rob Vesse
As discussed during review of the design document for SPARK-24434, while
pod templates will provide more in-depth customisation for Spark on
Kubernetes, there are some things that cannot be modified because Spark code
generates pod specs in very specific ways.
The particular issue identified relates to the handling of {{spark.local.dirs}},
which is done by {{LocalDirsFeatureStep.scala}}. For each directory specified,
or for a single default if none is specified, it creates a Kubernetes
{{emptyDir}} volume. As noted in the Kubernetes documentation, this will be
backed by the node storage
(https://kubernetes.io/docs/concepts/storage/volumes/#emptydir). In some
compute environments this may be extremely undesirable. For example, with
diskless compute resources the node storage will likely be a slow,
remote-mounted disk, often with limited capacity. For such environments it
would likely be better to set {{medium: Memory}} on the volume, per the K8S
documentation, to use a {{tmpfs}} volume instead.
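To illustrate the difference, here is a minimal sketch of the two volume shapes as they would appear in a pod spec, written as plain Python dicts rather than Spark's actual Scala code. The {{emptyDir}} and {{medium}} fields are from the Kubernetes volume documentation; the helper name, volume names and directory paths are hypothetical.

```python
from typing import Optional


def empty_dir_volume(name: str, medium: Optional[str] = None) -> dict:
    """Build a pod-spec volume entry (as a plain dict) using emptyDir.

    With medium=None the volume is backed by node storage; with
    medium="Memory" Kubernetes mounts a tmpfs instead.
    """
    empty_dir = {} if medium is None else {"medium": medium}
    return {"name": name, "emptyDir": empty_dir}


# One volume per configured local directory (paths and names are hypothetical).
local_dirs = ["/tmp/spark-local-a", "/tmp/spark-local-b"]

# Node-backed emptyDir volumes, as described above.
node_backed = [
    empty_dir_volume(f"local-dir-{i + 1}") for i in range(len(local_dirs))
]

# The tmpfs-backed alternative suggested for diskless nodes.
tmpfs_backed = [
    empty_dir_volume(f"local-dir-{i + 1}", medium="Memory")
    for i in range(len(local_dirs))
]
```

Note that a {{tmpfs}} volume consumes memory on the node, so this trades disk usage for memory usage.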
Another closely related issue is that users might want to use a different
volume type to back the local directories, and there is currently no way to do
that.
Pod templates will not really solve either of these issues because Spark will
always attempt to generate a new volume for each local directory and will
always define these as {{emptyDir}} volumes.
Therefore the proposal is to make two changes to {{LocalDirsFeatureStep}}:
* Provide a new config setting to enable using {{tmpfs}}-backed {{emptyDir}}
volumes
* Modify the logic to check whether a volume with the same name is already
defined and, if so, skip generating a volume definition for it
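The two proposed changes could be sketched roughly as follows. This is an illustration in Python over plain dicts, not the actual {{LocalDirsFeatureStep}} code: the function name, the {{use_tmpfs}} flag (standing in for the new config setting, which this issue does not name) and the {{spark-local-dir-N}} volume naming are all assumptions.

```python
from typing import Dict, List


def add_local_dir_volumes(
    existing_volumes: List[Dict],
    local_dirs: List[str],
    use_tmpfs: bool = False,
) -> List[Dict]:
    """Return the pod's volumes with one volume per local directory.

    A volume already defined under the expected name (e.g. by a pod
    template) is kept as-is; otherwise an emptyDir volume is generated,
    tmpfs-backed when use_tmpfs is True.
    """
    existing_names = {volume["name"] for volume in existing_volumes}
    result = list(existing_volumes)
    for index, _path in enumerate(local_dirs):
        name = f"spark-local-dir-{index + 1}"  # assumed naming scheme
        if name in existing_names:
            continue  # user supplied this volume; do not generate one
        empty_dir = {"medium": "Memory"} if use_tmpfs else {}
        result.append({"name": name, "emptyDir": empty_dir})
    return result


# A hypothetical pod template that pre-defines the first local dir volume.
template_volumes = [
    {"name": "spark-local-dir-1", "hostPath": {"path": "/mnt/fast"}}
]
volumes = add_local_dir_volumes(
    template_volumes, ["/data/a", "/data/b"], use_tmpfs=True
)
```

In this example the template's {{hostPath}} volume is left untouched and only the second local directory gets a generated, memory-backed {{emptyDir}} volume.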