[ https://issues.apache.org/jira/browse/SPARK-25262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16606522#comment-16606522 ]

Matt Cheah commented on SPARK-25262:
------------------------------------

For [https://github.com/apache/spark/pull/22323] we allow backing the local 
directories with tmpfs, but other volume types are still not supported. Is it 
fine to close this issue, or do we want to keep it open to track work on 
supporting other volume types there?
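
For reference, a minimal usage sketch of the tmpfs option from that PR, assuming 
the configuration key it introduces is {{spark.kubernetes.local.dirs.tmpfs}}; the 
master URL, class and jar path below are placeholders:

{code}
spark-submit \
  --master k8s://https://<k8s-apiserver-host>:<k8s-apiserver-port> \
  --deploy-mode cluster \
  --class org.apache.spark.examples.SparkPi \
  --conf spark.kubernetes.local.dirs.tmpfs=true \
  local:///path/to/examples.jar
{code}

Note that tmpfs-backed local directories count against the pod's memory, so the 
executor memory overhead may need to be increased accordingly.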

> Make Spark local dir volumes configurable with Spark on Kubernetes
> ------------------------------------------------------------------
>
>                 Key: SPARK-25262
>                 URL: https://issues.apache.org/jira/browse/SPARK-25262
>             Project: Spark
>          Issue Type: Improvement
>          Components: Kubernetes
>    Affects Versions: 2.3.0, 2.3.1
>            Reporter: Rob Vesse
>            Priority: Major
>
> As discussed during review of the design document for SPARK-24434, while pod 
> templates will provide more in-depth customisation for Spark on Kubernetes, 
> there are some things that cannot be modified because Spark code generates 
> pod specs in very specific ways.
> The particular issue identified relates to the handling of 
> {{spark.local.dirs}}, which is done by {{LocalDirsFeatureStep.scala}}.  For 
> each directory specified (or a single default directory if none is given 
> explicitly) it creates a Kubernetes {{emptyDir}} volume.  As noted in the 
> Kubernetes documentation this will be backed by the node storage 
> (https://kubernetes.io/docs/concepts/storage/volumes/#emptydir).  In some 
> compute environments this may be extremely undesirable.  For example, with 
> diskless compute resources the node storage will likely be a non-performant 
> remote mounted disk, often with limited capacity.  For such environments it 
> would likely be better to set {{medium: Memory}} on the volume, per the K8S 
> documentation, to use a {{tmpfs}} volume instead.
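> For illustration only, a rough sketch (not the actual Spark code) of how the 
> two flavours of {{emptyDir}} differ when built with the fabric8 client that 
> the Kubernetes backend already uses; the volume name here is just an example:
> {code:scala}
> import io.fabric8.kubernetes.api.model.VolumeBuilder
>
> object LocalDirVolumeSketch {
>
>   // What LocalDirsFeatureStep produces today: an emptyDir with no medium
>   // set, i.e. backed by whatever storage the node provides.
>   val nodeBackedVolume = new VolumeBuilder()
>     .withName("spark-local-dir-1")
>     .withNewEmptyDir()
>     .endEmptyDir()
>     .build()
>
>   // The tmpfs variant suggested above: setting medium to "Memory" asks
>   // Kubernetes to back the emptyDir with tmpfs instead of node storage.
>   val tmpfsBackedVolume = new VolumeBuilder()
>     .withName("spark-local-dir-1")
>     .withNewEmptyDir()
>       .withMedium("Memory")
>     .endEmptyDir()
>     .build()
> }
> {code}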
> Another closely related issue is that users might want to use a different 
> volume type to back the local directories and there is currently no way to 
> do that.
> Pod templates will not really solve either of these issues because Spark will 
> always attempt to generate a new volume for each local directory and will 
> always declare these volumes as {{emptyDir}}.
> Therefore the proposal is to make two changes to {{LocalDirsFeatureStep}} 
> (see the sketch below):
> * Provide a new config setting to enable using {{tmpfs}} backed {{emptyDir}} 
> volumes
> * Modify the logic to check whether a volume with the given name is already 
> defined and, if so, skip generating a volume definition for it
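> A rough, hypothetical sketch of what those two changes could look like; the 
> config key and method names below are illustrative only, not a proposed API:
> {code:scala}
> import io.fabric8.kubernetes.api.model.{Pod, Volume, VolumeBuilder}
> import org.apache.spark.SparkConf
> import scala.collection.JavaConverters._
>
> object LocalDirsProposalSketch {
>
>   // Proposed change 1: a new (hypothetical) config setting that switches the
>   // emptyDir medium to "Memory", i.e. tmpfs.
>   val conf = new SparkConf()
>   val useTmpfs: Boolean =
>     conf.getBoolean("spark.kubernetes.local.dirs.tmpfs", false)
>
>   // Proposed change 2: only generate a volume if the pod spec does not
>   // already define one with the same name (e.g. supplied via a pod template).
>   def volumeFor(pod: Pod, name: String): Option[Volume] = {
>     val existing = Option(pod.getSpec.getVolumes)
>       .map(_.asScala.toList).getOrElse(Nil)
>     if (existing.exists(_.getName == name)) {
>       None  // keep the user-supplied volume untouched
>     } else {
>       Some(new VolumeBuilder()
>         .withName(name)
>         .withNewEmptyDir()
>           .withMedium(if (useTmpfs) "Memory" else null)
>         .endEmptyDir()
>         .build())
>     }
>   }
> }
> {code}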



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

