GitHub user rvesse opened a pull request:

    https://github.com/apache/spark/pull/22256

    [SPARK-25262][K8S][WIP] Better support configurability of Spark scratch 
space when using Kubernetes

    ## What changes were proposed in this pull request?
    
    This change improves how Spark on Kubernetes creates the local directories
    used for Spark scratch space, i.e. those configured via
    `SPARK_LOCAL_DIRS`/`spark.local.dir`.
    
    Currently Spark on Kubernetes creates each defined local directory (or a
    single default directory if none are defined) as a Kubernetes `emptyDir`
    volume mounted into the containers.  The problem is that `emptyDir` volumes
    are backed by the node's storage, so in some compute environments, e.g.
    diskless nodes, any "local" storage is actually provided by a remote file
    system, which can harm performance when jobs use it heavily.
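    For context, here is a minimal sketch (not the actual Spark source) of how
    such an `emptyDir`-backed scratch volume and its mount can be constructed
    with the fabric8 Kubernetes client model that the K8s backend uses; the
    helper name is illustrative only:
    
    ```scala
    import io.fabric8.kubernetes.api.model.{Volume, VolumeBuilder, VolumeMount, VolumeMountBuilder}
    
    // Illustrative helper: one emptyDir volume per configured local directory,
    // plus the mount that places it at that directory's path in the container.
    def localDirVolume(index: Int, path: String): (Volume, VolumeMount) = {
      val volume = new VolumeBuilder()
        .withName(s"spark-local-dirs-$index") // naming convention described below
        .withNewEmptyDir()                    // backed by node storage by default
        .endEmptyDir()
        .build()
      val mount = new VolumeMountBuilder()
        .withName(volume.getName)
        .withMountPath(path)
        .build()
      (volume, mount)
    }
    ```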
    
    Kubernetes provides the option to have `emptyDir` volumes backed by `tmpfs`,
    i.e. RAM on the nodes, so this change introduces a boolean option,
    `spark.kubernetes.local.dirs.tmpfs`, which when set to `true` causes the
    generated `emptyDir` volumes to be memory-backed.
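    A sketch of how that toggle might be applied when building the volume; the
    `scratchVolume` helper is hypothetical, but `spark.kubernetes.local.dirs.tmpfs`
    is the option proposed here and `"Memory"` is the `emptyDir` medium Kubernetes
    uses for `tmpfs`:
    
    ```scala
    import io.fabric8.kubernetes.api.model.{Volume, VolumeBuilder}
    import org.apache.spark.SparkConf
    
    // Hypothetical helper: when spark.kubernetes.local.dirs.tmpfs is true, set
    // the emptyDir medium to "Memory" so Kubernetes backs the volume with tmpfs.
    def scratchVolume(conf: SparkConf, index: Int): Volume = {
      val useTmpfs = conf.getBoolean("spark.kubernetes.local.dirs.tmpfs", defaultValue = false)
      new VolumeBuilder()
        .withName(s"spark-local-dirs-$index")
        .withNewEmptyDir()
          .withMedium(if (useTmpfs) "Memory" else "") // "" = node's default storage
        .endEmptyDir()
        .build()
    }
    ```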
    
    A second, related problem is that because Spark on Kubernetes always
    generates `emptyDir` volumes, users have no way to use alternative volume
    types that may be available in their cluster.
    
    No new options specific to this problem are introduced.  Instead the code is
    modified to detect when the pod spec already defines an appropriately named
    volume and to avoid creating `emptyDir` volumes in that case, as sketched
    below.  This relies on the existing code's convention that volumes for
    scratch space are named `spark-local-dirs-N`, numbered from 1 to N based on
    the number of entries in the `SPARK_LOCAL_DIRS`/`spark.local.dir` setting.
    This is done in anticipation of the pod template feature from SPARK-24434
    (PR #22146) being merged, since that will allow users to define custom
    volumes more easily.
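    A minimal sketch of that detection, again using the fabric8 pod model; the
    helper is hypothetical and elides the corresponding mount handling:
    
    ```scala
    import scala.collection.JavaConverters._
    import io.fabric8.kubernetes.api.model.{Pod, VolumeBuilder}
    
    // Hypothetical helper: only create an emptyDir volume when the pod spec
    // does not already define one with the expected spark-local-dirs-N name.
    def ensureScratchVolume(pod: Pod, index: Int): Pod = {
      val name = s"spark-local-dirs-$index"
      val alreadyDefined = pod.getSpec.getVolumes.asScala.exists(_.getName == name)
      if (!alreadyDefined) {
        // No user-supplied volume of this name, so fall back to an emptyDir.
        val volume = new VolumeBuilder()
          .withName(name)
          .withNewEmptyDir()
          .endEmptyDir()
          .build()
        pod.getSpec.getVolumes.add(volume)
      }
      // A pre-existing volume of any type (hostPath, PVC, ...) is left untouched.
      pod
    }
    ```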
    
    Tasks:
    
    - [x] Support using `tmpfs` volumes
    - [x] Support using pre-existing volumes
    - [ ] Unit tests
    
    ## How was this patch tested?
    
    Unit tests were added to the relevant feature step to exercise the new
    configuration option and to check that pre-existing volumes are reused.
    Further unit tests are planned to cover some remaining corner cases.
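    For illustration, a ScalaTest-style check of the hypothetical `scratchVolume`
    helper sketched above (not one of this PR's actual tests) might look like:
    
    ```scala
    import org.scalatest.funsuite.AnyFunSuite
    import org.apache.spark.SparkConf
    
    // Illustrative test: enabling the tmpfs option should switch the emptyDir
    // medium to "Memory".
    class ScratchVolumeSuite extends AnyFunSuite {
      test("tmpfs option produces a memory-backed emptyDir") {
        val conf = new SparkConf(loadDefaults = false)
          .set("spark.kubernetes.local.dirs.tmpfs", "true")
        val volume = scratchVolume(conf, 1)
        assert(volume.getEmptyDir.getMedium === "Memory")
      }
    }
    ```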

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/rvesse/spark SPARK-25262

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22256.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #22256
    