Repository: spark Updated Branches: refs/heads/master c3f285c93 -> dac099d08
[SPARK-24090][K8S] Update running-on-kubernetes.md ## What changes were proposed in this pull request? Updated documentation for Spark on Kubernetes for the upcoming 2.4.0. Please review http://spark.apache.org/contributing.html before opening a pull request. mccheah erikerlandson Closes #22224 from liyinan926/master. Authored-by: Yinan Li <y...@google.com> Signed-off-by: Sean Owen <sean.o...@databricks.com> Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/dac099d0 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/dac099d0 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/dac099d0 Branch: refs/heads/master Commit: dac099d08251e73b9a658e506ed6802b294ac051 Parents: c3f285c Author: Yinan Li <y...@google.com> Authored: Mon Aug 27 15:55:34 2018 -0500 Committer: Sean Owen <sean.o...@databricks.com> Committed: Mon Aug 27 15:55:34 2018 -0500 ---------------------------------------------------------------------- docs/running-on-kubernetes.md | 40 ++++++++++++++++++++++++++++++-------- 1 file changed, 32 insertions(+), 8 deletions(-) ---------------------------------------------------------------------- http://git-wip-us.apache.org/repos/asf/spark/blob/dac099d0/docs/running-on-kubernetes.md ---------------------------------------------------------------------- diff --git a/docs/running-on-kubernetes.md b/docs/running-on-kubernetes.md index 8f84ca0..c83dad6 100644 --- a/docs/running-on-kubernetes.md +++ b/docs/running-on-kubernetes.md @@ -185,6 +185,36 @@ To use a secret through an environment variable use the following options to the --conf spark.kubernetes.executor.secretKeyRef.ENV_NAME=name:key ``` +## Using Kubernetes Volumes + +Starting with Spark 2.4.0, users can mount the following types of Kubernetes [volumes](https://kubernetes.io/docs/concepts/storage/volumes/) into the driver and executor pods: +* [hostPath](https://kubernetes.io/docs/concepts/storage/volumes/#hostpath): mounts a file or directory from the host nodeâs filesystem into a pod. +* [emptyDir](https://kubernetes.io/docs/concepts/storage/volumes/#emptydir): an initially empty volume created when a pod is assigned to a node. +* [persistentVolumeClaim](https://kubernetes.io/docs/concepts/storage/volumes/#persistentvolumeclaim): used to mount a `PersistentVolume` into a pod. + +To mount a volume of any of the types above into the driver pod, use the following configuration property: + +``` +--conf spark.kubernetes.driver.volumes.[VolumeType].[VolumeName].mount.path=<mount path> +--conf spark.kubernetes.driver.volumes.[VolumeType].[VolumeName].mount.readOnly=<true|false> +``` + +Specifically, `VolumeType` can be one of the following values: `hostPath`, `emptyDir`, and `persistentVolumeClaim`. `VolumeName` is the name you want to use for the volume under the `volumes` field in the pod specification. + +Each supported type of volumes may have some specific configuration options, which can be specified using configuration properties of the following form: + +``` +spark.kubernetes.driver.volumes.[VolumeType].[VolumeName].options.[OptionName]=<value> +``` + +For example, the claim name of a `persistentVolumeClaim` with volume name `checkpointpvc` can be specified using the following property: + +``` +spark.kubernetes.driver.volumes.persistentVolumeClaim.checkpointpvc.options.claimName=check-point-pvc-claim +``` + +The configuration properties for mounting volumes into the executor pods use prefix `spark.kubernetes.executor.` instead of `spark.kubernetes.driver.`. For a complete list of available options for each supported type of volumes, please refer to the [Spark Properties](#spark-properties) section below. + ## Introspection and Debugging These are the different ways in which you can investigate a running/completed Spark application, monitor progress, and @@ -299,21 +329,15 @@ RBAC authorization and how to configure Kubernetes service accounts for pods, pl ## Future Work -There are several Spark on Kubernetes features that are currently being incubated in a fork - -[apache-spark-on-k8s/spark](https://github.com/apache-spark-on-k8s/spark), which are expected to eventually make it into -future versions of the spark-kubernetes integration. +There are several Spark on Kubernetes features that are currently being worked on or planned to be worked on. Those features are expected to eventually make it into future versions of the spark-kubernetes integration. Some of these include: -* R -* Dynamic Executor Scaling +* Dynamic Resource Allocation and External Shuffle Service * Local File Dependency Management * Spark Application Management * Job Queues and Resource Management -You can refer to the [documentation](https://apache-spark-on-k8s.github.io/userdocs/) if you want to try these features -and provide feedback to the development team. - # Configuration See the [configuration page](configuration.html) for information on Spark configurations. The following configurations are --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org