Github user jiangxb1987 commented on a diff in the pull request:
https://github.com/apache/spark/pull/19946#discussion_r157120443
--- Diff: docs/running-on-kubernetes.md ---
@@ -0,0 +1,502 @@
+---
+layout: global
+title: Running Spark on Kubernetes
+---
+* This will become a table of contents (this text will be scraped).
+{:toc}
+
+Spark can run on clusters managed by [Kubernetes](https://kubernetes.io). This feature makes use of the native
+Kubernetes scheduler that has been added to Spark.
+
+# Prerequisites
+
+* A runnable distribution of Spark 2.3 or above.
+* A running Kubernetes cluster at version >= 1.6 with access configured to it using
+[kubectl](https://kubernetes.io/docs/user-guide/prereqs/). If you do not already have a working Kubernetes cluster,
+you may set up a test cluster on your local machine using
+[minikube](https://kubernetes.io/docs/getting-started-guides/minikube/).
+  * We recommend using the latest release of minikube with the DNS addon enabled.
+* You must have appropriate permissions to list, create, edit and delete
+[pods](https://kubernetes.io/docs/user-guide/pods/) in your cluster. You can verify that you can list these resources
+by running `kubectl auth can-i <list|create|edit|delete> pods`.
+  * The service account credentials used by the driver pods must be allowed to create pods, services and configmaps.
+* You must have [Kubernetes DNS](https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/)
+configured in your cluster.
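+
+For example, the permission check above can be run once per verb; each command prints `yes` or `no`. This sketch
+assumes `kubectl` is already configured against your cluster:
+
+```shell
+# Check each verb the submission client and driver will need on the pods resource.
+kubectl auth can-i list pods
+kubectl auth can-i create pods
+kubectl auth can-i edit pods
+kubectl auth can-i delete pods
+```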
+
+# How it works
+
+<p style="text-align: center;">
+  <img src="img/k8s-cluster-mode.png" title="Spark cluster components" alt="Spark cluster components" />
+</p>
+
+<code>spark-submit</code> can be directly used to submit a Spark application to a Kubernetes cluster.
+The submission mechanism works as follows:
+
+* Spark creates a Spark driver running within a
+[Kubernetes pod](https://kubernetes.io/docs/concepts/workloads/pods/pod/).
+* The driver creates executors, which also run within Kubernetes pods, connects to them, and executes application code.
+* When the application completes, the executor pods terminate and are cleaned up, but the driver pod persists
+logs and remains in "completed" state in the Kubernetes API until it's eventually garbage collected or manually
+cleaned up.
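+
+As a concrete sketch of the submission flow above (the API server address and jar path below are placeholders,
+and the exact flags beyond standard `spark-submit` options are illustrative, not definitive):
+
+```shell
+# Submit the bundled SparkPi example in cluster mode against a Kubernetes API server.
+bin/spark-submit \
+  --master k8s://https://<k8s-apiserver-host>:<k8s-apiserver-port> \
+  --deploy-mode cluster \
+  --name spark-pi \
+  --class org.apache.spark.examples.SparkPi \
+  --conf spark.executor.instances=2 \
+  local:///path/to/examples.jar
+```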
+
+Note that in the completed state, the driver pod does *not* use any computational or memory resources.
+
+The driver and executor pod scheduling is handled by Kubernetes. In a future release, it will be possible to
+influence Kubernetes scheduling decisions for driver and executor pods using advanced primitives like
+[node selectors](https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#nodeselector)
+and [node/pod affinities](https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity).
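+
+As an illustration of the node-selector primitive mentioned above, this hypothetical pod spec fragment would
+constrain scheduling to nodes carrying the label `disktype: ssd` (the label name and value are placeholders):
+
+```yaml
+# Schedule this pod only on nodes labeled disktype=ssd.
+spec:
+  nodeSelector:
+    disktype: ssd
+```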
+
+# Submitting Applications to Kubernetes
+
+## Docker Images
+
+Kubernetes requires users to supply images that can be deployed into containers within pods. The images are built
+to be run in a container runtime environment that Kubernetes supports. Docker is a container runtime environment that is
--- End diff --
Just my two cents: if we can foresee adding support for other container
runtimes, then we should consider using `container.image` instead of
`docker.image` from the beginning, because renaming a config is also considered
a kind of behavior change, which commonly takes more effort to process. If that
is not the case (we are content to support only Docker containers), then it
would be fine to just keep the `docker.image` phrasing.
---