GitHub user vanzin commented on a diff in the pull request:
https://github.com/apache/spark/pull/19946#discussion_r156497503
--- Diff: docs/running-on-kubernetes.md ---
@@ -0,0 +1,498 @@
+---
+layout: global
+title: Running Spark on Kubernetes
+---
+* This will become a table of contents (this text will be scraped).
+{:toc}
+
+Spark can run on clusters managed by [Kubernetes](https://kubernetes.io).
This feature makes use of the new experimental native
+Kubernetes scheduler that has been added to Spark.
+
+# Prerequisites
+
+* A runnable distribution of Spark 2.3 or above.
+* A running Kubernetes cluster at version >= 1.6 with access configured to
it using
+[kubectl](https://kubernetes.io/docs/user-guide/prereqs/). If you do not
already have a working Kubernetes cluster,
+you may set up a test cluster on your local machine using
+[minikube](https://kubernetes.io/docs/getting-started-guides/minikube/).
+ * We recommend using the latest release of minikube with the DNS addon
enabled.
+* You must have appropriate permissions to list, create, edit and delete
+[pods](https://kubernetes.io/docs/user-guide/pods/) in your cluster. You
can verify that you can list these resources
+by running `kubectl auth can-i <list|create|edit|delete> pods`.
+ * The service account credentials used by the driver pods must be
allowed to create pods, services and configmaps.
+* You must have [Kubernetes
DNS](https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/)
configured in your cluster.
+
+# How it works
+
+<p style="text-align: center;">
+ <img src="img/k8s-cluster-mode.png" title="Spark cluster components"
alt="Spark cluster components" />
+</p>
+
+spark-submit can be directly used to submit a Spark application to a
Kubernetes cluster. The submission mechanism works as follows:
+
+* Spark creates a Spark driver running within a [Kubernetes
pod](https://kubernetes.io/docs/concepts/workloads/pods/pod/).
+* The driver creates executors, which also run within Kubernetes
pods, connects to them, and executes application code.
+* When the application completes, the executor pods terminate and are
cleaned up, but the driver pod persists
+logs and remains in "completed" state in the Kubernetes API till it's
eventually garbage collected or manually cleaned up.
--- End diff --
s/till/until