Github user mccheah commented on a diff in the pull request:
https://github.com/apache/spark/pull/21748#discussion_r205178092
--- Diff: docs/running-on-kubernetes.md ---
@@ -117,6 +117,45 @@ If the local proxy is running at localhost:8001, `--master k8s://http://127.0.0.
 spark-submit. Finally, notice that in the above example we specify a jar with a specific URI with a scheme of `local://`. This URI is the location of the example jar that is already in the Docker image.
+## Client Mode
+
+Starting with Spark 2.4.0, it is possible to run Spark applications on Kubernetes in client mode. When your application
+runs in client mode, the driver can run inside a pod or on a physical host. When running an application in client mode,
+it is recommended to account for the following factors:
+
+### Client Mode Networking
+
+Spark executors must be able to connect to the Spark driver over a hostname and a port that is routable from the Spark
+executors. The specific network configuration that will be required for Spark to work in client mode will vary per
+setup. If you run your driver inside a Kubernetes pod, you can use a
+[headless service](https://kubernetes.io/docs/concepts/services-networking/service/#headless-services) to allow your
+driver pod to be routable from the executors by a stable hostname. When deploying your headless service, ensure that
+the service's label selector will only match the driver pod and no other pods; it is recommended to assign your driver
+pod a sufficiently unique label and to use that label in the label selector of the headless service. Specify the driver's
+hostname via `spark.driver.host` and your Spark driver's port via `spark.driver.port`.
+
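For illustration, a minimal sketch of such a headless service plus the matching driver settings. The service name, the `spark-driver-selector: my-app` label, the namespace, and the port below are made-up placeholders, not values taken from the docs above:

```bash
# All names here are placeholders. The driver pod is assumed to already carry
# the label spark-driver-selector=my-app and to live in the "default" namespace.
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: spark-driver-headless
spec:
  clusterIP: None                    # headless: DNS resolves straight to the driver pod IP
  selector:
    spark-driver-selector: my-app    # must match only the driver pod
  ports:
  - name: driver-rpc
    port: 7078
EOF

# Run spark-submit from inside the driver pod, pointing executors at the
# service's stable DNS name and at the port the driver will bind to.
bin/spark-submit \
  --master k8s://https://<k8s-apiserver-host>:<k8s-apiserver-port> \
  --deploy-mode client \
  --conf spark.driver.host=spark-driver-headless.default.svc.cluster.local \
  --conf spark.driver.port=7078 \
  ...
```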
+### Client Mode Executor Pod Garbage Collection
+
+If you run your Spark driver in a pod, it is highly recommended to set `spark.driver.pod.name` to the name of that pod.
+When this property is set, the Spark scheduler will deploy the executor pods with an
+[OwnerReference](https://kubernetes.io/docs/concepts/workloads/controllers/garbage-collection/), which in turn will
+ensure that once the driver pod is deleted from the cluster, all of the application's executor pods will also be deleted.
+The driver will look for a pod with the given name in the namespace specified by `spark.kubernetes.namespace`, and
+an OwnerReference pointing to that pod will be added to each executor pod's OwnerReferences list. Be careful to avoid
+setting the OwnerReference to a pod that is not actually that driver pod, or else the executors may be terminated
+prematurely when the wrong pod is deleted.
+
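A sketch of one way to wire this up, assuming the driver pod exposes its own name through the Kubernetes downward API under a hypothetical `MY_POD_NAME` environment variable (the namespace below is also an assumption):

```bash
# Driver pod spec fragment (downward API) that would populate MY_POD_NAME:
#   env:
#   - name: MY_POD_NAME
#     valueFrom:
#       fieldRef:
#         fieldPath: metadata.name

# Pass the pod's own name so executor pods get an OwnerReference to it.
bin/spark-submit \
  --master k8s://https://<k8s-apiserver-host>:<k8s-apiserver-port> \
  --deploy-mode client \
  --conf spark.kubernetes.namespace=default \
  --conf spark.driver.pod.name="$MY_POD_NAME" \
  ...
```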
+If your application is not running inside a pod, or if `spark.driver.pod.name` is not set when your application is
+actually running in a pod, keep in mind that the executor pods may not be properly deleted from the cluster when the
+application exits. The Spark scheduler attempts to delete these pods, but if the network request to the API server fails
+for any reason, these pods will remain in the cluster. The executor processes should exit when they cannot reach the
+driver, so the executor pods should not consume compute resources (cpu and memory) in the cluster after your application
--- End diff --
Unclear, it triggers in the `onDisconnected` event, so I think there's a persistent socket connection that's dropped, which causes the exit. So it should be more or less instantaneous.
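
If executor pods are nevertheless left behind when the delete request to the API server fails, they can be removed by label. The labels, namespace, and app id below are assumptions about what the scheduler backend applies to executor pods, so worth checking the actual values with `kubectl get pods --show-labels` first:

```bash
# Assumed labels on executor pods: spark-role=executor and
# spark-app-selector=<application id>; namespace and app id are placeholders.
kubectl delete pods -n default \
  -l spark-role=executor,spark-app-selector=spark-application-1530000000000
```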
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]