Github user mccheah commented on a diff in the pull request:
https://github.com/apache/spark/pull/22146#discussion_r229024855
--- Diff: docs/running-on-kubernetes.md ---
@@ -799,4 +815,168 @@ specific to Spark on Kubernetes.
This sets the major Python version of the docker image used to run the
driver and executor containers. Can either be 2 or 3.
</td>
</tr>
+<tr>
+ <td><code>spark.kubernetes.driver.podTemplateFile</code></td>
+ <td>(none)</td>
+ <td>
+    Specify the local file that contains the driver [pod template](#pod-template). For example
+    <code>spark.kubernetes.driver.podTemplateFile=/path/to/driver-pod-template.yaml</code>
+ </td>
+</tr>
+<tr>
+ <td><code>spark.kubernetes.executor.podTemplateFile</code></td>
+ <td>(none)</td>
+ <td>
+    Specify the local file that contains the executor [pod template](#pod-template). For example
+    <code>spark.kubernetes.executor.podTemplateFile=/path/to/executor-pod-template.yaml</code>
+ </td>
+</tr>
+</table>
+
+#### Pod template properties
+
+See the table below for the full list of pod specification values that will be overwritten by Spark.
+
+### Pod Metadata
+
+<table class="table">
+<tr><th>Pod metadata key</th><th>Modified value</th><th>Description</th></tr>
+<tr>
+ <td>name</td>
+ <td>Value of <code>spark.kubernetes.driver.pod.name</code></td>
+ <td>
+    The driver pod name will be overwritten with either the configured or default value of
+    <code>spark.kubernetes.driver.pod.name</code>. The executor pod names will be unaffected.
+ </td>
+</tr>
+<tr>
+ <td>namespace</td>
+ <td>Value of <code>spark.kubernetes.namespace</code></td>
+ <td>
+    Spark makes strong assumptions about the driver and executor namespaces. Both driver and
+    executor namespaces will be replaced by either the configured or default Spark conf value.
+ </td>
+</tr>
+<tr>
+ <td>labels</td>
+  <td>Adds the labels from <code>spark.kubernetes.{driver,executor}.label.*</code></td>
+  <td>
+    Spark will add additional labels specified by the Spark configuration.
+  </td>
+</tr>
+<tr>
+ <td>annotations</td>
+  <td>Adds the annotations from <code>spark.kubernetes.{driver,executor}.annotation.*</code></td>
+  <td>
+    Spark will add additional annotations specified by the Spark configuration.
+  </td>
+</tr>
+</table>
+
+### Pod Spec
+
+<table class="table">
+<tr><th>Pod spec key</th><th>Modified value</th><th>Description</th></tr>
+<tr>
+ <td>imagePullSecrets</td>
+  <td>Adds image pull secrets from <code>spark.kubernetes.container.image.pullSecrets</code></td>
+  <td>
+    Additional pull secrets will be added from the Spark configuration to both driver and executor pods.
+ </td>
+</tr>
+<tr>
+ <td>nodeSelector</td>
+  <td>Adds node selectors from <code>spark.kubernetes.node.selector.*</code></td>
+  <td>
+    Additional node selectors will be added from the Spark configuration to both driver and executor pods.
+ </td>
+</tr>
+<tr>
+ <td>restartPolicy</td>
+  <td><code>"Never"</code></td>
+ <td>
+ Spark assumes that both drivers and executors never restart.
+ </td>
+</tr>
+<tr>
+ <td>serviceAccount</td>
+  <td>Value of <code>spark.kubernetes.authenticate.driver.serviceAccountName</code></td>
+  <td>
+    Spark will override <code>serviceAccount</code> with the value of the Spark configuration, but only for
+    driver pods, and only if the Spark configuration is specified. Executor pods will remain unaffected.
+ </td>
+</tr>
+<tr>
+ <td>serviceAccountName</td>
+  <td>Value of <code>spark.kubernetes.authenticate.driver.serviceAccountName</code></td>
+  <td>
+    Spark will override <code>serviceAccountName</code> with the value of the Spark configuration, but only for
+    driver pods, and only if the Spark configuration is specified. Executor pods will remain unaffected.
+ </td>
+</tr>
+<tr>
+ <td>volumes</td>
+  <td>Adds volumes from <code>spark.kubernetes.{driver,executor}.volumes.[VolumeType].[VolumeName].mount.path</code></td>
+ <td>
+    Spark will add volumes as specified by the Spark conf, as well as additional volumes necessary for passing
--- End diff --
One way we can avoid conflicting volumes entirely is by randomizing the
name of the volumes added by features, e.g. appending some UUID or at least
some large integer. I think keeping running documentation on all volumes we add
from features is too much overhead. If we run into these conflicts often then
we can do this, but I think it's fine not to block merging on that.
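
As an illustrative sketch of the randomized-name idea (the helper name and the `spark-local-dir` base name here are hypothetical, not Spark's actual implementation):

```python
import uuid

def unique_volume_name(base: str) -> str:
    # Append a short random suffix so a feature-step volume name
    # cannot collide with a volume declared in the user's pod template.
    return f"{base}-{uuid.uuid4().hex[:8]}"

# A feature step that needs a scratch volume would request, e.g.:
name = unique_volume_name("spark-local-dir")
```

The trade-off is that volume names become non-deterministic, which makes them harder to reference from documentation or from other entries in the same template.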
Either way, though, I again think the validation piece can be done separately from this PR. I wouldn't consider that documentation as blocking this from merging. Thoughts?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]