Github user mccheah commented on the issue:
https://github.com/apache/spark/pull/21748
> Can you point to where in the fork the submission client creates the
headless service? (just to help me understand the internals)
> Btw, if we stick to this manual approach, the need for the manual headless
service should be documented.
@echarles we create the headless service as part of spark-submit, and only
in cluster mode:
https://github.com/apache/spark/blob/master/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/DriverServiceFeatureStep.scala#L69
Note that the feature steps and the `KubernetesClientApplication` entry point
are only invoked in cluster mode. In client mode, we enter the user's main
class directly, or the user is in a process that just created a Spark context
from scratch with the right master URL (i.e.
`new SparkContext("k8s://my-master:8443")`). If you wanted to create the
headless service in client mode, you'd have to do it when instantiating the
`KubernetesClusterSchedulerBackend`, probably before creating the
`ExecutorPodsAllocator`, so that `spark.driver.host` and `spark.driver.port`
are set properly when the created executors are told where to find the driver
via the `--driver-url` argument. A sketch of this setup follows.
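
For illustration, here's a minimal client-mode sketch. It assumes a headless
service named `spark-driver-svc` already exists in the `default` namespace and
selects the driver pod; the service name, ports, image, and master URL are
placeholders for this example, not anything the PR prescribes:

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Minimal client-mode setup against a k8s master. Assumes the user
// (not spark-submit) has already created the headless service that
// resolves to the driver pod.
val conf = new SparkConf()
  .setMaster("k8s://https://my-master:8443")
  .setAppName("client-mode-example")
  // Executors dial back to spark.driver.host:spark.driver.port via
  // --driver-url, so this must resolve to the driver pod.
  .set("spark.driver.host", "spark-driver-svc.default.svc.cluster.local")
  .set("spark.driver.port", "7078")
  .set("spark.kubernetes.container.image", "my-repo/spark:latest")

val sc = new SparkContext(conf)
```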
I've deferred implementing this code path. The documentation only suggests
using a headless service; it is not stated as a hard requirement:
https://github.com/apache/spark/pull/21748/files#diff-b5527f236b253e0d9f5db5164bdb43e9R131.
I didn't make this a hard requirement because I can imagine some users not
wanting to use a headless service specifically; perhaps they want to use a
full Service object and share it, exposing additional ports for other
endpoints their pod serves.
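
As a hypothetical sketch of those two shapes, using the fabric8 client the
k8s backend already depends on (the labels, names, and ports below are
illustrative placeholders, not values from this PR):

```scala
import io.fabric8.kubernetes.api.model.{Service, ServiceBuilder}
import scala.collection.JavaConverters._

// Labels assumed to be on the driver pod, for the service selector.
val driverLabels = Map("spark-app-selector" -> "my-app-id")

// Headless variant: clusterIP "None", so DNS resolves straight to the
// driver pod. This mirrors what DriverServiceFeatureStep builds in
// cluster mode.
val headless: Service = new ServiceBuilder()
  .withNewMetadata()
    .withName("spark-driver-svc")
    .endMetadata()
  .withNewSpec()
    .withClusterIP("None")
    .withSelector(driverLabels.asJava)
    .addNewPort()
      .withName("driver-rpc-port")
      .withPort(7078)
      .withNewTargetPort(7078)
      .endPort()
    .endSpec()
  .build()

// Full Service variant: a regular ClusterIP service that also exposes
// other endpoints the pod serves (here, the Spark UI as an example).
val full: Service = new ServiceBuilder()
  .withNewMetadata()
    .withName("spark-driver-svc")
    .endMetadata()
  .withNewSpec()
    .withSelector(driverLabels.asJava)
    .addNewPort()
      .withName("driver-rpc-port")
      .withPort(7078)
      .withNewTargetPort(7078)
      .endPort()
    .addNewPort()
      .withName("spark-ui")
      .withPort(4040)
      .withNewTargetPort(4040)
      .endPort()
    .endSpec()
  .build()
```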