Github user mccheah commented on the issue:
https://github.com/apache/spark/pull/21748
> About selecting the pod with labels, another approach I have taken is
> simply using the name of the driver pod, a bit like I have done with the
> following deployment (so no need to ensure labels - the ports are the ports
> assigned by spark that the code can retrieve).
I don't think you can back a service with a selector that's a pod's name,
but someone with more knowledge of the Service API might be able to correct me
here. I was under the impression one had to use labels. In your example, the
service would match any pod with the label key of `run` being equal to
`spark-pod`, which isn't guaranteed to map to a single unique pod. In
spark-submit we set `spark-app-id` to a unique identifier.
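As an illustration of label-based selection (the Service name, label key, and port here are hypothetical, not the exact values spark-submit generates), a headless Service that targets the driver pod through a unique label might look like:

```yaml
# Sketch of a headless Service backed by a label selector.
# The name, label key/value, and port are illustrative only.
apiVersion: v1
kind: Service
metadata:
  name: spark-driver-svc
spec:
  clusterIP: None          # headless: DNS resolves directly to the pod IP
  selector:
    spark-app-id: spark-app-1234   # must uniquely identify the driver pod
  ports:
    - name: driver-rpc
      port: 7077
      targetPort: 7077
```

Note that the Service spec selects pods only by labels; there is no selector field keyed on a pod's name, which is why a unique label value matters.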
> If I compare with yarn-client with all nodes on the same LAN
But if you run a YARN application with the driver not on the same
network, then the user has to set up their own connectivity. That kind of
networking setup perhaps comes up more often in Kubernetes, but that alone
isn't enough reason to introduce this complexity.
Another situation where we don't want the driver creating the headless
service is one where the driver shouldn't have permission to create
service objects but does have permission to create pod objects. Adding a flag
that lets the driver create the headless service would implicitly change the
permissions the application requires. That is more work to document and more
for the application writer to consider.
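To make the permissions point concrete, a minimal RBAC Role of the kind implied here might grant pod access only (the namespace, Role name, and verb list are hypothetical); a driver-created headless service would force `services` into this rule set:

```yaml
# Sketch of a Role allowing the driver to manage pods but not services.
# Names and verbs are illustrative assumptions, not Spark's actual RBAC.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: spark-apps
  name: spark-driver-role
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["create", "get", "list", "watch", "delete"]
```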