Github user mccheah commented on the issue:

    https://github.com/apache/spark/pull/21748
  
    We discussed this offline. After some experimentation, we concluded that 
it's not actually straightforward to set up the headless service in the 
Kubernetes scheduler code in client mode, which is where this code would have 
to live. The problem is that before the scheduler starts up, the driver needs 
to bind to the host given by `spark.driver.host`. If that hostname is tied to 
a headless service that does not yet exist, the bind will fail.
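    
    To make the ordering problem concrete, here is a rough sketch, using the 
fabric8 Kubernetes client Spark already depends on, of what the scheduler 
would have to create before the driver could bind. The service name, selector 
label, namespace, and port below are hypothetical, not what a final 
implementation would necessarily use:
    
    ```scala
    import io.fabric8.kubernetes.api.model.ServiceBuilder
    import io.fabric8.kubernetes.client.DefaultKubernetesClient
    
    import scala.collection.JavaConverters._
    
    // Sketch only: create a headless service so that a DNS name like
    // spark-driver-svc.default.svc can resolve to the driver pod.
    val client = new DefaultKubernetesClient()
    val driverService = new ServiceBuilder()
      .withNewMetadata()
        .withName("spark-driver-svc") // must match the name in spark.driver.host
      .endMetadata()
      .withNewSpec()
        .withClusterIP("None") // headless: no virtual IP, DNS points at the pod
        .withSelector(Map("spark-app-selector" -> "my-app-id").asJava)
        .addNewPort()
          .withName("driver-rpc-port")
          .withPort(7078)
        .endPort()
      .endSpec()
      .build()
    client.services().inNamespace("default").create(driverService)
    // In client mode the driver binds to spark.driver.host before the
    // scheduler backend (where this code would live) ever runs, so the
    // bind fails before the service can be created.
    client.close()
    ```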
    
    There are a few ways to work around this, but all of them seem too risky 
to include in a first pass at this patch.
    
    Additionally, it's not clear that the scheduler code should be opinionated 
about configuring network connectivity for the driver. Client mode simply means 
the driver runs in a local process; Spark currently makes no assumptions about 
what that process is, or whether it runs in a Kubernetes pod at all. Contrast 
this with cluster mode, where the driver is known to be running in a pod 
submitted by `KubernetesClientApplication`, so `KubernetesClientApplication` 
can set up the headless service itself.
    
    Therefore we are not going to have the driver set up the headless service 
in this patch. If we decide later on that creating the headless service is the 
right thing to do, we can introduce that functionality in a separate patch.
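    
    Since the driver won't create the service itself, one plausible manual 
setup is for the user to create the headless service for the driver pod and 
point `spark.driver.host` at it before starting the application. A 
hypothetical sketch of the driver-side configuration (the service name, 
namespace, and port are made up):
    
    ```scala
    import org.apache.spark.sql.SparkSession
    
    // Assumes a headless service named "spark-driver-svc" already exists in
    // namespace "default" and selects the pod this driver runs in.
    val spark = SparkSession.builder()
      .master("k8s://https://kubernetes.default.svc")
      .appName("client-mode-example")
      .config("spark.driver.host", "spark-driver-svc.default.svc.cluster.local")
      .config("spark.driver.port", "7078")
      .getOrCreate()
    ```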
    
    I'm going to update the docs and hope to merge this by the end of the day 
on Monday, July 23. Let us know if there are any additional concerns. Thanks 
@liyinan926 @echarles.

