dongjoon-hyun commented on a change in pull request #32752:
URL: https://github.com/apache/spark/pull/32752#discussion_r644431423



##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsAllocator.scala
##########
@@ -99,6 +102,16 @@ private[spark] class ExecutorPodsAllocator(
   @volatile private var deletedExecutorIds = Set.empty[Long]
 
   def start(applicationId: String, schedulerBackend: KubernetesClusterSchedulerBackend): Unit = {
+    // wait until the driver pod is ready to ensure executors can connect to driver svc

Review comment:
       Can we be more specific? The underlying problem is that the K8s headless
service resource for this driver pod may not exist yet. Because K8s works
asynchronously, this can happen even when the driver pod is ready with all its
sidecars but the K8s service is not yet ready to route to this driver pod.
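
To illustrate the distinction, here is a minimal sketch, not the code in this PR. It treats "driver pod is ready" and "headless service exists" as two independent conditions that are polled together. `ReadinessSketch`, `waitUntil`, `driverReachable`, `podReady`, and `serviceExists` are all hypothetical names; a real check would query the K8s API instead of the plain functions used here.

```scala
import scala.annotation.tailrec

object ReadinessSketch {
  // Poll `condition` until it returns true or `timeoutMs` has elapsed.
  // Returns true if the condition was met in time, false otherwise.
  @tailrec
  def waitUntil(
      condition: () => Boolean,
      timeoutMs: Long,
      intervalMs: Long = 100L,
      startNs: Long = System.nanoTime()): Boolean = {
    if (condition()) {
      true
    } else if ((System.nanoTime() - startNs) / 1000000L >= timeoutMs) {
      false
    } else {
      Thread.sleep(intervalMs)
      waitUntil(condition, timeoutMs, intervalMs, startNs)
    }
  }

  // Hypothetical helper: the driver counts as reachable only when BOTH the
  // pod is ready AND the service exists. Pod readiness alone is not enough,
  // because the two resources are created asynchronously.
  def driverReachable(
      podReady: () => Boolean,
      serviceExists: () => Boolean,
      timeoutMs: Long): Boolean = {
    waitUntil(() => podReady() && serviceExists(), timeoutMs)
  }
}
```

With stand-in checks, `ReadinessSketch.driverReachable(() => true, () => false, 50L)` returns `false` after the timeout, which is the case described above: the pod is ready but the service is not.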






