Github user mccheah commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21067#discussion_r181475185
  
    --- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodFactory.scala ---
    @@ -92,6 +93,12 @@ private[spark] class ExecutorPodFactory(
       }
       private val executorLimitCores = sparkConf.get(KUBERNETES_EXECUTOR_LIMIT_CORES)
     
    +  // If the driver pod is killed, the new driver pod will try to
    +  // create new executors with the same names, but it will fail
    +  // and hang indefinitely because the terminating executors block
    +  // the creation of the new ones. To avoid that, salt the names.
    +  private val executorNameSalt = Random.alphanumeric.take(4).mkString("").toLowerCase
    --- End diff --
    
    This should guarantee uniqueness by using a `UUID`; a four-character random
    salt can still collide. Use labels to make it easy to group all executors
    tied to this specific job.
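
    A minimal sketch of that suggestion, not the code in this PR. The object
    `ExecutorNaming`, both method names, and the label keys are hypothetical
    names chosen for illustration; only `java.util.UUID` is a real API here. A
    random UUID makes a name collision with a still-terminating pod extremely
    unlikely, and the labels carry the job identity instead of the pod name.

```scala
import java.util.UUID

// Hypothetical helper for illustration; not part of ExecutorPodFactory.
object ExecutorNaming {
  // Label keys are assumptions made for this sketch.
  val AppIdLabel = "spark-app-selector"
  val ExecutorIdLabel = "spark-exec-id"

  // Append a random UUID (dashes removed, 32 lowercase hex characters) so a
  // restarted driver does not reuse the name of a pod that is still terminating.
  def uniqueExecutorPodName(prefix: String, executorId: String): String = {
    val suffix = UUID.randomUUID().toString.replace("-", "")
    s"$prefix-exec-$executorId-$suffix"
  }

  // Labels identify which application and executor a pod belongs to, so the
  // pods of one job can be selected as a group regardless of their names.
  def executorLabels(applicationId: String, executorId: String): Map[String, String] =
    Map(AppIdLabel -> applicationId, ExecutorIdLabel -> executorId)
}
```

    With labels like these, all executors of one job could be listed with a
    label selector such as `kubectl get pods -l spark-app-selector=<app id>`,
    assuming the pods were created with that label.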

