pgandhi999 commented on a change in pull request #24072: [SPARK-27112] : Create a resource ordering between threads to resolve the deadlocks encountered …
URL: https://github.com/apache/spark/pull/24072#discussion_r265776925
##########
File path: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
##########
@@ -258,15 +258,20 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, val rpcEnv: Rp
     // Make fake resource offers on all executors
     private def makeOffers() {
       // Make sure no executor is killed while some task is launching on it
-      val taskDescs = CoarseGrainedSchedulerBackend.this.synchronized {
-        // Filter out executors under killing
-        val activeExecutors = executorDataMap.filterKeys(executorIsAlive)
-        val workOffers = activeExecutors.map {
-          case (id, executorData) =>
-            new WorkerOffer(id, executorData.executorHost, executorData.freeCores,
-              Some(executorData.executorAddress.hostPort))
-        }.toIndexedSeq
-        scheduler.resourceOffers(workOffers)
+      // SPARK-27112: We need to ensure that there is ordering of lock acquisition
+      // between TaskSchedulerImpl and CoarseGrainedSchedulerBackend objects in order to fix
+      // the deadlock issue exposed in SPARK-27112
+      val taskDescs = scheduler.synchronized {

Review comment:
@vanzin In the code, I came across the following comment and wonder whether it answers the `launchTasks` issue. I do not fully understand the intention of the comment, though.

```
// Accessing `executorDataMap` in `DriverEndpoint.receive/receiveAndReply` doesn't need any
// protection. But accessing `executorDataMap` out of `DriverEndpoint.receive/receiveAndReply`
// must be protected by `CoarseGrainedSchedulerBackend.this`. Besides, `executorDataMap` should
// only be modified in `DriverEndpoint.receive/receiveAndReply` with protection by
// `CoarseGrainedSchedulerBackend.this`.
private val executorDataMap = new HashMap[String, ExecutorData]
```
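To make the lock-ordering idea in the diff above concrete, here is a minimal sketch of it in isolation. It is not the PR's implementation: `Scheduler` and `Backend` are hypothetical stand-ins for `TaskSchedulerImpl` and `CoarseGrainedSchedulerBackend`, and `addExecutor` and `killExecutor` are illustrative names only. The assumption is that every path needing both monitors takes the scheduler lock first and the backend lock second, so two threads can never hold one lock each while waiting for the other.

```scala
import scala.collection.mutable

// Hypothetical stand-in for TaskSchedulerImpl; not the actual Spark class.
final class Scheduler {
  // Takes the scheduler monitor, as the real resourceOffers is assumed to do.
  def resourceOffers(offers: Seq[(String, Int)]): Seq[String] = this.synchronized {
    offers.collect { case (id, freeCores) if freeCores > 0 => s"task-on-$id" }
  }
}

// Hypothetical stand-in for CoarseGrainedSchedulerBackend; not the actual Spark class.
final class Backend(scheduler: Scheduler) {
  // Executor id -> free cores, guarded by the backend monitor.
  private val executorDataMap = mutable.HashMap[String, Int]()

  def addExecutor(id: String, freeCores: Int): Unit = this.synchronized {
    executorDataMap(id) = freeCores
  }

  // Lock order: scheduler first, then backend.
  def makeOffers(): Seq[String] = scheduler.synchronized {
    // Snapshot the map under the backend lock, then release it.
    val offers = this.synchronized { executorDataMap.toIndexedSeq }
    // JVM monitors are reentrant, so re-entering the scheduler lock here is safe.
    scheduler.resourceOffers(offers)
  }

  // Same order on the kill path: scheduler first, then backend.
  def killExecutor(id: String): Boolean = scheduler.synchronized {
    this.synchronized { executorDataMap.remove(id).isDefined }
  }
}
```

In this sketch, `makeOffers()` and `killExecutor()` can run on different threads without deadlocking because neither ever acquires the backend lock before the scheduler lock. The cost is that the scheduler lock is held for the whole duration of both calls.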