dhruve commented on issue #24035: [SPARK-27112] : Spark Scheduler encounters 
two independent Deadlocks …
URL: https://github.com/apache/spark/pull/24035#issuecomment-472135426
 
 
   I think if we fix the lock ordering for the involved threads, this will 
solve the issue.
   
   The current order in which locks are being acquired for individual threads 
is:
   
   TaskResultGetter Order:
   - Lock YarnClusterScheduler
   - Lock CoarseGrainedSchedulerBackend
   
   DispatcherEventLoop Order:
   - Lock CoarseGrainedSchedulerBackend
   - Lock YarnClusterScheduler
   
   SparkDynamicExecutorAllocation Order:
   - Lock ExecutorAllocationManager
   - Lock CoarseGrainedSchedulerBackend
   - Lock TaskSchedulerImpl/YarnClusterScheduler 
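
One way to see why these three orders can deadlock is to treat each "acquire B while holding A" as an edge A -> B and look for a cycle. The following is a minimal sketch of that check, not Spark code: the class `LockOrderCheck` and the short lock names are hypothetical, and it assumes TaskSchedulerImpl/YarnClusterScheduler refer to the same lock, as the slash in the list above suggests.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration only; nothing here is part of Spark.
final class LockOrderCheck {

    // Edge A -> B means: some thread acquires lock B while already holding lock A.
    static Map<String, Set<String>> buildGraph(List<List<String>> orders) {
        Map<String, Set<String>> graph = new HashMap<>();
        for (List<String> order : orders) {
            for (int i = 0; i < order.size(); i++) {
                Set<String> later = graph.computeIfAbsent(order.get(i), k -> new HashSet<>());
                later.addAll(order.subList(i + 1, order.size()));
            }
        }
        return graph;
    }

    // A cycle in the graph means two threads can each hold a lock the other needs.
    static boolean hasCycle(Map<String, Set<String>> graph) {
        Set<String> finished = new HashSet<>();
        Set<String> onPath = new HashSet<>();
        for (String node : graph.keySet()) {
            if (visit(node, graph, finished, onPath)) {
                return true;
            }
        }
        return false;
    }

    // Depth-first search; reaching a node that is still on the current path closes a cycle.
    private static boolean visit(String node, Map<String, Set<String>> graph,
                                 Set<String> finished, Set<String> onPath) {
        if (onPath.contains(node)) {
            return true;
        }
        if (finished.contains(node)) {
            return false;
        }
        onPath.add(node);
        for (String next : graph.getOrDefault(node, Collections.<String>emptySet())) {
            if (visit(next, graph, finished, onPath)) {
                return true;
            }
        }
        onPath.remove(node);
        finished.add(node);
        return false;
    }

    public static void main(String[] args) {
        // The three acquisition orders listed above, with shortened lock names.
        List<List<String>> current = Arrays.asList(
            Arrays.asList("Scheduler", "Backend"),                       // TaskResultGetter
            Arrays.asList("Backend", "Scheduler"),                       // DispatcherEventLoop
            Arrays.asList("AllocationManager", "Backend", "Scheduler")); // SparkDynamicExecutorAllocation
        System.out.println("cycle in current orders: " + hasCycle(buildGraph(current)));
    }
}
```

The Scheduler -> Backend edge from TaskResultGetter and the Backend -> Scheduler edge from the other two threads form the cycle.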
   
   Solution:
   The methods that lead to the deadlocks are all invoked from 
CoarseGrainedSchedulerBackend (CGSB); TSI/YCS below refers to 
TaskSchedulerImpl/YarnClusterScheduler.
   
   1. KillExecutors: The lock on TSI/YCS is needed only to check whether the 
executor is busy. We can move this idle-executor check ahead of the 
synchronization on CGSB, which fixes the lock order for the dynamic allocation 
thread.
   
   2. MakeOffers: This currently acquires the lock on CGSB to ensure executors 
are not killed while a task is being offered on them. It then calls 
`resourceOffer` on the scheduler, which is where it acquires the second lock. I 
agree with @attilapiros's suggestion to fix this second lock-ordering issue 
by synchronizing on the scheduler first and then on the backend.
   
   These two changes should make the lock order consistent across threads and 
seem simple to reason about. I think this should solve the issue, but it would 
be good to have more contributors review this change.
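
The two changes can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual Spark implementation: `OrderedLocksSketch`, `makeOffer`, `killIfIdle`, and the two plain lock objects are hypothetical stand-ins for the scheduler (TSI/YCS) and the backend (CGSB).

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for the scheduler/backend pair; not Spark code.
final class OrderedLocksSketch {
    private final Object scheduler = new Object(); // stands in for TSI/YCS
    private final Object backend = new Object();   // stands in for CGSB
    private final Set<String> busy = new HashSet<>();  // guarded by scheduler
    private final Set<String> alive = new HashSet<>(); // guarded by backend

    OrderedLocksSketch(String... executorIds) {
        for (String id : executorIds) {
            alive.add(id);
        }
    }

    // MakeOffers path: scheduler first, then backend (the order suggested in change 2).
    boolean makeOffer(String executorId) {
        synchronized (scheduler) {
            synchronized (backend) {
                if (!alive.contains(executorId)) {
                    return false; // executor was already killed
                }
                busy.add(executorId);
                return true;
            }
        }
    }

    // KillExecutors path: the idle check runs under the scheduler lock and that lock
    // is released before the backend lock is taken (change 1), so this thread never
    // acquires the scheduler lock while holding the backend lock.
    boolean killIfIdle(String executorId) {
        boolean idle;
        synchronized (scheduler) {
            idle = !busy.contains(executorId);
        }
        if (!idle) {
            return false;
        }
        synchronized (backend) {
            return alive.remove(executorId);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        OrderedLocksSketch sketch = new OrderedLocksSketch("e1", "e2");
        Thread offers = new Thread(() -> {
            for (int i = 0; i < 100_000; i++) {
                sketch.makeOffer("e1");
            }
        });
        Thread kills = new Thread(() -> {
            for (int i = 0; i < 100_000; i++) {
                sketch.killIfIdle("e2");
            }
        });
        offers.start();
        kills.start();
        offers.join();
        kills.join();
        System.out.println("both threads finished");
    }
}
```

In this sketch no thread ever takes the scheduler lock while holding the backend lock, so the cyclic wait cannot form. One caveat of the sketch is that the idle check can go stale between releasing the scheduler lock and taking the backend lock; whether that window matters for the real change is a question for review.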
   
