liupc opened a new pull request #24167: [SPARK-27214] Upgrading locality level when task set is starving
URL: https://github.com/apache/spark/pull/24167
 
 
   ## What changes were proposed in this pull request?
   
   Spark's locality wait mechanism currently works poorly for large jobs. When 
a job has many tasks (e.g. 10000+) and many executors (e.g. 2000), some 
executors may be launched on nodes without the best locality (nodes that do not 
hold the relevant HDFS blocks). In some cases `TaskSetManager.lastLaunchTime` 
keeps being refreshed because tasks finish within `spark.locality.wait`, but 
only at a low rate (e.g. one task finishes every `spark.locality.wait` 
seconds). The locality level is then never upgraded, and many pending tasks 
wait a long time. 
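
The starvation pattern described above can be illustrated with a small Python sketch. It is a simplified model, not Spark's actual `TaskSetManager` code: the function names, the three locality levels, and the rule that every launch resets the timer are assumptions made for illustration.

```python
def allowed_level(last_launch, level, now, locality_wait, max_level):
    """Upgrade one level per full `locality_wait` elapsed since the last launch."""
    while level < max_level and now - last_launch >= locality_wait:
        last_launch += locality_wait
        level += 1
    return last_launch, level


def simulate(launch_times, locality_wait, end_time, max_level=2):
    """Return the locality level index reached at `end_time`.

    Hypothetical model: level 0 is the most local level, and each launch
    refreshes the timer, similar in spirit to `lastLaunchTime`.
    """
    last_launch, level = 0.0, 0
    for t in sorted(launch_times):
        last_launch, level = allowed_level(
            last_launch, level, t, locality_wait, max_level)
        last_launch = t  # a launch refreshes the timer
    _, level = allowed_level(last_launch, level, end_time, locality_wait, max_level)
    return level


# Launches trickle in just inside the wait window, so the level never moves.
slow_trickle = [2.9 * k for k in range(1, 21)]
print(simulate(slow_trickle, locality_wait=3.0, end_time=60.0))  # 0
# With no launches at all, the level climbs to the least local level.
print(simulate([], locality_wait=3.0, end_time=60.0))  # 2
```

In this model a slow but steady trickle of launches is enough to pin the task set at the most local level for the whole minute, while all other pending tasks keep waiting.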
   
   In this case, if `spark.dynamicAllocation.enabled=true`, the driver may 
remove many executors because they become idle, which ends up slowing down the 
job.
   
   We encountered this issue in our production Spark cluster, where it wasted 
a lot of resources and slowed down users' applications.
   
   This PR addresses the problem with the following formula.
   
   Suppose numPendingTasks=10000, localityExecutionGainFactor=0.1, and 
probabilityOfLocalitySchedule=0.5:
   
   ```
   maxStarvingTimeForTasks = numTasksCanRun * medianOfTaskExecutionTime *
       localityExecutionGainFactor * probabilityOfLocalitySchedule
   
   totalStarvingTime = sum(starvingTimeByTasks)
   
   if (totalStarvingTime > maxStarvingTimeForTasks) {
     upgrading locality level...
   }
   ```
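
As a rough illustration, the check above could be sketched in Python as follows. This is not the code in the PR: the function name, the use of seconds as the unit, and the sample values for `num_tasks_can_run` and the task durations are assumptions; only the two factor values come from the description.

```python
from statistics import median


def should_upgrade_locality(starving_times, task_durations, num_tasks_can_run,
                            gain_factor=0.1, locality_probability=0.5):
    """Return True when total starving time exceeds the allowed budget.

    All times are assumed to be in seconds (hypothetical unit choice).
    """
    # Budget: the execution time that waiting for locality is expected to save.
    max_starving_time = (num_tasks_can_run * median(task_durations)
                         * gain_factor * locality_probability)
    total_starving_time = sum(starving_times)
    return total_starving_time > max_starving_time


# Assumed inputs: 2000 runnable slots and a 10 s median task time give a
# budget of 2000 * 10 * 0.1 * 0.5 = 1000 s.
durations = [8.0, 10.0, 12.0]
print(should_upgrade_locality([0.2] * 10000, durations, 2000))   # True
print(should_upgrade_locality([0.05] * 10000, durations, 2000))  # False
```

With 10000 pending tasks, an average wait of 0.2 s per task already exceeds the 1000 s budget in this example, so the locality level would be upgraded; at 0.05 s per task it would not.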
   
   ## How was this patch tested?
   
   Existing unit tests and newly added unit tests.
   
   

