liupc opened a new pull request #24167: [SPARK-27214] Upgrading locality level when task set is starving
URL: https://github.com/apache/spark/pull/24167

## What changes were proposed in this pull request?

Currently, Spark's locality wait mechanism is not friendly to large jobs. When the number of tasks is large (e.g. 10000+) and there is a large number of executors (e.g. 2000), executors may be launched on nodes where locality is not the best (i.e. not the nodes holding the HDFS blocks). There are cases where `TaskSetManager.lastLaunchTime` keeps being refreshed because tasks finish within `spark.locality.wait` but at a low rate (e.g. one task finishing every `spark.locality.wait` seconds), so the locality level is never upgraded and many pending tasks wait for a long time. When `spark.dynamicAllocation.enabled=true`, many executors may then be removed by the driver for being idle, which slows the job down even further. We encountered this issue in our production Spark cluster; it wasted a lot of resources and slowed down users' applications.

This PR optimizes this with the following formula. Suppose numPendingTasks=10000, localityExecutionGainFactor=0.1, probabilityOfLocalitySchedule=0.5:

```
maxStarvingTimeForTasks = numTasksCanRun * medianOfTaskExecutionTime
                          * localityExecutionGainFactor * probabilityOfLocalitySchedule
totalStarvingTime = sum(starvingTimeByTasks)

if (totalStarvingTime > maxStarvingTimeForTasks) {
  upgrading locality level...
}
```

## How was this patch tested?

Existing UTs & added UT

Please review http://spark.apache.org/contributing.html before opening a pull request.
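Below is a minimal, self-contained Scala sketch of the starvation heuristic described by the formula above. It is not the actual patch: the object name `StarvationTracker`, the method `shouldUpgradeLocality`, and the way the inputs are gathered are all hypothetical, and the integration with `TaskSetManager` is omitted.

```scala
// Sketch only: illustrates the starving-time check from the PR description.
// All names here are hypothetical; the real change lives inside TaskSetManager.
object StarvationTracker {

  // Assumed tuning knobs, taken from the example values in the description.
  val localityExecutionGainFactor: Double = 0.1
  val probabilityOfLocalitySchedule: Double = 0.5

  // Median of observed task execution times (ms); returns 0.0 for an empty input.
  private def median(xs: Seq[Long]): Double = {
    val sorted = xs.sorted
    val n = sorted.length
    if (n == 0) 0.0
    else if (n % 2 == 1) sorted(n / 2).toDouble
    else (sorted(n / 2 - 1) + sorted(n / 2)) / 2.0
  }

  /**
   * Decide whether to upgrade the locality level: compare the total time
   * pending tasks have been starving against the maximum time we are
   * willing to "pay" for better locality.
   */
  def shouldUpgradeLocality(
      numTasksCanRun: Int,
      taskExecutionTimesMs: Seq[Long],
      starvingTimeByTasksMs: Seq[Long]): Boolean = {
    val maxStarvingTimeForTasks =
      numTasksCanRun * median(taskExecutionTimesMs) *
        localityExecutionGainFactor * probabilityOfLocalitySchedule
    val totalStarvingTime = starvingTimeByTasksMs.sum.toDouble
    totalStarvingTime > maxStarvingTimeForTasks
  }
}

// Example usage (illustrative numbers): 2000 schedulable slots, tasks typically
// run ~60s, and 10000 pending tasks have each been starving for ~200s.
// StarvationTracker.shouldUpgradeLocality(
//   numTasksCanRun = 2000,
//   taskExecutionTimesMs = Seq.fill(100)(60000L),
//   starvingTimeByTasksMs = Seq.fill(10000)(200000L))
```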
