Ajith S created SPARK-23626:
-------------------------------

             Summary: Spark DAGScheduler scheduling performance hindered on JobSubmitted Event
                 Key: SPARK-23626
                 URL: https://issues.apache.org/jira/browse/SPARK-23626
             Project: Spark
          Issue Type: Bug
          Components: Scheduler
    Affects Versions: 2.2.1
            Reporter: Ajith S

DAGScheduler becomes a bottleneck in the cluster when multiple JobSubmitted events have to be processed, because DAGSchedulerEventProcessLoop is single-threaded and a slow event blocks the other events in the queue, such as TaskCompletion. Handling a JobSubmitted event can be time consuming depending on the nature of the job (for example, calculating parent stage dependencies, shuffle dependencies, and partitions), and while it runs, all other events have to wait.

I see multiple JIRAs referring to this behavior:
https://issues.apache.org/jira/browse/SPARK-2647
https://issues.apache.org/jira/browse/SPARK-4961

Similarly, in my cluster the partition calculation for some jobs is time consuming (similar to the stack at SPARK-2647). This slows down the Spark DAGSchedulerEventProcessLoop, which in turn slows down user jobs even if their tasks finish within seconds, because TaskCompletion events are processed at a slower rate due to the blockage.
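
To illustrate the blocking effect described above, here is a minimal sketch of a single-threaded FIFO event loop on a simulated clock. It is not Spark code: the `Event` class, the `run_single_threaded_loop` function, and all timings (in milliseconds) are hypothetical and chosen only to show how one expensive JobSubmitted handler delays cheap TaskCompletion events queued behind it.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Event:
    name: str        # e.g. "JobSubmitted" or "TaskCompletion" (labels only)
    arrival_ms: int  # simulated time at which the event is posted
    cost_ms: int     # simulated time its handler needs (assumed values)


def run_single_threaded_loop(events):
    """Handle events one at a time in arrival order.

    Returns a list of (event name, milliseconds spent waiting in the queue).
    """
    queue = deque(sorted(events, key=lambda e: e.arrival_ms))
    clock_ms = 0
    waits = []
    while queue:
        ev = queue.popleft()
        # The handler cannot start before the event arrives or before
        # the previous handler has finished.
        start_ms = max(clock_ms, ev.arrival_ms)
        waits.append((ev.name, start_ms - ev.arrival_ms))
        clock_ms = start_ms + ev.cost_ms
    return waits


events = [
    Event("JobSubmitted", arrival_ms=0, cost_ms=5000),    # slow partition calculation
    Event("TaskCompletion", arrival_ms=1000, cost_ms=10),
    Event("TaskCompletion", arrival_ms=2000, cost_ms=10),
]
waits = run_single_threaded_loop(events)
for name, waited in waits:
    print(f"{name}: waited {waited} ms")
```

In this toy model the two TaskCompletion events need only 10 ms each to handle, yet they wait 4000 ms and 3010 ms because the JobSubmitted handler occupies the only thread.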