Ajith S created SPARK-23626:
-------------------------------

             Summary: Spark DAGScheduler scheduling performance hindered on JobSubmitted Event
                 Key: SPARK-23626
                 URL: https://issues.apache.org/jira/browse/SPARK-23626
             Project: Spark
          Issue Type: Bug
          Components: Scheduler
    Affects Versions: 2.2.1
            Reporter: Ajith S


DAGScheduler becomes a bottleneck in the cluster when multiple JobSubmitted 
events have to be processed, because DAGSchedulerEventProcessLoop is 
single-threaded and blocks other events in the queue, such as TaskCompletion.

Handling a JobSubmitted event can be time consuming depending on the nature of 
the job (for example, calculating parent stage dependencies, shuffle 
dependencies, and partitions), so it blocks all other events from being processed.
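
The blocking effect described above can be illustrated with a toy model. This is 
only a sketch, not Spark code: the function simulate_event_loop, the arrival 
times, and the handling costs are all made-up assumptions, and time is simulated 
rather than measured. It shows how one slow event at the head of a single 
consumer's FIFO queue delays every cheap event queued behind it.

```python
from collections import deque


def simulate_event_loop(events):
    """Simulate a single-threaded FIFO event loop on a virtual clock.

    events: list of (name, arrival_time, handling_cost) tuples, in seconds.
    Returns a list of (name, queueing_delay), where queueing_delay is how long
    the event waited between arriving and starting to be handled.
    All numbers are hypothetical; this is not how Spark measures anything.
    """
    pending = deque(sorted(events, key=lambda e: e[1]))  # FIFO by arrival time
    clock = 0.0
    delays = []
    while pending:
        name, arrival, cost = pending.popleft()
        start = max(clock, arrival)      # cannot start before the loop is free
        delays.append((name, start - arrival))
        clock = start + cost             # the single thread is busy until then
    return delays


# One expensive JobSubmitted followed by two cheap TaskCompletion events.
events = [
    ("JobSubmitted", 0.0, 5.0),      # assumed: 5 s of dependency/partition work
    ("TaskCompletion", 0.1, 0.01),
    ("TaskCompletion", 0.2, 0.01),
]
delays = simulate_event_loop(events)
for name, delay in delays:
    print(f"{name}: waited {delay:.2f}s before being handled")
```

In this model the TaskCompletion events each need only 10 ms of handling, yet 
they wait for nearly the full duration of the JobSubmitted handler, which is the 
same pattern reported here.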

 

I see multiple JIRA issues referring to this behavior:

https://issues.apache.org/jira/browse/SPARK-2647

https://issues.apache.org/jira/browse/SPARK-4961

 

Similarly, in my cluster the partition calculation for some jobs is time 
consuming (similar to the stack trace in SPARK-2647), so it slows down the Spark 
DAGSchedulerEventProcessLoop. As a result, user jobs slow down even if their 
tasks finish within seconds, because TaskCompletion events are processed at a 
slower rate due to the blockage.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

