Parth Gandhi created SPARK-26755:
------------------------------------

             Summary: Optimize Spark Scheduler to dequeue speculative tasks more efficiently
                 Key: SPARK-26755
                 URL: https://issues.apache.org/jira/browse/SPARK-26755
             Project: Spark
          Issue Type: Improvement
          Components: Scheduler
    Affects Versions: 3.0.0
            Reporter: Parth Gandhi
         Attachments: Screen Shot 2019-01-28 at 11.21.05 AM.png, Screen Shot 2019-01-28 at 11.21.25 AM.png

Currently, the Spark scheduler takes a long time to dequeue speculative tasks 
for large task sets within a stage (100000 tasks or more) when speculation is 
turned on. On further analysis, we found that the "task-result-getter" 
threads remain blocked by one of the dispatcher-event-loop threads, which holds 
the lock on the TaskSchedulerImpl object in
{code:java}
def resourceOffers(offers: IndexedSeq[WorkerOffer]): Seq[Seq[TaskDescription]] = synchronized {
{code}
which spends a long time executing the "dequeueSpeculativeTask" method in 
TaskSetManager.scala, slowing down the overall running time of the Spark 
job. We monitored how long that lock was held over the whole duration of the 
job, and it was close to 50%, i.e. the code within the synchronized block ran 
for almost half the duration of the entire Spark job. Screenshots of the 
thread dump are attached for reference.
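
For illustration only, here is a minimal sketch of how the share of time spent 
holding a single monitor could be measured. It is not Spark code and not the 
instrumentation we used: InstrumentedScheduler, timedSynchronized, 
handleTaskResult and lockHeldFraction are hypothetical names, and the class 
only mimics the pattern of several entry points sharing one lock.

```scala
import java.util.concurrent.atomic.AtomicLong

// Hypothetical stand-in for an object whose entry points all share one monitor,
// as resourceOffers does on TaskSchedulerImpl. Not part of Spark.
class InstrumentedScheduler {
  private val held = new AtomicLong(0L)

  // Runs `body` while holding this object's monitor and records the time held.
  private def timedSynchronized[T](body: => T): T = synchronized {
    val start = System.nanoTime()
    try body finally held.addAndGet(System.nanoTime() - start)
  }

  // Long-running path: stands in for the scheduling work done under the lock.
  def resourceOffers(work: () => Unit): Unit = timedSynchronized { work() }

  // Short path: stands in for a result-handling thread that needs the same lock
  // and must wait while resourceOffers is running.
  def handleTaskResult(): Unit = timedSynchronized { () }

  // Total nanoseconds the monitor has been held so far.
  def heldNanos: Long = held.get()

  // Fraction of a wall-clock interval (in nanoseconds) spent holding the monitor.
  def lockHeldFraction(totalNanos: Long): Double = heldNanos.toDouble / totalNanos
}
```

Calling lockHeldFraction with the elapsed wall-clock time of a run gives a 
number comparable to the roughly 50% figure above; any thread calling 
handleTaskResult during a long resourceOffers call would be blocked for that 
period.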



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)