Parth Gandhi created SPARK-26755:
------------------------------------
Summary: Optimize Spark Scheduler to dequeue speculative tasks
more efficiently
Key: SPARK-26755
URL: https://issues.apache.org/jira/browse/SPARK-26755
Project: Spark
Issue Type: Improvement
Components: Scheduler
Affects Versions: 3.0.0
Reporter: Parth Gandhi
Attachments: Screen Shot 2019-01-28 at 11.21.05 AM.png, Screen Shot
2019-01-28 at 11.21.25 AM.png
Currently, the Spark scheduler takes a long time to dequeue speculative tasks
for large task sets within a stage (for example, 100000 tasks or more) when
speculation is turned on. Further analysis showed that the "task-result-getter"
threads remain blocked on one of the dispatcher-event-loop threads holding the
lock on the TaskSchedulerImpl object in
{code:java}
def resourceOffers(offers: IndexedSeq[WorkerOffer]): Seq[Seq[TaskDescription]] = synchronized {
{code}
while it spends a long time executing the method "dequeueSpeculativeTask" in
TaskSetManager.scala, which slows down the overall running time of the Spark
job. We monitored the time utilization of that lock for the whole duration of
the job and it was close to 50%, i.e., the code within the synchronized block
ran for almost half the duration of the entire Spark job. Screenshots of the
thread dump are attached for reference.
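
The contention pattern described above can be illustrated with a small,
self-contained sketch. This is not Spark code: ToyScheduler, slowOffer,
handleResult and blockedMillis are hypothetical names, and the sleep merely
stands in for a slow dequeue that runs while the shared lock is held. It only
shows why a thread that needs the same lock stalls for as long as the
synchronized method runs.

```scala
import java.util.concurrent.{CountDownLatch, TimeUnit}

// Hypothetical stand-in for a scheduler object whose methods share one monitor.
class ToyScheduler {
  // Plays the role of resourceOffers: holds the lock for holdMillis,
  // simulating a slow dequeue over a very large task set.
  def slowOffer(holdMillis: Long, lockHeld: CountDownLatch): Unit = synchronized {
    lockHeld.countDown()      // signal that the lock is now held
    Thread.sleep(holdMillis)  // simulated work; sleep does not release the lock
  }

  // Plays the role of a result-handling call that needs the same lock.
  def handleResult(): Unit = synchronized { () }
}

object LockContentionSketch {
  // Returns how long (in ms) the calling thread waits to enter handleResult
  // while another thread is inside slowOffer.
  def blockedMillis(holdMillis: Long): Long = {
    val scheduler = new ToyScheduler
    val lockHeld = new CountDownLatch(1)
    val offerThread = new Thread(new Runnable {
      def run(): Unit = scheduler.slowOffer(holdMillis, lockHeld)
    })
    offerThread.start()
    lockHeld.await()                       // wait until the lock is taken
    val start = System.nanoTime()
    scheduler.handleResult()               // blocks until slowOffer returns
    val waited = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start)
    offerThread.join()
    waited
  }

  def main(args: Array[String]): Unit =
    println(s"handleResult was blocked for about ${blockedMillis(200)} ms")
}
```

In this sketch the wait seen by the second thread grows with the time spent
inside the synchronized method, so shortening the work done under the lock is
what shortens the stall.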