[
https://issues.apache.org/jira/browse/SPARK-33418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
dingbei updated SPARK-33418:
----------------------------
Description:
This started with the need to launch a large number of Spark Streaming
receivers. *The launch time becomes very long once there are more than 300
receivers.* I will show the test data I collected and how I improved this.
*Test preparation*
Every executor has two cores (one for the receiver and the other to process
each batch of data). I observed the launch time of all receivers through the
Spark web UI (the duration from when the first receiver started to when the
last one started).
*Tests and data*
At first, we set the number of executors to 200, which means starting 200
receivers, and everything went well. It took about 50s to launch all receivers.
Then we set the number of executors to 500, which means starting 500 receivers.
The launch time became around 5 mins.
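As a rough sanity check on these figures, the short Python sketch below
computes the average launch time per receiver and compares the growth in
receivers with the growth in launch time. It treats the approximate values
reported above ("about 50s", "around 5 mins") as exact, which is an assumption,
and the variable names are purely illustrative.

```python
# Approximate figures quoted in the report: receivers -> launch time (seconds).
# "about 50s" and "around 5 mins" are treated as exact here (an assumption).
runs = {200: 50.0, 500: 5 * 60.0}

# Average launch time per receiver for each run.
per_receiver = {n: t / n for n, t in runs.items()}

# How much the receiver count grew versus how much the launch time grew.
receiver_growth = 500 / 200
time_growth = runs[500] / runs[200]

for n, seconds in per_receiver.items():
    print(f"{n} receivers: {seconds:.2f}s per receiver on average")
print(f"receivers grew {receiver_growth:.1f}x, launch time grew {time_growth:.1f}x")
```

Under that assumption the receiver count grows 2.5x while the launch time grows
6x, so the launch time scales worse than linearly with the number of receivers.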
*Dig into source code*
Then I started looking for the cause in the source code. I used thread dumps to
check which methods take a relatively long time, and then added logging between
these methods. In the end I found that the loop in
TaskSchedulerImpl.resourceOffers executes more than
> TaskSchedulerImpl: Check pending tasks in advance when resource offers
> ----------------------------------------------------------------------
>
> Key: SPARK-33418
> URL: https://issues.apache.org/jira/browse/SPARK-33418
> Project: Spark
> Issue Type: Improvement
> Components: Spark Core
> Affects Versions: 3.0.1
> Reporter: dingbei
> Priority: Major