James,
If you have multithreaded code in your driver, then you should
allocate multiple cores. In cluster mode you share the node with other
jobs. If you allocate fewer cores than your driver actually uses,
that node gets over-allocated and you are stealing other
applications' resources. Be nice: limit the parallelism of your
driver and allocate as many driver cores as it uses (see
|spark.driver.cores| at
https://spark.apache.org/docs/latest/configuration.html#application-properties).
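To illustrate Enrico's point, here is a minimal driver-side sketch (plain Python, no Spark required to run it): cap the driver's own thread pool at the number of cores you actually requested via |spark.driver.cores|, so the driver never oversubscribes its node. The DRIVER_CORES value and the task list are illustrative assumptions, not part of any Spark API.

```python
# Sketch: keep driver-side parallelism within the allocated driver cores.
from concurrent.futures import ThreadPoolExecutor

# Hypothetical value; in a real job this matches what you submit with
# --conf spark.driver.cores=4 (or conf.set("spark.driver.cores", "4")).
DRIVER_CORES = 4

def run_with_bounded_parallelism(tasks):
    """Run driver-side callables with at most DRIVER_CORES threads."""
    with ThreadPoolExecutor(max_workers=DRIVER_CORES) as pool:
        return list(pool.map(lambda task: task(), tasks))

# Eight trivial stand-in tasks; at most DRIVER_CORES run at any moment.
results = run_with_bounded_parallelism(
    [lambda i=i: i * i for i in range(8)]
)
```

With the pool size tied to the configured driver cores, adding more driver work never silently exceeds the allocation.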
Enrico
Am 06.03.20 um 18:36 schrieb James Yu:
Pol, thanks for your reply.
Actually I am running my Spark apps in CLUSTER mode. Is what you said
still applicable in cluster mode? Thanks in advance for your further
clarification.
------------------------------------------------------------------------
*From:* Pol Santamaria <p...@qbeast.io>
*Sent:* Friday, March 6, 2020 12:59 AM
*To:* James Yu <ja...@ispot.tv>
*Cc:* user@spark.apache.org <user@spark.apache.org>
*Subject:* Re: Spark driver thread
Hi James,
You can configure the Spark Driver to use more than a single thread.
It is something that depends on the application, but the Spark driver
can take advantage of multiple threads in many situations. For
instance, when the driver program gathers or sends data to the workers.
So yes, if you do computation or I/O on the driver side, it is worth
exploring multiple threads and more than one vCPU.
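As a rough illustration of why extra driver threads can help (the Spark scheduler is thread-safe, so independent jobs submitted from separate driver threads can run concurrently), here is a self-contained sketch. The simulate_action function is a stand-in for a real blocking action such as a collect or count; the timings are illustrative.

```python
# Sketch: two independent "jobs" submitted from two driver threads
# overlap in time, instead of running back to back as they would in a
# single-threaded driver.
import time
from concurrent.futures import ThreadPoolExecutor

def simulate_action(seconds):
    # Stand-in for blocking on a Spark action (e.g. a collect).
    time.sleep(seconds)
    return seconds

start = time.monotonic()
with ThreadPoolExecutor(max_workers=2) as pool:
    results = list(pool.map(simulate_action, [0.2, 0.2]))
elapsed = time.monotonic() - start
# The two 0.2 s waits overlap, so elapsed stays well under the
# 0.4 s a strictly serial driver would need.
```

The same pattern applies when the driver gathers results from, or pushes data to, several workers at once.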
Bests,
Pol Santamaria
On Fri, Mar 6, 2020 at 1:28 AM James Yu <ja...@ispot.tv
<mailto:ja...@ispot.tv>> wrote:
Hi,
Does a Spark driver always work single-threaded?
If yes, does that mean asking for more than one vCPU for the driver
is wasteful?
Thanks,
James