Hi James,
You can configure the Spark driver to use more than a single thread. How
much it helps depends on the application, but the driver can take
advantage of multiple threads in many situations, for instance when the
driver program gathers data from, or sends data to, the workers.
So yes, if
I totally agree with Russell.
In my opinion, the best way is to experiment and take measurements. Chips
differ, some support multithreading and some don't, and system setups
vary, so I'd recommend playing with the 'spark.driver.cores' option.
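As a sketch, setting it at submit time might look like this (the main class and JAR names here are placeholders, not from this thread):

```shell
# Hypothetical submit command: com.example.MyApp and my-app.jar are
# placeholders. spark.driver.cores asks the cluster manager for 4 driver
# cores; the right number is workload-dependent, so measure first.
spark-submit \
  --deploy-mode cluster \
  --conf spark.driver.cores=4 \
  --class com.example.MyApp \
  my-app.jar
```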
Best,
Pol Santamaria
On Fri,
So one thing to know here is that all Java applications are going to use
many threads, and just because your particular main method doesn't spawn
additional threads doesn't mean the libraries you access or use won't spawn
additional threads. The other important note is that Spark doesn't actually
equate
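That first point is easy to verify with a few lines of plain Java (no Spark needed): even a trivial main method that spawns nothing runs alongside JVM housekeeping threads.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

public class ThreadCensus {
    public static void main(String[] args) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        // Counts live threads: main plus JVM internals (GC workers, JIT
        // compiler, signal dispatcher, ...), even though we spawned none.
        int live = threads.getThreadCount();
        System.out.println("live threads: " + live);
        for (long id : threads.getAllThreadIds()) {
            System.out.println("  " + threads.getThreadInfo(id).getThreadName());
        }
    }
}
```

The exact count varies by JVM version and flags, but it is always more than one.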
Pol, thanks for your reply.
Actually, I am running Spark apps in CLUSTER mode. Is what you said still
applicable in cluster mode? Thanks in advance for your further clarification.
From: Pol Santamaria
Sent: Friday, March 6, 2020 12:59 AM
To: James Yu
Cc:
Srsly?
On Sat, 7 Mar 2020 at 03:28, Koert Kuipers wrote:
> i just ran:
> mvn test -fae > log.txt
>
> at the end of log.txt i find it says there are failures:
> [INFO] Spark Project SQL .. FAILURE [47:55 min]
>
> that is not very helpful. what tests failed?
>
>
James,
If you have multithreaded code in your driver, then you should
allocate multiple cores. In cluster mode you share the node with other
jobs. If you allocate fewer cores than your driver actually uses,
that node gets over-allocated and you are stealing other
applications'
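The kind of driver-side multithreading being discussed, launching independent jobs from a fixed pool sized to match the allocated cores, can be sketched in plain Java; the work items below are stand-ins for Spark actions, not real Spark calls.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelDriver {
    public static void main(String[] args) throws Exception {
        // Size the pool to match spark.driver.cores: a pool larger than
        // the cores you requested is exactly the over-allocation above.
        int driverCores = 2;
        ExecutorService pool = Executors.newFixedThreadPool(driverCores);
        List<Future<Integer>> results = new ArrayList<>();
        for (int i = 0; i < 4; i++) {
            final int jobId = i;
            // In a real driver each task would trigger a Spark action
            // (collect, save, ...); here it just squares its id.
            results.add(pool.submit(() -> jobId * jobId));
        }
        int total = 0;
        for (Future<Integer> f : results) {
            total += f.get();
        }
        pool.shutdown();
        System.out.println("total: " + total); // 0 + 1 + 4 + 9 = 14
    }
}
```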
I am trying to write an integration test using Embedded Kafka, but I keep
getting a NullPointerException. My test case is very simple. It has the
following steps:
1. Read a JSON file & write messages to an inputTopic.
2. Perform a 'readStream' operation.
3. Do a 'select' on the Stream. This
i just ran:
mvn test -fae > log.txt
at the end of log.txt i find it says there are failures:
[INFO] Spark Project SQL .. FAILURE [47:55 min]
that is not very helpful. what tests failed?
i could go scroll up but the file has 21,517 lines. ok let's skip that.
so i