Does spark.streaming.concurrentJobs still exist?
spark.streaming.concurrentJobs (default: 1) is the number of concurrent
jobs, i.e. the number of threads in the streaming-job-executor thread pool
<https://github.com/jaceklaskowski/spark-streaming-notebook/blob/master/spark-streaming-jobscheduler.adoc>.
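For reference, a minimal sketch of setting the property (the app name,
master, and batch interval below are made up; the property itself is
internal and undocumented, so treat this as illustrative only):

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    // Set like any other Spark configuration key.
    val conf = new SparkConf()
      .setAppName("concurrent-jobs-demo")           // hypothetical app name
      .setMaster("local[4]")
      .set("spark.streaming.concurrentJobs", "2")   // allow 2 jobs per batch

    val ssc = new StreamingContext(conf, Seconds(10)) // 10s batch interval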
thomas lavocat <thomas.lavo...@univ-grenoble-alpes.fr>
By "dependent" I mean that this batch's job relies on the previous batch's
result, so this batch should wait for the finish of the previous batch. If
you set "spark.streaming.concurrentJobs" larger than 1, then the current
batch could start without waiting for the previous batch (if it is
delayed), which will lead to unexpected results.
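To make the "dependent batches" case concrete, here is a minimal sketch
(hypothetical socket source and checkpoint path) of a stateful pipeline
where each batch folds into state built by earlier batches, so correctness
relies on batches running in order; with concurrentJobs > 1 a delayed
batch could be overtaken by the next one:

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val conf = new SparkConf().setAppName("stateful-demo").setMaster("local[2]")
    val ssc  = new StreamingContext(conf, Seconds(5))
    ssc.checkpoint("/tmp/checkpoint") // updateStateByKey needs a checkpoint dir

    val lines = ssc.socketTextStream("localhost", 9999) // hypothetical source

    // Running word counts: each batch adds its counts to state produced by
    // all previous batches.
    val counts = lines
      .flatMap(_.split(" "))
      .map(word => (word, 1))
      .updateStateByKey[Int]((added: Seq[Int], state: Option[Int]) =>
        Some(added.sum + state.getOrElse(0)))

    counts.print()
    ssc.start()
    ssc.awaitTermination()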
On 05/06/2018 11:24, Saisai Shao wrote:
spark.streaming.concurrentJobs is a driver side internal configuration: it
controls how many streaming jobs can be submitted concurrently in one
batch. Usually this should not be configured by the user, unless you're
familiar with Spark Streaming internals and know the implication of this
configuration.
Hello,
Thanks for your answer.

On 05/06/2018 11:24, Saisai Shao wrote:
> spark.streaming.concurrentJobs is a driver side internal configuration:
> it controls how many streaming jobs can be submitted concurrently in one
> batch. [...]
Hi everyone,
I'm wondering whether the property spark.streaming.concurrentJobs should
reflect the total number of possible concurrent tasks on the whole
cluster, or a local number of concurrent tasks on one compute node.
Thanks for your help.
Thomas
A streaming job is generated for each DStream action, such as count,
saveXXX, or take.
For example, if "spark.streaming.concurrentJobs" is 1 and you call
DStream.count() twice, there will be two "count" Spark jobs and they will
run one by one. But if you set "spark.streaming.concurrentJobs" to 2,
these two jobs can run in parallel.
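A minimal sketch of that example (hypothetical socket source; the second
action stands in for the second count): each action below produces its own
Spark job per batch, and with "spark.streaming.concurrentJobs" set to 2
the two jobs may overlap:

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val conf = new SparkConf()
      .setAppName("two-outputs-demo")               // hypothetical app name
      .setMaster("local[4]")
      .set("spark.streaming.concurrentJobs", "2")   // let the two jobs overlap

    val ssc   = new StreamingContext(conf, Seconds(10))
    val lines = ssc.socketTextStream("localhost", 9999) // hypothetical source

    // Each action below generates its own Spark job every batch interval.
    lines.count().print()                // job 1: record count per batch
    lines.map(_.length).count().print()  // job 2: an independent second job

    ssc.start()
    ssc.awaitTermination()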
Hi Folks,
I am trying to speed up my Spark Streaming job. I found a presentation by
Tathagata Das that mentions increasing the value of
"spark.streaming.concurrentJobs" if I have more than one output.
In my Spark Streaming job I am reading from Kafka using the receiver-based
approach.
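For readers of this thread, a minimal sketch of that setup, assuming the
old spark-streaming-kafka-0-8 receiver API (the ZooKeeper quorum, consumer
group, topic name, and output path are all hypothetical):

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.kafka.KafkaUtils // spark-streaming-kafka-0-8

    val conf = new SparkConf()
      .setAppName("kafka-receiver-demo")
      .set("spark.streaming.concurrentJobs", "2") // two outputs below may overlap

    val ssc = new StreamingContext(conf, Seconds(10))

    // Receiver-based Kafka stream (hypothetical ZK quorum, group, and topic).
    val stream = KafkaUtils.createStream(
      ssc, "zk-host:2181", "consumer-group", Map("events" -> 1))

    // Two outputs -> two Spark jobs per batch.
    stream.map(_._2).count().print()
    stream.map(_._2).saveAsTextFiles("/tmp/events") // hypothetical output path

    ssc.start()
    ssc.awaitTermination()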