Does spark.streaming.concurrentJobs still exist?

2018-10-09 Thread kant kodali
Does spark.streaming.concurrentJobs still exist? spark.streaming.concurrentJobs (default: 1) is the number of concurrent jobs, i.e. the number of threads in the streaming-job-executor thread pool <https://github.com/jaceklaskowski/spark-streaming-notebook/blob/master/spark-streaming-jobscheduler.adoc#streaming-
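For reference, a minimal sketch of setting the property on the driver before the StreamingContext is created (the app name and the 5-second batch interval are illustrative, not taken from the thread):

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    // spark.streaming.concurrentJobs: how many streaming jobs the driver's
    // job scheduler may run at the same time (default: 1).
    val conf = new SparkConf()
      .setAppName("concurrent-jobs-demo")
      .set("spark.streaming.concurrentJobs", "2")

    val ssc = new StreamingContext(conf, Seconds(5))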

Re: [Spark Streaming] is spark.streaming.concurrentJobs a per node or a cluster global value ?

2018-06-11 Thread thomas lavocat
So this batch should wait for the finish of the previous batch. If you set "spark.streaming.concurrentJobs" larger than 1, then the current batch could start without waiting for the previous batch (if it is delayed), which will lead to unexpected results.

Re: [Spark Streaming] is spark.streaming.concurrentJobs a per node or a cluster global value ?

2018-06-05 Thread Saisai Shao
By "dependent" I mean this batch's job relies on the previous batch's result. So this batch should wait for the finish of the previous batch; if you set "spark.streaming.concurrentJobs" larger than 1, then the current batch could start without waiting for the previous batch (if it is delayed), which will lead to unexpected results.

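A hedged illustration of the kind of dependency being described (the running word count and its update function are assumptions, not something from the thread): each batch's state builds on the state produced by the previous batch, so letting batches overlap by raising spark.streaming.concurrentJobs above 1 could apply the updates out of order.

    import org.apache.spark.streaming.dstream.DStream

    // Stateful running count per key: batch N's state depends on the state
    // written by batch N-1, so batches need to complete in order.
    // (updateStateByKey also requires ssc.checkpoint(...) to be set.)
    def runningCount(words: DStream[String]): DStream[(String, Long)] =
      words
        .map(word => (word, 1L))
        .updateStateByKey[Long] { (newCounts: Seq[Long], state: Option[Long]) =>
          Some(state.getOrElse(0L) + newCounts.sum)
        }
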
Re: [Spark Streaming] is spark.streaming.concurrentJobs a per node or a cluster global value ?

2018-06-05 Thread thomas lavocat
On 05/06/2018 11:24, Saisai Shao wrote: spark.streaming.concurrentJobs is a driver-side internal configuration; it controls how many streaming jobs can be submitted concurrently in one batch. Usually this should not be configured by the user, unless you're familiar with Spark Streaming internals.

Re: [Spark Streaming] is spark.streaming.concurrentJobs a per node or a cluster global value ?

2018-06-05 Thread Saisai Shao
On 05/06/2018 11:24, Saisai Shao wrote: spark.streaming.concurrentJobs is a driver-side internal configuration; it controls how many streaming jobs can be submitted concurrently in one batch. Usually this should not be configured by the user, unless you're familiar with Spark Streaming internals.

Re: [Spark Streaming] is spark.streaming.concurrentJobs a per node or a cluster global value ?

2018-06-05 Thread thomas lavocat
Hello, thanks for your answer. On 05/06/2018 11:24, Saisai Shao wrote: spark.streaming.concurrentJobs is a driver-side internal configuration; it controls how many streaming jobs can be submitted concurrently in one batch. Usually this should not be configured by the user, unless you're familiar with Spark Streaming internals.

Re: [Spark Streaming] is spark.streaming.concurrentJobs a per node or a cluster global value ?

2018-06-05 Thread Saisai Shao
spark.streaming.concurrentJobs is a driver-side internal configuration; it controls how many streaming jobs can be submitted concurrently in one batch. Usually this should not be configured by the user, unless you're familiar with Spark Streaming internals and know the implications.

[Spark Streaming] is spark.streaming.concurrentJobs a per node or a cluster global value ?

2018-06-05 Thread thomas lavocat
Hi everyone, I'm wondering whether the property spark.streaming.concurrentJobs should reflect the total number of possible concurrent tasks on the cluster, or a local number of concurrent tasks on one compute node. Thanks for your help. Thomas

Re: spark.streaming.concurrentJobs

2015-09-28 Thread Shixiong Zhu
A Spark job is created for each DStream action, such as count, saveXXX, or take. For example, if "spark.streaming.concurrentJobs" is 1 and you call DStream.count() twice, there will be two "count" Spark jobs and they will run one by one. But if you set "spark.streaming.concurrentJobs" to 2, these two jobs can run at the same time.
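A sketch of that scenario (only the two output operations matter here; how the DStream is built is elided): each print below is a separate output operation and therefore a separate Spark job per batch, so with the default of 1 they run back to back, and with 2 they may overlap.

    import org.apache.spark.streaming.dstream.DStream

    // Two output operations on the same DStream => two Spark jobs per batch.
    // With spark.streaming.concurrentJobs = 1 they run one after the other;
    // with 2 they are allowed to run at the same time.
    def twoOutputs(lines: DStream[String]): Unit = {
      lines.count().print()                      // first job per batch
      lines.filter(_.nonEmpty).count().print()   // second job per batch
    }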

Re: spark.streaming.concurrentJobs

2015-09-25 Thread Atul Kulkarni
I am trying to speed up my Spark Streaming job and found a presentation by Tathagata Das that mentions increasing the value of "spark.streaming.concurrentJobs" if I have more than one output. In my Spark Streaming job I am reading from Kafka using the Receiver Based approach

spark.streaming.concurrentJobs

2015-09-24 Thread Atul Kulkarni
Hi Folks, I am trying to speed up my Spark Streaming job and found a presentation by Tathagata Das that mentions increasing the value of "spark.streaming.concurrentJobs" if I have more than one output. In my Spark Streaming job I am reading from Kafka using the Receiver Based approach
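A hedged sketch of the setup described (the ZooKeeper quorum, group id, and topic map are placeholders, and the receiver-based API shown is KafkaUtils.createStream from the spark-streaming-kafka package of that era; none of these details are confirmed by the thread):

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.kafka.KafkaUtils

    val conf = new SparkConf()
      .setAppName("kafka-receiver-demo")
      .set("spark.streaming.concurrentJobs", "2")   // one job per output below

    val ssc = new StreamingContext(conf, Seconds(10))

    // Receiver-based Kafka stream: topic name -> number of receiver threads.
    val stream = KafkaUtils.createStream(
      ssc, "zk-host:2181", "demo-group", Map("events" -> 1))

    // Two independent output operations => two jobs per batch.
    stream.map(_._2).count().print()
    stream.map(_._2).filter(_.contains("error")).print()

    ssc.start()
    ssc.awaitTermination()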