Does spark.streaming.concurrentJobs still exist?

2018-10-09 Thread kant kodali
Does spark.streaming.concurrentJobs still exist? spark.streaming.concurrentJobs (default: 1) is the number of concurrent jobs, i.e. threads in the streaming-job-executor thread pool <https://github.com/jaceklaskowski/spark-streaming-notebook/blob/master/spark-streaming-jobscheduler.adoc#streaming-
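A minimal sketch of how this property is typically set when creating the StreamingContext (the app name and batch interval below are arbitrary examples; the property itself is an undocumented internal setting):

// Minimal sketch: spark.streaming.concurrentJobs sizes the streaming-job-executor
// thread pool on the driver. App name and batch interval are arbitrary examples.
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val conf = new SparkConf()
  .setAppName("concurrent-jobs-demo")
  .set("spark.streaming.concurrentJobs", "2")   // default: 1
val ssc = new StreamingContext(conf, Seconds(10))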

Re: [Spark Streaming] is spark.streaming.concurrentJobs a per node or a cluster global value ?

2018-06-11 Thread thomas lavocat
…previous batch, if you set "spark.streaming.concurrentJobs" larger than 1, then the current batch could start without waiting for the previous batch (if it is delayed), which will lead to unexpected results. thomas lavocat <mailto:thomas.lavo...@univ-grenoble-alpes.fr> wrote on 2018-06-05

Re: [Spark Streaming] is spark.streaming.concurrentJobs a per node or a cluster global value ?

2018-06-05 Thread Saisai Shao
"dependent" I mean this batch's job relies on the previous batch's result. So this batch should wait for the finish of previous batch, if you set " spark.streaming.concurrentJobs" larger than 1, then the current batch could start without waiting for the previous batch (if i

Re: [Spark Streaming] is spark.streaming.concurrentJobs a per node or a cluster global value ?

2018-06-05 Thread thomas lavocat
Saisai Shao wrote: spark.streaming.concurrentJobs is a driver-side internal configuration; it controls how many streaming jobs can be submitted concurrently in one batch. Usually this should not be configured by the user, unless you're familiar with Spark Streaming internals,

Re: [Spark Streaming] is spark.streaming.concurrentJobs a per node or a cluster global value ?

2018-06-05 Thread Saisai Shao
On 05/06/2018 11:24, Saisai Shao wrote: spark.streaming.concurrentJobs is a driver-side internal configuration; it controls how many streaming jobs can be submitted concurrently in one batch. Usually this should not be configured by the user, unless you're familiar with Spark Streaming internals and know the implications.

Re: [Spark Streaming] is spark.streaming.concurrentJobs a per node or a cluster global value ?

2018-06-05 Thread thomas lavocat
Hello, Thanks for your answer. On 05/06/2018 11:24, Saisai Shao wrote: spark.streaming.concurrentJobs is a driver-side internal configuration; it controls how many streaming jobs can be submitted concurrently in one batch. Usually this should not be configured by the user, unless you're familiar with Spark Streaming internals and know the implications.

Re: [Spark Streaming] is spark.streaming.concurrentJobs a per node or a cluster global value ?

2018-06-05 Thread Saisai Shao
spark.streaming.concurrentJobs is a driver-side internal configuration; it controls how many streaming jobs can be submitted concurrently in one batch. Usually this should not be configured by the user, unless you're familiar with Spark Streaming internals and know the implications.

[Spark Streaming] is spark.streaming.concurrentJobs a per node or a cluster global value ?

2018-06-05 Thread thomas lavocat
Hi everyone, I'm wondering if the property spark.streaming.concurrentJobs should reflect the total number of possible concurrent tasks on the cluster, or a local number of concurrent tasks on one compute node. Thanks for your help. Thomas

Re: spark.streaming.concurrentJobs

2015-09-28 Thread Shixiong Zhu
…the DStream action, such as count, saveXXX, take. For example, if "spark.streaming.concurrentJobs" is 1 and you call DStream.count() twice, there will be two "count" Spark jobs and they will run one by one. But if you set "spark.streaming.concurrentJobs" to 2, these two jobs can run in parallel.
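A small sketch of the case described here, assuming a socket source on localhost:9999 (placeholder): two output operations per batch, which run one after another with the default setting and may overlap when concurrentJobs is set to 2.

// Sketch of two output operations per batch (placeholder source host and port).
// With spark.streaming.concurrentJobs=1 the two jobs run one by one inside a
// batch; with 2 they may be scheduled concurrently.
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val conf = new SparkConf().setAppName("two-output-jobs")
  .set("spark.streaming.concurrentJobs", "2")
val ssc = new StreamingContext(conf, Seconds(5))
val lines = ssc.socketTextStream("localhost", 9999)

lines.foreachRDD(rdd => println(s"count A: ${rdd.count()}"))  // first job per batch
lines.foreachRDD(rdd => println(s"count B: ${rdd.count()}"))  // second job per batch

ssc.start()
ssc.awaitTermination()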

Re: spark.streaming.concurrentJobs

2015-09-25 Thread Atul Kulkarni
Hi Folks, I am trying to speed up my Spark Streaming job. I found a presentation by Tathagata Das that mentions increasing the value of "spark.streaming.concurrentJobs" if I have more than one output. In my Spark Streaming job I am reading from …

spark.streaming.concurrentJobs

2015-09-25 Thread Atul Kulkarni
Hi Folks, I am trying to speed up my Spark Streaming job. I found a presentation by Tathagata Das that mentions increasing the value of "spark.streaming.concurrentJobs" if I have more than one output. In my Spark Streaming job I am reading from Kafka using the receiver-based approach and …
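For reference, a sketch of the setup being described: a receiver-based Kafka stream (spark-streaming-kafka 0.8 API; the ZooKeeper quorum, group id, topic, and output path are placeholders) with more than one output, which is the situation where raising spark.streaming.concurrentJobs is sometimes suggested.

// Sketch of a receiver-based Kafka stream with two outputs. ZooKeeper quorum,
// group id, topic, and output path are placeholders; requires the
// spark-streaming-kafka (0.8) dependency that provides the receiver-based API.
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

val conf = new SparkConf().setAppName("kafka-receiver-demo")
  .set("spark.streaming.concurrentJobs", "2")
val ssc = new StreamingContext(conf, Seconds(10))

val messages = KafkaUtils.createStream(ssc, "zk-host:2181", "demo-group", Map("demo-topic" -> 1))
val values = messages.map(_._2)                                 // drop the Kafka key

values.saveAsTextFiles("hdfs:///tmp/demo/output")               // output 1
values.foreachRDD(rdd => println(s"records: ${rdd.count()}"))   // output 2

ssc.start()
ssc.awaitTermination()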