You can't change the batch time, but you can limit the number of items in the batch
http://spark.apache.org/docs/latest/configuration.html

spark.streaming.backpressure.enabled
spark.streaming.kafka.maxRatePerPartition

(A short sketch of wiring these two settings up is at the end of this message.)

On Tue, Jan 3, 2017 at 4:00 AM, 周家帅 <zji...@gmail.com> wrote:
> Hi,
>
> I am an intermediate Spark user and have some experience in large-scale
> data processing. I posted this question on StackOverflow but received no
> response. My problem is as follows:
>
> I use createDirectStream in my Spark Streaming application. I set the
> batch interval to 7 seconds, and most of the time the batch job finishes
> within about 5 seconds. In rare cases, however, a batch job takes 60
> seconds, and this delays the following batches. To cut down the total
> delay for these batches, I would like to process more of the streaming
> data that accumulates across the delayed jobs in a single batch, so that
> the stream returns to normal as soon as possible.
>
> So, I want to know whether there is some method to dynamically
> update/merge the input batch size for Spark and Kafka when a delay
> appears.
>
> Many thanks for your help.
>
>
> --
> Jiashuai Zhou
>
> School of Electronics Engineering and Computer Science,
> Peking University
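For reference, here is a minimal Scala sketch of setting those two
properties in a streaming app. The app name, the 7-second batch interval
(taken from your setup), and the cap of 1000 records/sec are placeholders;
tune the cap to your partition count and batch interval:

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val conf = new SparkConf()
      .setAppName("RateLimitedStream") // placeholder name
      // Let Spark adapt the ingestion rate to the scheduling delay and
      // processing time observed on recent batches.
      .set("spark.streaming.backpressure.enabled", "true")
      // Hard cap on records read per Kafka partition per second; with a
      // 7s batch this bounds each partition at 7 * 1000 records per batch.
      .set("spark.streaming.kafka.maxRatePerPartition", "1000")

    val ssc = new StreamingContext(conf, Seconds(7))
    // ... create the direct Kafka stream from ssc as you do today ...

The cap applies even when batches are running late, so a slow batch can't
snowball into an ever-larger one, and backpressure should ramp the rate
back up once the scheduling delay clears.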