You want spark.streaming.kafka.maxRatePerPartition for the direct stream.

On Sat, Mar 18, 2017 at 3:37 PM, Mal Edwin <mal.ed...@vinadionline.com> wrote:
>
> Hi,
> You can enable backpressure to handle this.
>
> spark.streaming.backpressure.enabled
> spark.streaming.receiver.maxRate
>
> Thanks,
> Edwin
>
> On Mar 18, 2017, 12:53 AM -0400, sagarcasual . <sagarcas...@gmail.com>, wrote:
>
> Hi, we have Spark 1.6.1 streaming from a Kafka (0.10.1) topic using the direct
> approach. The streaming part works fine, but when we first start the job we
> have to deal with a really huge Kafka backlog, millions of messages. That
> first batch runs for over 40 hours, and after 12 hours or so it becomes very,
> very slow: it keeps crunching messages, but at a very low rate. Once the job
> is caught up, subsequent batches are quick because the load is tiny.
> Any idea how to avoid this problem?
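
For reference, a minimal sketch of where those settings go. This assumes Spark 1.6.x with the spark-streaming-kafka artifact on the classpath; the broker list, topic name ("events"), batch interval, and the rate value itself are illustrative and need tuning to your partition count and batch duration. Note that spark.streaming.receiver.maxRate only applies to receiver-based streams, which is why the direct stream needs spark.streaming.kafka.maxRatePerPartition instead.

    import kafka.serializer.StringDecoder
    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.kafka.KafkaUtils

    object BackloggedStream {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf()
          .setAppName("kafka-direct-backlog")
          // Cap records pulled per Kafka partition per second for the direct stream,
          // so the first batches after a restart stay bounded instead of trying to
          // swallow the whole backlog in one batch.
          .set("spark.streaming.kafka.maxRatePerPartition", "10000")
          // After startup, let Spark adapt the ingestion rate to observed batch times.
          .set("spark.streaming.backpressure.enabled", "true")

        val ssc = new StreamingContext(conf, Seconds(10))

        // Illustrative broker list and topic name.
        val kafkaParams = Map("metadata.broker.list" -> "broker1:9092,broker2:9092")
        val topics = Set("events")

        val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
          ssc, kafkaParams, topics)

        stream.map(_._2).count().print() // placeholder processing

        ssc.start()
        ssc.awaitTermination()
      }
    }

With 10 partitions and 10-second batches, for example, a 10000 records/partition/second cap bounds each batch at roughly 1M records, so the backlog drains over many normal-sized batches rather than one enormous first batch.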