That's what I was looking for, thank you.

Unfortunately, none of

* spark.streaming.backpressure.initialRate
* spark.streaming.backpressure.enabled
* spark.streaming.receiver.maxRate
* spark.streaming.receiver.initialRate

changes how many records I get (I tried many different combinations).

The only configuration that works is spark.streaming.kafka.maxRatePerPartition.
That's better than nothing, but it'd be useful to have backpressure enabled for
automatic scaling.
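
For reference, here is roughly how I'm setting it; the app name, batch
interval, and rate value are just illustrative. With a 5-second batch
interval and 10 partitions, maxRatePerPartition=1000 should cap each batch
at 5 * 10 * 1000 = 50,000 records:

  import org.apache.spark.SparkConf
  import org.apache.spark.streaming.{Seconds, StreamingContext}

  // The per-partition cap is the only setting that limits batch size in my tests.
  val conf = new SparkConf()
    .setAppName("rate-limited-stream")  // illustrative app name
    .set("spark.streaming.kafka.maxRatePerPartition", "1000")
    // Backpressure is enabled but doesn't seem to change anything for me:
    .set("spark.streaming.backpressure.enabled", "true")

  val ssc = new StreamingContext(conf, Seconds(5))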

Do you have any idea why backpressure isn't working, or how I could debug it?
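
One thing I will try is raising the log level of the backpressure rate
estimator on the driver, to check whether it computes a new rate after each
batch at all. This is just a guess at a debugging approach, and I'm assuming
the estimator still lives at
org.apache.spark.streaming.scheduler.rate.PIDRateEstimator:

  import org.apache.log4j.{Level, LogManager}

  // Trace logging shows the PID estimator's inputs (processing time,
  // scheduling delay, records per batch) and the rate it proposes.
  LogManager.getLogger("org.apache.spark.streaming.scheduler.rate.PIDRateEstimator")
    .setLevel(Level.TRACE)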

On 10/11/2016 06:08 PM, Cody Koeninger wrote:

"This rate is upper bounded by the values
spark.streaming.receiver.maxRate and
spark.streaming.kafka.maxRatePerPartition if they are set (see

On Tue, Oct 11, 2016 at 10:57 AM, Samy Dindane wrote:

Is it possible to limit the size of the batches returned by the Kafka
consumer for Spark Streaming?
I am asking because the first batch I get has hundreds of millions of
records, and it takes ages to process and checkpoint them.
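
For context, I create the stream roughly like this (0.8 direct integration;
the broker and topic names are placeholders):

  import kafka.serializer.StringDecoder
  import org.apache.spark.streaming.kafka.KafkaUtils

  // Direct (receiverless) stream: each Kafka partition maps to one Spark
  // partition, so the spark.streaming.receiver.* settings don't apply to it.
  val kafkaParams = Map("metadata.broker.list" -> "localhost:9092")
  val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
    ssc, kafkaParams, Set("my-topic"))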

Thank you.

