Re: Spark Streaming on Compacted Kafka topic - consumes 1 message per partition per batch

2020-04-08 Thread Hrishikesh Mishra
It seems I found the issue. The actual problem is related to back pressure. When I add either of these configs, *spark.streaming.kafka.maxRatePerPartition* or *spark.streaming.backpressure.initialRate* (the value of these configs is 100), the job starts consuming one message per partition per batch.
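For context on what these settings are meant to do: with the direct Kafka stream, `spark.streaming.kafka.maxRatePerPartition` caps records per partition *per second*, so the intended per-batch ceiling is rate × partitions × batch interval. The sketch below is only illustrative arithmetic (the partition count and batch interval are assumed, not from the thread); the thread's observed behavior of one message per partition per batch is a deviation from this expected cap.

```python
def max_records_per_batch(max_rate_per_partition: int,
                          num_partitions: int,
                          batch_interval_sec: float) -> int:
    """Intended upper bound on records per micro-batch when
    spark.streaming.kafka.maxRatePerPartition is set."""
    return int(max_rate_per_partition * num_partitions * batch_interval_sec)

# Illustrative: rate cap 100 msg/s/partition, 12 partitions, 5 s batches
print(max_records_per_batch(100, 12, 5))  # → 6000
```

With a rate cap of 100, one message per partition per batch would only be expected if the effective batch interval were 0.01 s, which suggests the backpressure rate estimate, not the cap itself, was driving the tiny batches.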

Re: Spark Streaming on Compacted Kafka topic - consumes 1 message per partition per batch

2020-04-01 Thread Waleed Fateem
Well, this is interesting. I'm not sure if this is the expected behavior. The log messages you referenced are actually printed by the Kafka consumer itself (org.apache.kafka.clients.consumer.internals.Fetcher). That log message belongs to a new feature added starting with Kafka 1.1: https://is

Spark Streaming on Compacted Kafka topic - consumes 1 message per partition per batch

2020-03-31 Thread Hrishikesh Mishra
Hi, Our Spark Streaming job was working fine as expected (in terms of the number of events processed per batch). But for some reasons we enabled compaction on the Kafka topic and restarted the job. After the restart it was failing with the reason below: org.apache.spark.SparkException: Job aborted due to stage failure
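For readers reproducing the setup: enabling compaction on an existing topic is a per-topic config change. A minimal sketch using the standard Kafka CLI, assuming a hypothetical topic name and broker address (note that compacted topics can have non-contiguous offsets, which is relevant to the consumer behavior discussed in this thread):

```shell
# Switch an existing topic's cleanup policy to compaction
# (broker address and topic name are placeholders)
kafka-configs.sh --bootstrap-server localhost:9092 \
  --alter --entity-type topics --entity-name my-topic \
  --add-config cleanup.policy=compact
```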