Re: Flink kafka consumer stopped committing offsets

2018-06-19 Thread Juho Autio
Hi, Thanks for your analysis. We saw LeaderElectionRateAndTimeMs go to a non-zero value on Kafka around the same time this error appeared in the Flink job. Kafka itself recovers from this, and so do all the other consumers that we have. It seems like a bug in the Kafka consumer library if this

Re: Flink kafka consumer stopped committing offsets

2018-06-11 Thread amit pal
Your Kafka consumer is probably rebalancing. This can happen when message processing takes too long, so the Kafka broker marks your consumer dead and triggers a rebalance. All of this happens before the consumer can commit the offsets. On Mon, Jun 11, 2018 at 7:37 PM Piotr Nowojski wrote: > The

Re: Flink kafka consumer stopped committing offsets

2018-06-11 Thread Piotr Nowojski
The more I look into it, the more it seems like a Kafka bug or some cluster failure from which your Kafka cluster did not recover. In your case auto committing should be set to true, and in that case KafkaConsumer should commit offsets every so often while it’s polling messages. Unless for

Re: Flink kafka consumer stopped committing offsets

2018-06-11 Thread Juho Autio
Hi Piotr, thanks for your insights. > What’s your KafkaConsumer configuration? We only set these in the properties that are passed to the FlinkKafkaConsumer010 constructor: auto.offset.reset=latest bootstrap.servers=my-kafka-host:9092 group.id=my_group

Re: Flink kafka consumer stopped committing offsets

2018-06-11 Thread Piotr Nowojski
Hi, What’s your KafkaConsumer configuration? Especially values for:
- is checkpointing enabled?
- enable.auto.commit (or auto.commit.enable for Kafka 0.8) / auto.commit.interval.ms
- did you set setCommitOffsetsOnCheckpoints()?
Please also refer to
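
(Editorial sketch, not part of the original message.) The three questions above matter because, as far as the Flink Kafka connector documentation describes it, together they decide who commits offsets. The Python function below is a minimal model of that decision, not Flink's actual code. The name `offset_commit_mode` and the string return values are made up for illustration. It assumes that `enable.auto.commit` defaults to true, that `auto.commit.interval.ms` defaults to 5000, and that committing on checkpoints is on by default. The Kafka 0.8 key `auto.commit.enable` is not modeled.

```python
def offset_commit_mode(checkpointing_enabled, commit_on_checkpoints=True, props=None):
    """Model which component would commit offsets for a given setup.

    checkpointing_enabled: whether Flink checkpointing is on for the job.
    commit_on_checkpoints: the value given to setCommitOffsetsOnCheckpoints()
        (assumed to default to True).
    props: the Kafka properties passed to the consumer, as a dict.
    """
    props = props or {}
    # Assumed defaults: auto commit enabled, 5000 ms interval.
    auto_commit = str(props.get("enable.auto.commit", "true")).lower() == "true"
    interval_ms = int(props.get("auto.commit.interval.ms", 5000))

    if checkpointing_enabled:
        # With checkpointing on, the Kafka auto-commit settings are ignored.
        return "ON_CHECKPOINTS" if commit_on_checkpoints else "DISABLED"
    # Without checkpointing, committing is left to the Kafka client itself.
    if auto_commit and interval_ms > 0:
        return "KAFKA_PERIODIC"
    return "DISABLED"


# Properties like the ones quoted elsewhere in this thread, no checkpointing:
mode = offset_commit_mode(False, props={"group.id": "my_group"})
```

Under these assumptions, a job without checkpointing and without an explicit `enable.auto.commit` relies entirely on the Kafka client's periodic auto commit.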

Flink kafka consumer stopped committing offsets

2018-06-08 Thread Juho Autio
Hi, We have a Flink stream job that uses the Flink Kafka consumer. Normally it commits consumer offsets to Kafka. However, this stream ended up in a state where it's otherwise working just fine, but it isn't committing offsets to Kafka any more. The job keeps writing correct aggregation results to