Hi,

One of my Kafka 0.9.0.1 clusters (3 brokers,
default.replication.factor=2) had been working fine until yesterday.
The message volume was pretty low. There were no obvious problems except the
following.

The first symptom was *kafka-consumer-groups.sh* failing with an empty.head
exception.

When I used *kafka-topics --describe* I saw that one of the brokers was no
longer part of the appropriate ISRs.
Restarting that broker did not appear to solve the problem.
In fact, I got the impression that the broker was temporarily in the ISR
and then left again.
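
For anyone who wants to watch for this, here is a minimal sketch of how the ISR check could be automated. It assumes the partition lines of the *kafka-topics --describe* output look like "Topic: mytopic Partition: 5 Leader: 1 Replicas: 1,3 Isr: 1"; the function name under_replicated and the sample broker ids are made up for illustration, not taken from my cluster.

```python
import re

# Assumed shape of a partition line in `kafka-topics --describe` output:
#   Topic: mytopic  Partition: 5  Leader: 1  Replicas: 1,3  Isr: 1
LINE_RE = re.compile(
    r"Topic:\s*(?P<topic>\S+)\s+Partition:\s*(?P<partition>\d+)\s+"
    r"Leader:\s*(?P<leader>-?\d+)\s+Replicas:\s*(?P<replicas>[\d,]+)\s+"
    r"Isr:\s*(?P<isr>[\d,]*)"
)


def under_replicated(describe_output):
    """Return (topic, partition, missing_broker_ids) for every partition
    whose ISR is smaller than its replica list."""
    result = []
    for line in describe_output.splitlines():
        match = LINE_RE.search(line)
        if match is None:
            continue  # summary line or unrelated output
        replicas = {int(b) for b in match.group("replicas").split(",") if b}
        isr = {int(b) for b in match.group("isr").split(",") if b}
        missing = sorted(replicas - isr)
        if missing:
            result.append(
                (match.group("topic"), int(match.group("partition")), missing)
            )
    return result


# Hypothetical sample output; broker ids are illustrative only.
sample = (
    "Topic:mytopic\tPartitionCount:2\tReplicationFactor:2\tConfigs:\n"
    "\tTopic: mytopic\tPartition: 0\tLeader: 1\tReplicas: 1,2\tIsr: 1,2\n"
    "\tTopic: mytopic\tPartition: 5\tLeader: 1\tReplicas: 1,3\tIsr: 1\n"
)
print(under_replicated(sample))
```

Running something like this periodically and logging the result with a timestamp would show whether the broker really does rejoin the ISR briefly and then drop out again.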

I think I restarted each broker and eventually things returned to normal.
The problem then reoccurred a couple of hours later.

During this time, I also had a problem with one of the Kafka 0.8.2.1
clients:

ERROR kafka.consumer.ConsumerFetcherThread -
[ConsumerFetcherThread-mytopic-consumer-81a939d49903-1474231957102-90b6fc16-0-6],
Current offset 5488 for partition [mytopic,5] out of range; reset offset to
2340\n","stream":"stdout","time":"2016-09-18T21:13:54.151815047Z"}

This topic partition had >5488 messages, so the offset was definitely not
out of range. The result was that the consumer reprocessed old messages.
The lag as reported by kafka-consumer-groups.sh went from <10 to >2500.
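
As a rough sanity check on those numbers, here is a small sketch of the arithmetic. It assumes lag is simply the log end offset minus the consumer's current offset; the two offsets come from the log line above, and the helper name lag is just for illustration.

```python
def lag(log_end_offset, consumer_offset):
    """Lag as log end offset minus the consumer's current offset (assumed definition)."""
    return max(log_end_offset - consumer_offset, 0)


offset_before_reset = 5488  # "Current offset 5488" from the error message
offset_after_reset = 2340   # "reset offset to 2340" from the error message

# Messages the consumer has to reprocess because of the reset.
reprocessed = offset_before_reset - offset_after_reset
print(reprocessed)

# Even if the consumer had been fully caught up at 5488, the reset alone
# produces a lag of this size on that partition.
print(lag(offset_before_reset, offset_after_reset))
```

So the reset on this one partition accounts for roughly 3100 messages of lag, which is consistent with the jump to >2500 I saw.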

Thoughts? Recommendations for debugging this problem when it occurs again?

Chris
