One of my Kafka 0.9.0.1 clusters (3 brokers,
default.replication.factor=2) had been working fine until yesterday.
The message volume was pretty low. There were no obvious problems except....
The first symptom was *kafka-consumer-groups.sh* failing with an empty.head
When I used *kafka-topics --describe* I saw that one of the brokers was no
longer part of the appropriate ISRs.
Restarting that broker did not appear to solve the problem.
In fact, I got the impression that the broker was temporarily in the ISR
and then left again.
I think I restarted each broker and eventually things returned to normal.
The problem then reoccurred a couple of hours later.
During this time, I also had a problem with one of the Kafka 0.8.2.1
ERROR kafka.consumer.ConsumerFetcherThread -
Current offset 5488 for partition [mytopic,5] out of range; reset offset to
This topic partition had >5488 messages so the offset was definitely not
out of range. The result was that the consumer reprocessed old messages.
The lag as reported by kafka-consumer-groups.sh went from <10 to >2500
Thoughts? Recommendations for debugging this problem when it occurs again?
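
In case it helps with catching this next time, here is a rough sketch of how I could flag partitions whose ISR is smaller than their replica list by parsing the text printed by *kafka-topics --describe*. The function name `under_replicated` and the sample output are made up for illustration (they are not from my cluster), and the regex assumes the usual "Topic: ... Partition: ... Leader: ... Replicas: ... Isr: ..." line layout.

```python
import re

# Matches one per-partition line of `kafka-topics --describe` output.
# Assumption: fields appear in the order Topic, Partition, Leader, Replicas, Isr.
PARTITION_LINE = re.compile(
    r"Topic:\s*(\S+)\s+Partition:\s*(\d+)\s+Leader:\s*(\S+)"
    r"\s+Replicas:\s*([\d,]*)\s+Isr:\s*([\d,]*)"
)


def parse_ids(field):
    """Turn a comma-separated broker list such as '1,2' into a set of ints."""
    return {int(b) for b in field.split(",") if b}


def under_replicated(describe_output):
    """Return (topic, partition, missing_brokers) for each partition
    whose ISR does not contain every assigned replica."""
    results = []
    for line in describe_output.splitlines():
        m = PARTITION_LINE.search(line)
        if not m:
            continue  # skips the per-topic summary line and blank lines
        topic, partition = m.group(1), int(m.group(2))
        missing = sorted(parse_ids(m.group(4)) - parse_ids(m.group(5)))
        if missing:
            results.append((topic, partition, missing))
    return results


# Hypothetical sample output, for illustration only.
SAMPLE = (
    "Topic:mytopic\tPartitionCount:2\tReplicationFactor:2\tConfigs:\n"
    "\tTopic: mytopic\tPartition: 0\tLeader: 1\tReplicas: 1,2\tIsr: 1,2\n"
    "\tTopic: mytopic\tPartition: 1\tLeader: 3\tReplicas: 3,2\tIsr: 3\n"
)

if __name__ == "__main__":
    for topic, partition, missing in under_replicated(SAMPLE):
        print(f"{topic}-{partition}: brokers missing from ISR: {missing}")
```

Running something like this against the describe output every minute or so and logging the result with a timestamp should at least show when a broker drops out of the ISR and whether it briefly rejoins after a restart.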