We have a Kafka Streams consumer group that keeps moving to the PreparingRebalance state and then stops consuming. The pattern is as follows:
1. The consumer group runs stable for around 20 minutes.
2. New consumers (members) start to appear in the group state without any clear reason. These new members only originate from a small number of VMs (not the same VMs each time), and they keep joining.
3. The group state changes to PreparingRebalance.
4. All consumers stop consuming, showing these logs: "Group coordinator ... is unavailable or invalid, will attempt rediscovery"
5. The consumers on the VMs that generated the extra members show these logs:
   - "Offset commit failed on partition X at offset Y: The coordinator is not aware of this member."
   - "Failed to commit stream task X since it got migrated to another thread already. Closing it as zombie before triggering a new rebalance."
   - "Detected task Z that got migrated to another thread. This implies that this thread missed a rebalance and dropped out of the consumer group. Will try to rejoin the consumer group."
6. We kill all consumer processes on all VMs, the group moves to Empty with 0 members, we start the processes again, and we're back to step 1.

The Kafka (broker) version is 1.1.0 and the Streams version is 2.0.0.

We took thread dumps from the misbehaving consumers and didn't see more consumer threads than configured. We tried restarting the Kafka brokers and cleaning the ZooKeeper cache.

We suspect the issue has to do with missing heartbeats, but the default heartbeat interval is 3 seconds and our message handling times are nowhere near that. Has anyone encountered a similar behaviour?
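For reference, a minimal sketch of the timeout-related settings we have been looking at (the application id, broker address, and values are placeholders/illustrative, not our production config). One thing we noticed while reading up: since KIP-62 the heartbeat is sent from a background thread, so slow message handling should count against max.poll.interval.ms rather than the heartbeat itself:

```java
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.streams.StreamsConfig;

public class TimeoutSettings {
    static Properties streamsProps() {
        Properties props = new Properties();
        // Placeholders, not our real app id / brokers.
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "our-streams-app");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092");

        // A member is evicted when the coordinator sees no heartbeat within
        // session.timeout.ms; heartbeats go out every heartbeat.interval.ms
        // from a background thread (KIP-62), independently of processing time.
        props.put(StreamsConfig.consumerPrefix(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG), 10000);
        props.put(StreamsConfig.consumerPrefix(ConsumerConfig.HEARTBEAT_INTERVAL_MS_CONFIG), 3000);

        // Slow message handling counts against max.poll.interval.ms (the max
        // time between poll() calls), not against the heartbeat.
        props.put(StreamsConfig.consumerPrefix(ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG), 300000);
        return props;
    }
}
```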
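And this is roughly how we snapshot the group while the phantom members are joining, to see which client ids and hosts they map back to (a sketch against the AdminClient API in kafka-clients 2.0; the group id and broker address are placeholders):

```java
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.ConsumerGroupDescription;

public class GroupSnapshot {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092"); // placeholder

        try (AdminClient admin = AdminClient.create(props)) {
            ConsumerGroupDescription group = admin
                    .describeConsumerGroups(Collections.singleton("our-streams-app")) // placeholder
                    .describedGroups()
                    .get("our-streams-app")
                    .get();

            // Group state shows PreparingRebalance while the issue occurs.
            System.out.println("state = " + group.state());

            // One line per member: which VM (host) and client id it came from,
            // so phantom members can be traced back to specific processes.
            group.members().forEach(m -> System.out.println(
                    m.consumerId() + " clientId=" + m.clientId() + " host=" + m.host()));
        }
    }
}
```

The same view is available from the kafka-consumer-groups.sh tool via --describe --state and --describe --members, which the 1.1 tooling supports.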