[ https://issues.apache.org/jira/browse/KAFKA-13563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17465463#comment-17465463 ]
James Olsen commented on KAFKA-13563:
-------------------------------------

[~showuon] I've attached a reproducer ({{kafka.zip}}). It includes a {{docker-compose.yml}} that brings up a 3-node cluster and a {{Main}} class with a Producer and a Consumer.

P.S. The easiest way to find the current coordinator is to search the logs for `discovered`.

> Consumer failure after rolling Broker upgrade
> ---------------------------------------------
>
>                 Key: KAFKA-13563
>                 URL: https://issues.apache.org/jira/browse/KAFKA-13563
>             Project: Kafka
>          Issue Type: Bug
>          Components: clients
>            Reporter: Luke Chen
>            Assignee: Luke Chen
>            Priority: Major
>         Attachments: kafka.zip
>
>
> This failure occurred again during this month's rolling OS security updates
> to the Brokers (no change to Broker version). I have also been able to
> reproduce it locally with the following process:
>
> 1. Start a 3 Broker cluster with a Topic having Replicas=3.
> 2. Start a Client with Producer and Consumer communicating over the Topic.
> 3. Stop the Broker that is acting as the Group Coordinator.
> 4. Observe successful Rediscovery of new Group Coordinator.
> 5. Restart the stopped Broker.
> 6. Stop the Broker that became the new Group Coordinator at step 4.
> 7. Observe "Rediscovery will be attempted" message but no "Discovered group
> coordinator" message.
>
> In short, Group Coordinator Rediscovery only works for the first Broker
> failover, not for any subsequent failover.
>
> I conducted tests using 2.7.1 servers. The issue occurs with 2.7.1 and 2.7.2
> Clients. The issue does not occur with 2.5.1 or 2.7.0 Clients. This makes
> me suspect that https://issues.apache.org/jira/browse/KAFKA-10793 introduced
> this issue.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)