[ https://issues.apache.org/jira/browse/KAFKA-16395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Mao resolved KAFKA-16395.
-------------------------------
    Resolution: Not A Bug

> Producer should refresh metadata on a socket request timeout
> ------------------------------------------------------------
>
>                 Key: KAFKA-16395
>                 URL: https://issues.apache.org/jira/browse/KAFKA-16395
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: David Mao
>            Assignee: David Mao
>            Priority: Critical
>
> I noticed in a set of producer logs that on a broker outage, we saw the
> following sequence of logs:
> Got error produce response with correlation id 1661616 on topic-partition
> topic-0, retrying (2147483646 attempts left). Error: REQUEST_TIMED_OUT. Error
> Message: Disconnected from node 0 due to timeout
> Got error produce response with correlation id 1662093 on topic-partition
> topic-0, retrying (2147483646 attempts left). Error: NOT_LEADER_OR_FOLLOWER
> Received invalid metadata error in produce request on partition topic-0 due
> to org.apache.kafka.common.errors.NotLeaderOrFollowerException: For requests
> intended only for the leader, this error indicates that the broker is not the
> current leader. For requests intended for any replica, this error indicates
> that the broker is not a replica of the topic partition.. Going to request
> metadata update now
> this implies we did not request metadata between our produce request
> attempts. This is a regression introduced by
> https://issues.apache.org/jira/browse/KAFKA-14317.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)