David Mao created KAFKA-16395:
---------------------------------

             Summary: Producer should refresh metadata on a socket request timeout
                 Key: KAFKA-16395
                 URL: https://issues.apache.org/jira/browse/KAFKA-16395
             Project: Kafka
          Issue Type: Bug
            Reporter: David Mao
            Assignee: David Mao

In a set of producer logs captured during a broker outage, I noticed the following sequence:

  Got error produce response with correlation id 1661616 on topic-partition topic-0, retrying (2147483646 attempts left). Error: REQUEST_TIMED_OUT. Error Message: Disconnected from node 0 due to timeout

  Got error produce response with correlation id 1662093 on topic-partition topic-0, retrying (2147483646 attempts left). Error: NOT_LEADER_OR_FOLLOWER

  Received invalid metadata error in produce request on partition topic-0 due to org.apache.kafka.common.errors.NotLeaderOrFollowerException: For requests intended only for the leader, this error indicates that the broker is not the current leader. For requests intended for any replica, this error indicates that the broker is not a replica of the topic partition.. Going to request metadata update now

This implies that we did not request a metadata update between the two produce request attempts: the REQUEST_TIMED_OUT disconnect did not trigger a refresh, and only the later NOT_LEADER_OR_FOLLOWER error did.

This is a regression introduced by https://issues.apache.org/jira/browse/KAFKA-14317.
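
To make the expected behavior concrete, here is a minimal, self-contained sketch. It is not the producer's actual code: `MetadataRefreshSketch`, `FakeMetadata`, `handleRetriableError`, and the `refreshOnTimeout` flag are all hypothetical names used only for illustration. It replays the two errors from the logs above and counts how many metadata updates would be requested with and without a refresh on timeout.

```java
import java.util.List;

class MetadataRefreshSketch {
    // The two error codes that appear in the logs above.
    enum ProduceError { REQUEST_TIMED_OUT, NOT_LEADER_OR_FOLLOWER }

    /** Hypothetical stand-in for the producer's metadata cache. */
    static final class FakeMetadata {
        private int updateRequests = 0;
        void requestUpdate() { updateRequests++; }
        int updateRequests() { return updateRequests; }
    }

    /**
     * Decides whether a failed produce attempt should trigger a metadata update.
     * refreshOnTimeout = false models the behavior seen in the logs (assumption);
     * refreshOnTimeout = true models the behavior this issue asks for.
     */
    static void handleRetriableError(ProduceError error, FakeMetadata metadata,
                                     boolean refreshOnTimeout) {
        boolean invalidMetadata = error == ProduceError.NOT_LEADER_OR_FOLLOWER;
        boolean timedOut = error == ProduceError.REQUEST_TIMED_OUT;
        if (invalidMetadata || (timedOut && refreshOnTimeout)) {
            metadata.requestUpdate();
        }
    }

    /** Replays a sequence of errors and returns the number of update requests. */
    static int updatesAfter(List<ProduceError> errors, boolean refreshOnTimeout) {
        FakeMetadata metadata = new FakeMetadata();
        for (ProduceError error : errors) {
            handleRetriableError(error, metadata, refreshOnTimeout);
        }
        return metadata.updateRequests();
    }

    public static void main(String[] args) {
        List<ProduceError> observed = List.of(
                ProduceError.REQUEST_TIMED_OUT,
                ProduceError.NOT_LEADER_OR_FOLLOWER);
        System.out.println("no refresh on timeout: " + updatesAfter(observed, false));
        System.out.println("refresh on timeout:    " + updatesAfter(observed, true));
    }
}
```

In this sketch, the first attempt's timeout requests no update unless `refreshOnTimeout` is set, so the retry can go to the same stale leader and only the NOT_LEADER_OR_FOLLOWER response triggers the refresh, which matches the sequence in the logs.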