[ https://issues.apache.org/jira/browse/KAFKA-10520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17209737#comment-17209737 ]
Rajini Sivaram commented on KAFKA-10520:
----------------------------------------

[~ableegoldman] Yes, will try and get this done in time for 2.7.0 code freeze.

> InitProducerId may be blocked if least loaded node is not ready to send
> -----------------------------------------------------------------------
>
>                 Key: KAFKA-10520
>                 URL: https://issues.apache.org/jira/browse/KAFKA-10520
>             Project: Kafka
>          Issue Type: Bug
>          Components: producer
>            Reporter: Rajini Sivaram
>            Assignee: Rajini Sivaram
>            Priority: Major
>             Fix For: 2.7.0
>
>
> From the logs of a failing producer that shows InitProducerId timing out
> after request timeout, it looks like we don't poll while waiting for
> transactional producer to be initialized and FindCoordinator request cannot
> be sent. The producer configuration used one bootstrap server and
> `max.in.flight.requests.per.connection=1`. The failing sequence:
> # Producer sends MetadataRequest to least loaded node (bootstrap server)
> # Producer is ready to send InitProducerId, needs to find transaction
> coordinator
> # Producer creates FindCoordinator request, but the only node known is the
> bootstrap server. Producer cannot send to this node since there is already
> the Metadata request in flight and max.inflight is 1.
> # Producer waits without polling, so Metadata response is not processed.
> InitProducerId times out eventually.
>
>
> We need to update the condition used to determine whether Sender should
> poll() to fix this issue.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
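
For reference, the producer configuration described in the quoted issue (one bootstrap server, `max.in.flight.requests.per.connection=1`, transactional producer) could be written roughly as follows. This is only a sketch: the broker address and the transactional id are hypothetical placeholders, since the issue does not give them, and the class name `ReproConfig` is made up for illustration.

```java
import java.util.Properties;

public class ReproConfig {
    // Hypothetical address: the issue only says a single bootstrap server was used.
    static final String BOOTSTRAP = "localhost:9092";

    static Properties build() {
        Properties props = new Properties();
        // Only one node is known to the client before metadata arrives.
        props.setProperty("bootstrap.servers", BOOTSTRAP);
        // At most one request in flight per connection, as in the report.
        props.setProperty("max.in.flight.requests.per.connection", "1");
        // Hypothetical id: setting transactional.id makes the producer transactional.
        props.setProperty("transactional.id", "repro-txn-id");
        return props;
    }

    public static void main(String[] args) {
        // Print the settings; ordering is not guaranteed by Properties.
        build().forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

With the kafka-clients library on the classpath, these properties (plus key and value serializers) would presumably be passed to a `KafkaProducer`, and `initTransactions()` would be the call that waits on InitProducerId. Whether the timeout reproduces depends on timing, since the Metadata request must still be in flight when FindCoordinator is created.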