[jira] [Commented] (KAFKA-10520) InitProducerId may be blocked if least loaded node is not ready to send
[ https://issues.apache.org/jira/browse/KAFKA-10520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17209737#comment-17209737 ]

Rajini Sivaram commented on KAFKA-10520:

[~ableegoldman] Yes, will try and get this done in time for 2.7.0 code freeze.

> InitProducerId may be blocked if least loaded node is not ready to send
> ---
>
>                 Key: KAFKA-10520
>                 URL: https://issues.apache.org/jira/browse/KAFKA-10520
>             Project: Kafka
>          Issue Type: Bug
>          Components: producer
>            Reporter: Rajini Sivaram
>            Assignee: Rajini Sivaram
>            Priority: Major
>             Fix For: 2.7.0
>
>
> From the logs of a failing producer that shows InitProducerId timing out
> after request timeout, it looks like we don't poll while waiting for
> transactional producer to be initialized and FindCoordinator request cannot
> be sent. The producer configuration used one bootstrap server and
> `max.in.flight.requests.per.connection=1`. The failing sequence:
> # Producer sends MetadataRequest to least loaded node (bootstrap server)
> # Producer is ready to send InitProducerId, needs to find transaction
> coordinator
> # Producer creates FindCoordinator request, but the only node known is the
> bootstrap server. Producer cannot send to this node since there is already
> the Metadata request in flight and max.inflight is 1.
> # Producer waits without polling, so Metadata response is not processed.
> InitProducerId times out eventually.
>
>
> We need to update the condition used to determine whether Sender should
> poll() to fix this issue.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
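For readers who want to picture the setup the issue describes (one bootstrap server, `max.in.flight.requests.per.connection=1`, a transactional producer), here is a minimal sketch of such a configuration. The config keys are standard Kafka producer settings. The host, port, transactional id, and class name are hypothetical placeholders, not values taken from the report.

```java
import java.util.Properties;

// Sketch of the producer settings mentioned in the issue.
// Only java.util.Properties is used, so this compiles without the Kafka client on the classpath.
class TransactionalProducerConfigSketch {

    static Properties reportedConfig() {
        Properties props = new Properties();
        // A single bootstrap server, as in the report (address is a hypothetical placeholder).
        props.setProperty("bootstrap.servers", "localhost:9092");
        // Setting a transactional.id makes the producer transactional (value is hypothetical).
        props.setProperty("transactional.id", "example-transactional-id");
        // At most one outstanding request per broker connection, as in the report.
        props.setProperty("max.in.flight.requests.per.connection", "1");
        return props;
    }

    public static void main(String[] args) {
        reportedConfig().forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

With the Kafka client library available, these properties (plus key and value serializer settings, omitted here) would be passed to `KafkaProducer`, and `initTransactions()` is the call during which InitProducerId is sent. Whether the timeout actually reproduces presumably depends on request timing, so this is a description of the setup rather than a guaranteed reproducer.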
[jira] [Commented] (KAFKA-10520) InitProducerId may be blocked if least loaded node is not ready to send
[ https://issues.apache.org/jira/browse/KAFKA-10520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17208995#comment-17208995 ]

Sophie Blee-Goldman commented on KAFKA-10520:

Hey [~rsivaram], is this something we can get fixed for 2.7? I'm just asking because the freeze deadlines are approaching, and this seems like it might be a simple fix for a pretty much fatal error (although workarounds do exist)

> InitProducerId may be blocked if least loaded node is not ready to send
> ---
>
>                 Key: KAFKA-10520
>                 URL: https://issues.apache.org/jira/browse/KAFKA-10520
>             Project: Kafka
>          Issue Type: Bug
>          Components: producer
>            Reporter: Rajini Sivaram
>            Priority: Major
>             Fix For: 2.7.0
>
>
> From the logs of a failing producer that shows InitProducerId timing out
> after request timeout, it looks like we don't poll while waiting for
> transactional producer to be initialized and FindCoordinator request cannot
> be sent. The producer configuration used one bootstrap server and
> `max.in.flight.requests.per.connection=1`. The failing sequence:
> # Producer sends MetadataRequest to least loaded node (bootstrap server)
> # Producer is ready to send InitProducerId, needs to find transaction
> coordinator
> # Producer creates FindCoordinator request, but the only node known is the
> bootstrap server. Producer cannot send to this node since there is already
> the Metadata request in flight and max.inflight is 1.
> # Producer waits without polling, so Metadata response is not processed.
> InitProducerId times out eventually.
>
>
> We need to update the condition used to determine whether Sender should
> poll() to fix this issue.
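The failing sequence quoted in the issue can be illustrated with a small toy model. This is not Kafka's `Sender` or `NetworkClient` code. The class and method names below are hypothetical, and "poll" here just means "process the oldest outstanding response". The sketch only shows why, with a single known node and a limit of one in-flight request, FindCoordinator can never be sent unless the loop keeps polling.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model of the stall described in KAFKA-10520 (hypothetical names throughout).
class InitProducerIdStallSketch {

    /** One broker connection that allows at most maxInFlight outstanding requests. */
    static final class Connection {
        private final int maxInFlight;
        private final Deque<String> inFlight = new ArrayDeque<>();

        Connection(int maxInFlight) {
            this.maxInFlight = maxInFlight;
        }

        /** Sends the request only if the in-flight limit allows it. */
        boolean trySend(String request) {
            if (inFlight.size() >= maxInFlight) {
                return false;
            }
            inFlight.addLast(request);
            return true;
        }

        /** Models poll(): the oldest in-flight request, if any, gets its response processed. */
        void poll() {
            inFlight.pollFirst();
        }
    }

    /**
     * Runs up to maxIterations loop iterations and returns the iteration in which
     * FindCoordinator was sent, or -1 if it was never sent (the "timeout" case).
     */
    static int iterationsUntilFindCoordinatorSent(boolean pollWhileBlocked, int maxIterations) {
        Connection bootstrap = new Connection(1);   // max.in.flight.requests.per.connection=1
        bootstrap.trySend("Metadata");              // step 1: Metadata request is in flight
        for (int i = 1; i <= maxIterations; i++) {
            // Steps 2-3: FindCoordinator must go to the only known node.
            if (bootstrap.trySend("FindCoordinator")) {
                return i;
            }
            // Step 4: without polling, the Metadata response is never processed.
            if (pollWhileBlocked) {
                bootstrap.poll();
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println("without poll: " + iterationsUntilFindCoordinatorSent(false, 100));
        System.out.println("with poll: " + iterationsUntilFindCoordinatorSent(true, 100));
    }
}
```

In this model the non-polling loop never frees the connection, while the polling loop sends FindCoordinator on its second iteration. The actual fix proposed in the issue is a change to the condition the Sender uses to decide whether to poll(), and its details are not modelled here.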