[jira] [Commented] (KAFKA-10520) InitProducerId may be blocked if least loaded node is not ready to send

2020-10-07 Thread Rajini Sivaram (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-10520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17209737#comment-17209737
 ] 

Rajini Sivaram commented on KAFKA-10520:


[~ableegoldman] Yes, I will try to get this done in time for the 2.7.0 code freeze.

> InitProducerId may be blocked if least loaded node is not ready to send
> ---
>
> Key: KAFKA-10520
> URL: https://issues.apache.org/jira/browse/KAFKA-10520
> Project: Kafka
> Issue Type: Bug
> Components: producer
> Reporter: Rajini Sivaram
> Assignee: Rajini Sivaram
> Priority: Major
> Fix For: 2.7.0
>
>
> Logs from a failing producer show InitProducerId timing out after the
> request timeout. It looks like we don't poll while waiting for the
> transactional producer to be initialized, so the FindCoordinator request
> cannot be sent. The producer configuration used one bootstrap server and
> `max.in.flight.requests.per.connection=1`. The failing sequence:
>  # Producer sends MetadataRequest to the least loaded node (the bootstrap
> server).
>  # Producer is ready to send InitProducerId and needs to find the transaction
> coordinator.
>  # Producer creates a FindCoordinator request, but the only known node is the
> bootstrap server. Producer cannot send to this node because the Metadata
> request is already in flight and `max.in.flight.requests.per.connection` is 1.
>  # Producer waits without polling, so the Metadata response is not processed.
> InitProducerId eventually times out.
>
> We need to update the condition used to determine whether Sender should
> poll() to fix this issue.
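
The failing sequence quoted above can be illustrated with a small toy model. This is only a sketch of the reasoning, not Kafka's actual Sender code: `ToyNode`, `run_sender`, and the `poll_when_blocked` flag are hypothetical names invented for this example, and a single call to `poll()` is assumed to complete every in-flight request.

```python
from dataclasses import dataclass, field

# Mirrors max.in.flight.requests.per.connection=1 from the report.
MAX_IN_FLIGHT = 1


@dataclass
class ToyNode:
    """Hypothetical stand-in for the single bootstrap server connection."""
    in_flight: list = field(default_factory=list)

    def can_send(self) -> bool:
        # The node is "ready" only while it is below the in-flight limit.
        return len(self.in_flight) < MAX_IN_FLIGHT

    def send(self, request: str) -> None:
        if not self.can_send():
            raise RuntimeError("in-flight limit reached")
        self.in_flight.append(request)

    def poll(self) -> list:
        # Assumption: polling processes all pending responses at once,
        # which frees the connection for the next request.
        completed, self.in_flight = self.in_flight, []
        return completed


def run_sender(poll_when_blocked: bool, max_iterations: int = 5) -> bool:
    """Return True if FindCoordinator is sent within max_iterations passes."""
    node = ToyNode()
    node.send("Metadata")  # step 1: Metadata request is in flight
    for _ in range(max_iterations):
        if node.can_send():
            node.send("FindCoordinator")  # step 3 succeeds
            return True
        if poll_when_blocked:
            node.poll()  # Metadata response is processed
        # Otherwise the sender waits without polling (step 4).
    return False  # stands in for InitProducerId timing out


print(run_sender(poll_when_blocked=False))
print(run_sender(poll_when_blocked=True))
```

In this model the only difference between the two runs is whether the loop polls while the node is not ready, which is the condition the description proposes to change.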



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-10520) InitProducerId may be blocked if least loaded node is not ready to send

2020-10-06 Thread Sophie Blee-Goldman (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-10520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17208995#comment-17208995
 ] 

Sophie Blee-Goldman commented on KAFKA-10520:

Hey [~rsivaram], is this something we can get fixed for 2.7? I'm just asking 
because the freeze deadlines are approaching, and this seems like it might be a 
simple fix for a pretty much fatal error (although workarounds do exist).



