[jira] [Updated] (KAFKA-6101) Reconnecting to broker does not exponentially backoff

Sean Rohead (JIRA) Fri, 20 Oct 2017 15:40:18 -0700

     [ 
https://issues.apache.org/jira/browse/KAFKA-6101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Sean Rohead updated KAFKA-6101:
-------------------------------
    Attachment: text.html

I'm fairly new to Kafka, so I'm not set up to easily apply a source patch 
and try your fix, but I can try to check out the code and apply your patch.

Your patch is pretty straightforward, and I can confirm that the existing 
nodeState in the map has a state of ConnectionState.DISCONNECTED when it is 
being overwritten.

ClusterConnectionStates.disconnected is being called from NetworkClient.poll -- 
the line that reads: handleDisconnections(responses, updatedNow)

ClusterConnectionStates.connecting is called from the same method, but from the 
line that reads: long metadataTimeout = metadataUpdater.maybeUpdate(now)

I don't know enough about how the code is supposed to work, but it seems like 
every call to poll() is trying to connect and then immediately disconnecting 
when it fails, which means that we are always starting back at the beginning in 
terms of the number of retries and exponential backoff.
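
To make the suspected behavior concrete, here is a minimal sketch in Java. It 
is not Kafka's actual ClusterConnectionStates code: the class 
ReconnectBackoffSketch, its methods, the 50 ms base delay, and the doubling 
formula are all assumptions made for illustration, and jitter is left out. 
Only the 60000 ms cap comes from the reported configuration. The sketch 
compares replacing the per-node state on every connect attempt with keeping 
the existing state.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative only: a simplified stand-in, not Kafka's ClusterConnectionStates. */
final class ReconnectBackoffSketch {

    /** Hypothetical per-node state; the real node state holds more fields. */
    static final class NodeState {
        int failedAttempts = 0;
    }

    private final Map<String, NodeState> nodeStates = new HashMap<>();
    private final boolean overwriteOnConnect;

    ReconnectBackoffSketch(boolean overwriteOnConnect) {
        this.overwriteOnConnect = overwriteOnConnect;
    }

    /** Called when a connection attempt starts. */
    void connecting(String nodeId) {
        if (overwriteOnConnect || !nodeStates.containsKey(nodeId)) {
            // Replacing the entry discards any failure count recorded so far.
            nodeStates.put(nodeId, new NodeState());
        }
    }

    /** Called when the attempt fails; returns the delay before the next attempt. */
    long disconnected(String nodeId, long baseMs, long maxMs) {
        NodeState state = nodeStates.get(nodeId);
        state.failedAttempts++;
        return backoffMs(state.failedAttempts, baseMs, maxMs);
    }

    /** Assumed formula: base * 2^(failures - 1), capped at maxMs, no jitter. */
    static long backoffMs(int failedAttempts, long baseMs, long maxMs) {
        int exponent = Math.min(Math.max(failedAttempts - 1, 0), 30);
        return Math.min(baseMs * (1L << exponent), maxMs);
    }

    /** Simulates consecutive failed attempts and returns the last delay. */
    static long delayAfter(int attempts, boolean overwriteOnConnect) {
        ReconnectBackoffSketch sketch = new ReconnectBackoffSketch(overwriteOnConnect);
        long delay = 0;
        for (int i = 0; i < attempts; i++) {
            sketch.connecting("-1");
            // 50 ms base is an assumed value; 60000 ms is the cap from the report.
            delay = sketch.disconnected("-1", 50L, 60_000L);
        }
        return delay;
    }

    public static void main(String[] args) {
        System.out.println("overwrite on connect: " + delayAfter(8, true) + " ms");
        System.out.println("state preserved:      " + delayAfter(8, false) + " ms");
    }
}
```

In this sketch the overwriting variant never sees more than one failed 
attempt, so its delay stays at the base value no matter how many attempts 
fail, while the preserving variant grows toward the configured maximum.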

On Fri, Oct 20, 2017 at 3:56 PM, Ted Yu (JIRA) <j...@apache.org> wrote:

     [ 
https://issues.apache.org/jira/browse/KAFKA-6101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated KAFKA-6101:
--------------------------
    Attachment: 6101.v1.txt




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)



> Reconnecting to broker does not exponentially backoff
> -----------------------------------------------------
>
>                 Key: KAFKA-6101
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6101
>             Project: Kafka
>          Issue Type: Bug
>          Components: clients
>    Affects Versions: 0.11.0.0
>            Reporter: Sean Rohead
>         Attachments: 6101.v1.txt, text.html
>
>
> I am using com.typesafe.akka:akka-stream-kafka:0.17 which relies on 
> kafka-clients:0.11.0.0.
> I have set the reconnect.backoff.max.ms property to 60000.
> When I start the application without kafka running, I see a flood of the 
> following log message:
> [warn] o.a.k.c.NetworkClient - Connection to node -1 could not be 
> established. Broker may not be available.
> The log messages occur several times a second and the frequency of these 
> messages does not decrease over time as would be expected if exponential 
> backoff was working properly.
> I set a breakpoint in the debugger in ClusterConnectionStates:188 and noticed 
> that every time this breakpoint is hit, nodeState.failedAttempts is always 0. 
> This is why the delay does not increase exponentially. It also appears that 
> every time the breakpoint is hit, it is on a different instance, so even 
> though the number of failedAttempts is incremented, we never get the 
> breakpoint for the same instance more than one time.



