[ https://issues.apache.org/jira/browse/ZOOKEEPER-2869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16117378#comment-16117378 ]

Dan Benediktson commented on ZOOKEEPER-2869:
--------------------------------------------

Our fork has already solved this problem using a standard jittered 
exponential backoff algorithm. Part of the problem we were addressing there 
was thundering herds, so jitter was deemed necessary.
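
For anyone not familiar with the term, here is a minimal sketch of what 
jittered exponential backoff usually looks like. This is not the code from our 
fork and not an existing ZooKeeper API: the class name JitteredBackoff, its 
methods, the base/cap parameters, and the choice of the "full jitter" variant 
(a uniform random delay below an exponentially growing ceiling) are all 
assumptions made for illustration.

```java
import java.util.Random;

/**
 * Hypothetical helper (not part of ZooKeeper): "full jitter" exponential
 * backoff. Each delay is drawn from [0, ceiling), and the ceiling doubles
 * after every call until it reaches the cap.
 */
final class JitteredBackoff {
    private final long baseMillis;
    private final long capMillis;
    private final Random random;
    private long ceilingMillis;

    JitteredBackoff(long baseMillis, long capMillis, Random random) {
        if (baseMillis <= 0 || capMillis < baseMillis) {
            throw new IllegalArgumentException("need 0 < baseMillis <= capMillis");
        }
        this.baseMillis = baseMillis;
        this.capMillis = capMillis;
        this.random = random;
        this.ceilingMillis = baseMillis;
    }

    /** Returns the next delay in milliseconds and grows the ceiling. */
    long nextDelayMillis() {
        // Jitter: pick a (approximately uniform) delay in [0, ceiling).
        long delay = (long) (random.nextDouble() * ceilingMillis);
        // Double the ceiling for the next failure; the comparison against
        // cap / 2 avoids overflowing a long before clamping to the cap.
        ceilingMillis = (ceilingMillis > capMillis / 2) ? capMillis : ceilingMillis * 2;
        return delay;
    }

    /** Call after a successful connection so the next failure starts small. */
    void reset() {
        ceilingMillis = baseMillis;
    }
}
```

With a base of 1000 ms the first delay falls in [0, 1000), which matches the 
current client behaviour; later delays spread out over a growing window. A 
reconnect loop would sleep for nextDelayMillis() before each attempt and call 
reset() once the connection is established. How this would be wired into 
ClientCnxn.SendThread is not shown here.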

I wouldn't mind porting our code and offering a patch for it; anything that 
gets us closer to upstream is goodness. However, we really need to take the fix 
I provided a year ago for ZOOKEEPER-2471 before doing this. Otherwise, allowing 
a backoff higher than 1 second will dramatically increase the likelihood of 
clients getting completely wedged in a sleep/retry loop.

> Allow for exponential backoff in ClientCnxn.SendThread on connection 
> re-establishment
> -------------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-2869
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2869
>             Project: ZooKeeper
>          Issue Type: Improvement
>          Components: java client
>    Affects Versions: 3.4.10, 3.5.3
>            Reporter: Nick Travers
>            Priority: Minor
>
> As part of ZOOKEEPER-961, when the client re-establishes a connection to the 
> server, it will sleep for a random number of milliseconds in the range [0, 
> 1000). This was introduced 
> [here|https://github.com/apache/zookeeper/commit/d84dc077d576b7cdfbfd003e3425fab85ca29a44].
> These reconnects can cause excessive logging in clients if the server is 
> unavailable for an extended period of time, with reconnects every 500ms on 
> average.
> One solution could be to allow for exponential backoff in the client. The 
> backoff params could be made configurable.
> [3.5.x 
> code|https://github.com/apache/zookeeper/blob/release-3.5.3/src/java/main/org/apache/zookeeper/ClientCnxn.java#L1059].
> [3.4.x 
> code|https://github.com/apache/zookeeper/blob/release-3.4.9/src/java/main/org/apache/zookeeper/ClientCnxn.java#L1051].



