[ 
https://issues.apache.org/jira/browse/CURATOR-570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17129856#comment-17129856
 ] 

Cam McKenzie commented on CURATOR-570:
--------------------------------------

[~randgalt], the code in question is in EnsembleTracker.
{code:java}
public static String configToConnectionString(QuorumVerifier data) throws Exception
{code}
This turns the QuorumVerifier data returned from ZooKeeper into the connection 
string that Curator uses.

There are 2 minor issues.

1.) ZooKeeper appears to always return IPs in the config update calls (based on 
my testing; I'm not 100% sure on this). So even if the user of Curator provides 
the full set of ZK nodes in the connection string, if they are given as 
hostnames rather than IPs, a config event always results in an update to the 
connection string.

2.) The ordering of IP addresses may not be deterministic. I haven't seen 
ordering issues, but [~ryarran] has mentioned that he has. On the ZK side, I 
believe the data is stored in a map before streaming, so ordering is not 
guaranteed.

So, one possible solution is to:

- Enforce ordering of IP addresses on the Curator side. I don't believe this 
has any effect on how connections to ZK are actually established. From memory, 
a host is chosen at random to connect to, in order to distribute load.

- Convert hostnames to IP addresses and order them in the initial connection 
string provided to Curator. When the specified connection string contains all 
configured ZK nodes, this prevents one unnecessary call to updateServerList() 
in the case where the client provides hostnames and ZK returns IP addresses.
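
To make the two steps above concrete, here is a minimal sketch of what such a 
normalization could look like. It is not Curator code: the class 
ConnectionStringNormalizer and its methods are hypothetical names, and it 
assumes plain host:port entries (hostnames or IPv4 literals) with an optional 
chroot suffix. Ordering is simple lexicographic ordering, chosen only because 
it is deterministic.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.TreeSet;
import java.util.function.UnaryOperator;

// Hypothetical helper, not part of Curator or ZooKeeper.
final class ConnectionStringNormalizer {

    // Resolve a hostname to an IP literal. Only the first address returned
    // by DNS is used; on failure the original name is kept unchanged.
    static String resolveToIp(String host) {
        try {
            return InetAddress.getByName(host).getHostAddress();
        } catch (UnknownHostException e) {
            return host;
        }
    }

    // Normalize "host:port,host:port[/chroot]" by resolving each host and
    // sorting the entries, so equal ensembles yield equal strings.
    static String normalize(String connectionString, UnaryOperator<String> resolver) {
        String servers = connectionString;
        String chroot = "";
        int slash = connectionString.indexOf('/');
        if (slash >= 0) {
            servers = connectionString.substring(0, slash);
            chroot = connectionString.substring(slash);
        }

        // TreeSet gives a deterministic (lexicographic) order and drops duplicates.
        TreeSet<String> sorted = new TreeSet<>();
        for (String entry : servers.split(",")) {
            String trimmed = entry.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            // Assumes hostnames or IPv4 literals; IPv6 literals are not handled.
            int colon = trimmed.lastIndexOf(':');
            String host = colon >= 0 ? trimmed.substring(0, colon) : trimmed;
            String port = colon >= 0 ? trimmed.substring(colon) : "";
            sorted.add(resolver.apply(host) + port);
        }
        return String.join(",", sorted) + chroot;
    }

    // Convenience overload that uses real DNS resolution.
    static String normalize(String connectionString) {
        return normalize(connectionString, ConnectionStringNormalizer::resolveToIp);
    }
}
```

With something like this, the idea would be to normalize both the initial 
connection string and the string derived from each config event, and only call 
updateServerList() when the two normalized strings differ. The resolver is 
passed in as a parameter so the comparison logic can be exercised without 
real DNS lookups.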

> Excessive calls to ZooKeeper.updateServerList (which can result in session 
> death)
> ---------------------------------------------------------------------------------
>
>                 Key: CURATOR-570
>                 URL: https://issues.apache.org/jira/browse/CURATOR-570
>             Project: Apache Curator
>          Issue Type: Bug
>          Components: Framework
>    Affects Versions: 4.2.0, 4.3.0
>            Reporter: Rhys Yarranton
>            Priority: Major
>
> On suspend and reconnect, Curator calls ZooKeeper.updateServerList via 
> ConnectionState.checkState --> ConnectionState.handleNewConnectionString.  In 
> addition, recipes may be triggered by this as well, and they too make calls to 
> ZooKeeper.updateServerList via ConnectionState.checkTimeouts --> 
> ConnectionState.handleNewConnectionString.
> This happens even though the connection string has not actually changed.
> Due to ZOOKEEPER-3825, this can cause the connection to be closed 
> immediately.  On its own this would be perceived as a glitch.  But due to the 
> Curator-induced calls, what we see is a cycle of SUSPENDED/RECONNECTED, until 
> eventually the session dies and a new session is recreated.
> Based on the source code (at time of writing), ZooKeeper.updateServerList is 
> not intended to be called frequently like this.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
