Alex Rankin created CURATOR-392:
-----------------------------------

             Summary: Zookeeper Ensemble Get Incorrect Address
                 Key: CURATOR-392
                 URL: https://issues.apache.org/jira/browse/CURATOR-392
             Project: Apache Curator
          Issue Type: Bug
          Components: Framework
    Affects Versions: 3.2.1
         Environment: ZooKeeper 3.5.1-alpha
            Reporter: Alex Rankin


I've noticed an issue with Curator 3.2.1 which relates to the fix from 
CURATOR-345 (also reported by me).

When we reconnected after losing the connection to ZooKeeper (due to network 
issues), our services always ended up with the wrong connection string and never 
managed to reconnect to the ZooKeeper cluster. Assuming that 10.1.2.3 is our 
ZooKeeper server, we saw the following in two scenarios (with different zoo.cfg 
files) when a reconnection was attempted:

{quote}
*Scenario 1:* ClientCnxn - Opening socket connection to server 
0.0.0.0/0.0.0.0:2181.
*Scenario 2:* ClientCnxn - Opening socket connection to server 
10.1.2.3/10.1.2.3:2888.
{quote}

Both of these connection strings are wrong. The issue arises in 
EnsembleTracker.processConfigData() when we reconnect to ZooKeeper. The config 
coming from ZooKeeper is in 
[the format|https://zookeeper.apache.org/doc/trunk/zookeeperReconfig.html#sc_reconfig_clientport]:

{quote}
server.<positive id> = <address1>:<port1>:<port2>\[:role\];\[<client port address>:\]<client port>
{quote}

As we can see, both \[:role\] and \[<client port address>:\] are optional. 
Hence, the following string is perfectly valid:

{quote}
server.1=10.1.2.3:2888:3888:participant;2181
{quote}

When Zookeeper sends this, it defaults the clientAddress to 0.0.0.0, so we 
retrieve the following value in EnsembleTracker:

{quote}
server.1=10.1.2.3:2888:3888:participant;0.0.0.0:2181
{quote}

The resulting connection string therefore turns into 0.0.0.0:2181 instead of 
10.1.2.3:2181, and Curator creates a new ZooKeeper client to connect to that 
IP, which never works.

In the second scenario, our server entry looks a bit different. It does not 
follow the documented format, but it is still accepted as valid:

{quote}
server.1=10.1.2.3:2888:3888:participant
{quote}

This entry is missing the client port and address. As a result, the string 
produced by the EnsembleTracker is 10.1.2.3:2888, which isn't what we want. 
Adding the client port would just lead to Scenario 1 above.

From what I can see, the EnsembleTracker.configToConnectionString() method is 
the issue here:

{code}
InetSocketAddress address = Objects.firstNonNull(server.clientAddr, server.addr);
sb.append(address.getAddress().getHostAddress()).append(":").append(address.getPort());
{code}

In the above cases, both the server.addr and server.clientAddr values are 
wrong. We also prefer the value of clientAddr for some reason, which doesn't 
look right to me (given that it can be 0.0.0.0 or 127.0.0.1).
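
To make the two scenarios concrete, here is a minimal sketch that mimics the selection logic quoted above using only JDK classes. The class name CurrentSelectionDemo, the local firstNonNull stand-in, and the select helper are illustrative assumptions, not Curator code. The addresses are the ones from this report.

```java
import java.net.InetSocketAddress;

class CurrentSelectionDemo {
    // Local stand-in for the firstNonNull call in the quoted snippet:
    // returns the first argument unless it is null.
    static <T> T firstNonNull(T first, T second) {
        return first != null ? first : second;
    }

    // Mirrors the quoted logic: prefer clientAddr, fall back to addr.
    static String select(InetSocketAddress clientAddr, InetSocketAddress addr) {
        InetSocketAddress address = firstNonNull(clientAddr, addr);
        return address.getAddress().getHostAddress() + ":" + address.getPort();
    }

    public static void main(String[] args) {
        InetSocketAddress addr = new InetSocketAddress("10.1.2.3", 2888);

        // Scenario 1: clientAddr arrives as the wildcard address.
        System.out.println(select(new InetSocketAddress("0.0.0.0", 2181), addr)); // 0.0.0.0:2181

        // Scenario 2: no clientAddr at all, so the quorum port leaks through.
        System.out.println(select(null, addr)); // 10.1.2.3:2888
    }
}
```

Both outputs match the addresses seen in the ClientCnxn log lines quoted earlier.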

It seems to me that Curator should use the host from 
server.addr.getAddress().getHostAddress() together with 
server.clientAddr.getPort(). When the clientAddr is missing, however, I'm not 
sure what should be done.
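
A rough sketch of that suggestion, using only JDK classes, might look like the following. ProposedSelectionSketch and hostAndPort are hypothetical names, not Curator APIs. Returning an empty Optional when clientAddr is missing is only one possible answer to the open question, chosen here so that the caller has to decide explicitly.

```java
import java.net.InetSocketAddress;
import java.util.Optional;

class ProposedSelectionSketch {
    // Hypothetical helper (not a Curator API): host from addr, port from clientAddr.
    static Optional<String> hostAndPort(InetSocketAddress addr, InetSocketAddress clientAddr) {
        if (addr == null || clientAddr == null) {
            // Without clientAddr there is no client port to use. Returning empty
            // is an assumption of this sketch, not a decided behaviour.
            return Optional.empty();
        }
        // Assumes addr is resolved (true for IP literals); getAddress() is null otherwise.
        String host = addr.getAddress().getHostAddress();
        return Optional.of(host + ":" + clientAddr.getPort());
    }

    public static void main(String[] args) {
        InetSocketAddress addr = new InetSocketAddress("10.1.2.3", 2888);

        // Scenario 1: wildcard clientAddr still yields the right host and client port.
        System.out.println(hostAndPort(addr, new InetSocketAddress("0.0.0.0", 2181))); // Optional[10.1.2.3:2181]

        // Scenario 2: no clientAddr, so nothing is produced.
        System.out.println(hostAndPort(addr, null)); // Optional.empty
    }
}
```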



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
