Rhys Yarranton created ZOOKEEPER-3825:
-----------------------------------------

             Summary: StaticHostProvider.updateServerList address matching 
fails when connectString uses IP addresses
                 Key: ZOOKEEPER-3825
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3825
             Project: ZooKeeper
          Issue Type: Bug
          Components: java client
    Affects Versions: 3.5.5
            Reporter: Rhys Yarranton


StaticHostProvider.updateServerList contains address matching like this:
{code:java}
        for (InetSocketAddress addr : shuffledList) {
            if (addr.getPort() == myServer.getPort()
                    && ((addr.getAddress() != null
                            && myServer.getAddress() != null
                            && addr.getAddress().equals(myServer.getAddress()))
                        || addr.getHostString().equals(myServer.getHostString()))) {
                myServerInNewConfig = true;
                break;
            }
        }
{code}
 

The addresses in shuffledList are unresolved, while the current server address 
in myServer is a resolved address (coming from a socket).  If the connect 
string is expressed in terms of IP addresses instead of host names, the two 
won't match even when they represent the same server.

On the unresolved addresses, getAddress() is null, and getHostString() is 
something like 1.2.3.4.  On the resolved address, getAddress() is not null, and 
getHostString() is (normally) the canonical host name corresponding to the IP 
address.
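
The mismatch can be reproduced without a ZooKeeper server. The following is a minimal sketch, not code from the client: the class name AddressMatchDemo, the host name zk1.example.com, and the helper methods are assumptions made for illustration. The resolved address is built with InetAddress.getByAddress(host, bytes) as a stand-in for a socket remote address whose host name is already known, and matches() copies the condition quoted above.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;

public class AddressMatchDemo {

    // Same condition as the loop body quoted in this report, extracted for illustration.
    public static boolean matches(InetSocketAddress addr, InetSocketAddress myServer) {
        return addr.getPort() == myServer.getPort()
                && ((addr.getAddress() != null
                        && myServer.getAddress() != null
                        && addr.getAddress().equals(myServer.getAddress()))
                    || addr.getHostString().equals(myServer.getHostString()));
    }

    // Entry as it would come from a connect string "1.2.3.4:2181": never resolved.
    public static InetSocketAddress fromConnectString() {
        return InetSocketAddress.createUnresolved("1.2.3.4", 2181);
    }

    // Stand-in for a socket remote address that already carries a host name.
    // getByAddress(host, bytes) does no DNS lookup; the host name is an assumption.
    public static InetSocketAddress fromSocket() {
        try {
            InetAddress ip = InetAddress.getByAddress(
                    "zk1.example.com", new byte[] {1, 2, 3, 4});
            return new InetSocketAddress(ip, 2181);
        } catch (UnknownHostException e) {
            // Only thrown for an illegal address length, which cannot happen here.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        InetSocketAddress addr = fromConnectString();
        InetSocketAddress myServer = fromSocket();

        System.out.println(addr.getAddress());        // null: unresolved
        System.out.println(addr.getHostString());     // 1.2.3.4
        System.out.println(myServer.getHostString()); // zk1.example.com
        System.out.println(matches(addr, myServer));  // false, although it is the same server
    }
}
```

Both branches of the condition fail here: the first because the unresolved side has no InetAddress, the second because the host strings differ.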

As a result, this method tends to return true (reconfig) when it should not.  
The calling method, ZooKeeper.updateServerList, then closes the connection.

This might be written off as not too serious, except that Curator calls this 
method when there is a connection state change.  (Sometimes many times.)  What 
we observe is that when the client has to reconnect, _e.g._, after a server 
failure, the socket gets closed right away once it reconnects.  It goes into a 
cycle of death until the session dies and a new one is created.  (This doesn't 
seem like very nice behaviour on Curator's part, but that's what's out there.)

As a workaround, we implemented a custom HostProvider to filter out calls to 
updateServerList which don't actually change the list.
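
A rough sketch of that kind of filter follows. It is not our actual implementation. The nested HostProvider interface is a local stand-in intended to mirror org.apache.zookeeper.client.HostProvider, declared only so the sketch compiles without the ZooKeeper jar; FilteringHostProvider and its fields are hypothetical names. It assumes the caller supplies the initial server list and that comparing the lists as sets is acceptable.

```java
import java.net.InetSocketAddress;
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

public class FilteringHostProviderSketch {

    // Local stand-in for org.apache.zookeeper.client.HostProvider (assumed shape).
    public interface HostProvider {
        int size();
        InetSocketAddress next(long spinDelay);
        void onConnected();
        boolean updateServerList(Collection<InetSocketAddress> serverAddresses,
                                 InetSocketAddress currentHost);
    }

    // Wraps another provider and swallows updates that do not change the server set.
    public static class FilteringHostProvider implements HostProvider {
        private final HostProvider delegate;
        private Set<InetSocketAddress> lastServers;

        public FilteringHostProvider(HostProvider delegate,
                                     Collection<InetSocketAddress> initialServers) {
            this.delegate = delegate;
            this.lastServers = new HashSet<>(initialServers);
        }

        @Override
        public int size() {
            return delegate.size();
        }

        @Override
        public InetSocketAddress next(long spinDelay) {
            return delegate.next(spinDelay);
        }

        @Override
        public void onConnected() {
            delegate.onConnected();
        }

        @Override
        public synchronized boolean updateServerList(
                Collection<InetSocketAddress> serverAddresses,
                InetSocketAddress currentHost) {
            Set<InetSocketAddress> incoming = new HashSet<>(serverAddresses);
            if (incoming.equals(lastServers)) {
                // Same servers as before: report "no reconnect needed" and
                // never reach the delegate's address matching.
                return false;
            }
            boolean reconnect = delegate.updateServerList(serverAddresses, currentHost);
            lastServers = incoming;
            return reconnect;
        }
    }
}
```

In a real client the wrapper would implement the ZooKeeper interface directly and wrap a StaticHostProvider. Unresolved InetSocketAddress instances compare by host string and port, so the set comparison does not depend on DNS.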

As a permanent fix, instead of passing the current host based on the socket 
remote address, the client may need to remember the unresolved address that 
was used to connect.  (Or use the original strings.)
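
To illustrate the idea only (this is not a proposed patch): if the client kept the unresolved address it connected through, the membership check could compare like with like. The class UnresolvedMatchSketch, the method inNewConfig, and the parameter connectedVia are hypothetical names, and the case-insensitive host comparison is an assumption.

```java
import java.net.InetSocketAddress;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

public class UnresolvedMatchSketch {

    // Hypothetical check: connectedVia is the unresolved address taken from the
    // connect string when the connection was made, not the socket remote address.
    public static boolean inNewConfig(Collection<InetSocketAddress> newServers,
                                      InetSocketAddress connectedVia) {
        for (InetSocketAddress addr : newServers) {
            if (addr.getPort() == connectedVia.getPort()
                    && addr.getHostString().equalsIgnoreCase(connectedVia.getHostString())) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<InetSocketAddress> newServers = Arrays.asList(
                InetSocketAddress.createUnresolved("1.2.3.4", 2181),
                InetSocketAddress.createUnresolved("1.2.3.5", 2181));
        InetSocketAddress connectedVia = InetSocketAddress.createUnresolved("1.2.3.4", 2181);
        System.out.println(inNewConfig(newServers, connectedVia));
    }
}
```

Because both sides come from the connect string, the comparison no longer depends on what the resolved address reports from getHostString().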

Filed this against 3.5.5.  Based on source control, it looks like this still 
exists on master at time of writing.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
