Mike Heffner created ZOOKEEPER-1506:
---------------------------------------
Summary: Re-try DNS hostname -> IP resolution if node connection fails
Key: ZOOKEEPER-1506
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1506
Project: ZooKeeper
Issue Type: Improvement
Components: server
Affects Versions: 3.3.5
Environment: Ubuntu 11.04 64-bit
Reporter: Mike Heffner
Priority: Minor
In our zoo.cfg we use hostnames to identify the ZK servers that are part of an
ensemble. These hostnames are configured with a low (<= 60s) TTL and the IP
address they map to can and does change. Our procedure for replacing/upgrading
a ZK node is to boot an entirely new instance and remap the hostname to the new
instance's IP address. Our expectation is that when the original ZK node is
terminated or shut down, the remaining nodes in the ensemble would reconnect to
the new instance.
However, what we are noticing is that the remaining ZK nodes do not attempt to
re-resolve the hostname->IP mapping for the new server. Once the original ZK
node is terminated, the existing servers continue to attempt contacting it at
the old IP address. It would be great if the ZK servers could try to re-resolve
the hostname when attempting to connect to a lost ZK server, instead of caching
the lookup indefinitely. Currently we must do a rolling restart of the ZK
ensemble after swapping a node, which at three nodes means we periodically
lose quorum.
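For illustration, the requested behaviour can be sketched in Java as follows.
This is a minimal sketch under stated assumptions, not ZooKeeper's actual code:
the class ReResolvingPeer, the Resolver hook, and the method name
addressForNextAttempt are all hypothetical. The idea is to store the hostname
and port rather than an already-resolved address, and to look the hostname up
again before each connection attempt. Note that InetAddress.getByName is still
subject to the JVM's own lookup cache (the networkaddress.cache.ttl security
property), so that setting would also need to be low for this to help.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;

// Hypothetical helper: remembers a peer by hostname and port, never by IP.
final class ReResolvingPeer {

    // Hypothetical lookup hook, so the lookup can be swapped out for testing.
    public interface Resolver {
        InetAddress resolve(String hostname) throws UnknownHostException;
    }

    private final String hostname;
    private final int port;
    private final Resolver resolver;

    ReResolvingPeer(String hostname, int port, Resolver resolver) {
        this.hostname = hostname;
        this.port = port;
        this.resolver = resolver;
    }

    // Default: use the JVM resolver. Its results are cached according to
    // the networkaddress.cache.ttl security property.
    ReResolvingPeer(String hostname, int port) {
        this(hostname, port, InetAddress::getByName);
    }

    // Performs a new lookup on every call instead of reusing a stored
    // InetSocketAddress, so a remapped hostname yields the new IP.
    InetSocketAddress addressForNextAttempt() throws UnknownHostException {
        return new InetSocketAddress(resolver.resolve(hostname), port);
    }
}
```

With this shape, a reconnect loop would call addressForNextAttempt() before
every attempt, so a remapped hostname is noticed on the next retry rather than
only after a restart.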
The exact method we are following is to boot new instances in EC2 and attach
one of a set of three Elastic IP addresses to each. External to EC2 this IP address
remains the same and maps to whatever instance it is attached to. Internal to
EC2, the elastic IP hostname has a TTL of about 45-60 seconds and is remapped
to the internal (10.x.y.z) address of the instance it is attached to.
Therefore, in our case we would like ZK to pick up the new 10.x.y.z address that
the elastic IP hostname gets mapped to and reconnect appropriately.