GitHub user ijuma commented on the issue:
https://github.com/apache/zookeeper/pull/451
Really happy to see this in a released version, thanks everyone.
@anmolnar after testing this with Apache Kafka, it seems like the
combination of a reasonably long backoff (1 second) with randomization when the
list is small can cause unexpected connection timeouts in some cases. The
concrete example we saw was the use of localhost, which resolved to both an
IPv4 and an IPv6 address while only the IPv4 one worked. We had tests that
would fail quite often with a connection timeout of 6 seconds because the
IPv6 address would be picked repeatedly.
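
As a rough back-of-the-envelope sketch of why this shows up in tests (this is
not ZooKeeper's actual code, and the function names are hypothetical): assume
each attempt picks one resolved address uniformly and independently at random,
and that each failed attempt costs exactly one backoff period with no other
delay. Under those assumptions, a 1 second backoff and a 6 second timeout leave
room for about six attempts.

```python
from fractions import Fraction


def attempts_within(timeout_s: float, backoff_s: float) -> int:
    """Attempts that fit in the timeout if each failure costs one backoff.

    Assumption: the failed connect itself takes negligible time.
    """
    return int(timeout_s // backoff_s)


def prob_timeout(total_addrs: int, working_addrs: int, attempts: int) -> Fraction:
    """Probability that every attempt picks a non-working address.

    Assumption: each attempt picks uniformly and independently at random.
    """
    return Fraction(total_addrs - working_addrs, total_addrs) ** attempts


# localhost resolving to one IPv4 (working) and one IPv6 (not working) address
attempts = attempts_within(6.0, 1.0)
p = prob_timeout(2, 1, attempts)
print(attempts, p, float(p))  # prints: 6 1/64 0.015625
```

Roughly 1.6% per connection looks small, but it adds up quickly across a large
test suite that opens many connections, and the real selection logic may make
repeated picks of the same address more likely than this independent model
suggests.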
@rajinisivaram filed a ZooKeeper issue to track a potential future
improvement to this logic:
https://issues.apache.org/jira/browse/ZOOKEEPER-3100
---