Thanks for catching this, Eron. It looks like the port to 3.5 missed some
changes, as you correctly pointed out:
https://github.com/apache/zookeeper/commit/d2a49163b7bc7c9589140dbba7f60e591028f908
In particular, the changes in Learner.java are missing. I would say the fix
should definitely be in 3.5.4.
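
For anyone following along, here is a rough sketch of the behaviour Eron
describes: look the peer's hostname up again on every attempt rather than
holding on to an address that was resolved (or failed to resolve) once at
startup. This is only an illustration, not the actual Learner.java change;
the class and method names are made up, and the retry count and delay are
arbitrary.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;

// Hypothetical helper, not part of ZooKeeper.
class ReResolvingPeer {
    private final String hostname;
    private final int port;

    ReResolvingPeer(String hostname, int port) {
        this.hostname = hostname;
        this.port = port;
    }

    // Performs a fresh lookup on every call. By contrast, an
    // InetSocketAddress built once from (hostname, port) keeps whatever
    // result it got at construction time, including "unresolved".
    // Note: the JVM may still cache lookups according to the
    // networkaddress.cache.ttl / networkaddress.cache.negative.ttl
    // security properties.
    InetSocketAddress resolve() throws UnknownHostException {
        return new InetSocketAddress(InetAddress.getByName(hostname), port);
    }

    // Retries the lookup up to maxAttempts times, sleeping delayMs after
    // each failure. Rethrows the last failure if no attempt succeeds.
    InetSocketAddress resolveWithRetry(int maxAttempts, long delayMs)
            throws UnknownHostException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        UnknownHostException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return resolve();
            } catch (UnknownHostException e) {
                last = e;              // DNS record may not exist yet
                Thread.sleep(delayMs); // wait before the next lookup
            }
        }
        throw last;
    }
}
```

In a Kubernetes StatefulSet the peer records show up some time after the pod
starts, so a caller would keep invoking resolveWithRetry (or resolve) before
each connection attempt instead of caching the first result.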
-Flavio
> On 20 Feb 2018, at 01:14, Eron Wright <[email protected]> wrote:
>
> Hello,
>
> I attempted to run ZK 3.5.3-beta in a Kubernetes cluster, using the typical
> approach of a StatefulSet plus a pair of Services. I observed that some
> of my ZK servers would fail to resolve the DNS addresses of their peers
> indefinitely. It is normal that addresses cannot be resolved immediately
> at startup because the records are created asynchronously by Kubernetes.
> One would expect ZK to keep trying and eventually succeed. Note that
> this issue affects 3.5 only; 3.4 seems to work fine.
>
> I tracked the root cause down to a regression in 3.5. ZOOKEEPER-1506 made
> an improvement in 3.4 that wasn't ported to 3.5. I opened ZOOKEEPER-2982 to
> track this, and have a PR ready. Could we shoot to get the fix into 3.5.4?
>
> Thanks,
> Eron Wright