Hi team / Enrico, I’d like to get feedback from the community on the following patch (moving the discussion from GitHub to here):
https://issues.apache.org/jira/browse/ZOOKEEPER-3204
https://github.com/apache/zookeeper/pull/753

In a nutshell: it looks like Netty 3.10 is broken under Java 11. It doesn’t properly close the underlying socket (probably because it doesn’t close the registered NIO selectors), so the reconfig tests are unable to re-bind the ports. This problem is similar to the one we already fixed in NIO with the following patch:
https://github.com/apache/zookeeper/commit/c3babb94275ad667dc71c10dcb08a383a3c154c2

The problem doesn’t show up on trunk, which has recently been upgraded to Netty 4.

Repro:
- Start embedded ZK, stop it and try to restart it on the same port, or
- Start a normal ensemble and reconfig it to use a different (client) port. Then reconfig back to the original port, which should fail. (That’s the scenario covered in ReconfigTest.)

I created the above patch (#753) to backport the Netty 4 upgrade to 3.5 and it fixes the problem with Java 11 (it doesn’t cause a regression in the pre-commit build either), but Enrico has concerns about making such a big change before the release. I tend to agree, but let’s see what the options are.

Thoughts:
- Do we have to fix this?
  - Yes. Java 11 is LTS and I think the bug is critical.
- Can we fix Netty 3?
  - Maybe. Let’s say we find the bug in Netty 3, what can we do?
    a) We cannot work around it from ZooKeeper itself and have to submit a pull request to Netty. I think it’s quite unlikely that they will accept the change given it’s not a security bug, but even if they did, only the upgraded version of Netty 3 would work properly with ZooKeeper. Err.
    b) We can work around it from ZooKeeper: that could be option #1, but I have a strong feeling that this is not going to be the case.
- Shall we upgrade to Netty 4?
  - This is option #2.

Please share your thoughts, maybe you know about an option #3.

Regards,
Andor
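
P.S. For anyone who wants to see the shape of the first repro without a full test run, here is a minimal sketch of the bind / close / re-bind check. It uses a plain java.net.ServerSocket only, not ZooKeeper or Netty, so it shows the expected healthy behaviour rather than the bug itself. The class and method names (RebindCheck, canRebindAfterClose) are made up for illustration. To turn it into a real repro, the two bind steps would have to be replaced by starting and stopping the embedded server on the same client port.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Hypothetical helper, not part of ZooKeeper: checks that a port can be
// bound again immediately after the previous listener has been closed.
public class RebindCheck {

    // Binds a listening socket on the loopback address; port 0 picks any free port.
    public static ServerSocket bind(int port) throws IOException {
        ServerSocket socket = new ServerSocket();
        socket.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), port));
        return socket;
    }

    // Returns true if the same port can be bound again after close().
    public static boolean canRebindAfterClose() throws IOException {
        ServerSocket first = bind(0);
        int port = first.getLocalPort();
        first.close(); // a correct close releases the port

        try (ServerSocket second = bind(port)) {
            return second.isBound();
        } catch (IOException e) {
            // Typically a BindException ("Address already in use") if the
            // first listener did not really release the port.
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("rebind after close: " + canRebindAfterClose());
    }
}
```

If the server under test leaks its listening socket on shutdown, the second bind is the step that would be expected to fail.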
