Github user ivmaykov commented on the issue: https://github.com/apache/zookeeper/pull/184

I think as long as we keep `portUnification=false` there should not be hangs/crashes. However, that means it's not possible to safely upgrade a cluster from plaintext quorum to TLS quorum without downtime. @hanm mentioned "mitigations", but there really isn't a way to mitigate the issues with `UnifiedServerSocket` in #184 (other than "don't use it").

One option is to keep the unified server socket code but not parse the `portUnification` option in `QuorumPeerConfig`, so there is no way to use the feature. Or we could document the issues and have a clear warning ("portUnification has known problems and may cause your ensemble to enter a bad state that requires reboots, use at your own risk"), and let people take the risk if they like.

One other issue that we found is the >10% perf regression in plaintext mode when the Apache HttpClient library dependency is added. We never figured out why it caused the perf regression, but it could be a blocker. Unlike the port unification bugs, this perf regression cannot be worked around by turning a feature off: it was present even when all the SSL features were disabled. The fix for that is small and isolated to one file, so it would be pretty easy to backport to #184 if desired.

Let me just do another pass over the differences between the two PRs and see if anything else jumps out at me.
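
To make the two `portUnification` options above concrete, here is a rough Java sketch. It is not the actual `QuorumPeerConfig` code: the class `PortUnificationSketch`, the `Policy` enum, and the `resolve` helper are hypothetical names made up for illustration, and the only things taken from the discussion are the `portUnification` key and the warning text.

```java
import java.util.Properties;

// Hypothetical sketch, not ZooKeeper code: two ways a config parser could
// treat the portUnification option.
final class PortUnificationSketch {
    enum Policy { IGNORE_OPTION, WARN_AND_ALLOW }

    static final String WARNING =
        "portUnification has known problems and may cause your ensemble to enter "
        + "a bad state that requires reboots, use at your own risk";

    // Returns whether port unification ends up enabled under the given policy.
    // Messages are appended to `log` instead of going to a real logger.
    static boolean resolve(Properties cfg, Policy policy, StringBuilder log) {
        String raw = cfg.getProperty("portUnification");
        boolean requested = raw != null && Boolean.parseBoolean(raw.trim());
        if (!requested) {
            return false; // absent or false: nothing to decide
        }
        if (policy == Policy.IGNORE_OPTION) {
            // Option 1: the code stays in the tree, but the option is never honored.
            log.append("ignoring portUnification=true; feature is not available\n");
            return false;
        }
        // Option 2: honor the option, but warn loudly.
        log.append("WARN: ").append(WARNING).append('\n');
        return true;
    }

    public static void main(String[] args) {
        Properties cfg = new Properties();
        cfg.setProperty("portUnification", "true");
        for (Policy p : Policy.values()) {
            StringBuilder log = new StringBuilder();
            boolean enabled = resolve(cfg, p, log);
            System.out.println(p + " -> enabled=" + enabled);
            System.out.print(log);
        }
    }
}
```

Either way the default stays `false`, so an ensemble that never sets the option behaves the same under both policies.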
---