symat opened a new pull request #1251: ZOOKEEPER-3720: Fix rolling upgrade failure (invalid protocol version)
URL: https://github.com/apache/zookeeper/pull/1251

The multi-address feature introduced in ZOOKEEPER-3188 required changes in the Quorum protocol, as we had to send all addresses in the connection initiation message to enable the receiving side to choose a reachable address in case of network failure. The new code can handle both the old and the new protocol versions to avoid the 'invalid protocol' error, e.g. during rolling restarts. However, the new protocol version still cannot be used during a rolling upgrade if the old servers do not support it. In this case the old and the new servers would form two distinct partitions until all the servers are upgraded.

To support rolling upgrades too, we want to disable the MultiAddress feature by default and use the old protocol. If the user would like to enable the MultiAddress feature on a 3.6.0 cluster, she/he can do it either by:

1) starting the cluster from scratch (without rolling upgrade), or
2) performing a rolling upgrade without the MultiAddress feature enabled, then doing a rolling restart with a new configuration where the MultiAddress feature is enabled.

During the rolling restart there will be no partitions, as all the servers in the cluster will by then run ZooKeeper version 3.6.0, which understands both protocols.

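To illustrate why the partition forms and why defaulting to the old protocol avoids it, here is a minimal sketch. It is not taken from the patch: the class name, the method names, and the numeric version markers are all hypothetical, and the real constants and handshake logic live in QuorumCnxManager.

```java
import java.util.Set;

// Illustrative sketch only; names and version values are hypothetical,
// not the actual constants used by QuorumCnxManager.
public class ProtocolVersionSketch {
    static final long OLD_PROTOCOL = 1L; // placeholder for the pre-3.6.0 version marker
    static final long NEW_PROTOCOL = 2L; // placeholder for the multi-address version marker

    // Sender side: the new protocol is used only when multi-address is enabled.
    static long versionToSend(boolean multiAddressEnabled) {
        return multiAddressEnabled ? NEW_PROTOCOL : OLD_PROTOCOL;
    }

    // Receiver side: a server accepts a connection only if it knows the version.
    static boolean accepts(Set<Long> supportedVersions, long version) {
        return supportedVersions.contains(version);
    }

    public static void main(String[] args) {
        Set<Long> oldServer = Set.of(OLD_PROTOCOL);               // e.g. a 3.5.x server
        Set<Long> newServer = Set.of(OLD_PROTOCOL, NEW_PROTOCOL); // e.g. a 3.6.0 server

        // New server with the feature enabled talking to an old server: rejected.
        System.out.println(accepts(oldServer, versionToSend(true)));  // false
        // New server with the feature disabled (the default): accepted.
        System.out.println(accepts(oldServer, versionToSend(false))); // true
        // Between two new servers, either setting works.
        System.out.println(accepts(newServer, versionToSend(true)));  // true
        System.out.println(accepts(newServer, versionToSend(false))); // true
    }
}
```

The first case corresponds to the partition during a rolling upgrade; the last two correspond to the later rolling restart, where servers with the feature enabled and disabled can coexist.
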
The changes in this patch:

- introducing a new config property: multiAddress.enabled, disabled by default
- updating QuorumCnxManager to be able to use both protocol versions and to use the old one if MultiAddress is disabled
- failing with ConfigException if the user provides multiple addresses in the config while having MultiAddress disabled
- updating the existing MultiAddress related tests to enable the feature first
- adding some new tests
- updating the documentation

Testing:

- I ran all the unit tests
- Using https://github.com/symat/zk-rolling-upgrade-test
  - I tested rolling upgrade from 3.5.6
  - I tested rolling restart to enable the MultiAddress feature
- Using https://github.com/symat/zookeeper-docker-test
  - I tested the MultiAddress feature by disabling some virtual interfaces and waiting for the cluster to recover

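The config check described in the list of changes could look roughly like the sketch below. This is an assumption-laden illustration, not the patch's code: the class and method names are hypothetical, the `|` separator between a server's addresses is assumed from the multi-address syntax of ZOOKEEPER-3188, and a plain IllegalArgumentException stands in for the ConfigException mentioned above.

```java
// Illustrative sketch only; not the actual QuorumPeerConfig logic.
public class MultiAddressConfigSketch {

    // Counts the addresses in one server entry, assuming they are
    // separated by '|' (an assumption about the multi-address syntax).
    static int countAddresses(String serverSpec) {
        return serverSpec.split("\\|").length;
    }

    // Rejects a multi-address entry when the feature flag is off.
    // IllegalArgumentException is a stand-in for the real ConfigException.
    static void validate(String serverSpec, boolean multiAddressEnabled) {
        if (!multiAddressEnabled && countAddresses(serverSpec) > 1) {
            throw new IllegalArgumentException(
                "Multiple addresses configured but multiAddress.enabled is false: " + serverSpec);
        }
    }

    public static void main(String[] args) {
        // Hypothetical host names, for illustration only.
        String single = "zoo1-net1:2888:3888";
        String multi = "zoo1-net1:2888:3888|zoo1-net2:2889:3889";

        validate(single, false); // passes: one address, feature disabled
        validate(multi, true);   // passes: feature enabled
        try {
            validate(multi, false); // fails: two addresses, feature disabled
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Failing fast at config parsing time means a misconfigured server never starts with a setup it would silently ignore.
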
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

With regards,
Apache Git Services
