Let me describe our upcoming use case in a few words: We are planning to use ZooKeeper in a cloud where nodes typically come and go unpredictably. We could ensure that we always have a more or less fixed quorum of ZooKeeper servers with a fixed set of host names. However, the IPs associated with the host names would change every time a new server comes up. I browsed the code a little, and right now the only problem seems to be that the ZooKeeper server remembers the resolved InetSocketAddress in its QuorumPeer hash map.
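To illustrate what I mean, here is a minimal sketch in plain Java. It is not ZooKeeper code; the class name AddressRefresh, the helper reResolve, and the port 2888 are made up for this example. An InetSocketAddress built from a host name attempts the DNS lookup once, at construction, and keeps that result. Picking up a changed IP therefore means building a new instance from the same host name and port, which is roughly what I imagine a small fix would have to do before reconnecting to a peer:

```java
import java.net.InetSocketAddress;

// Hypothetical helper, not part of ZooKeeper.
final class AddressRefresh {

    // Build a fresh InetSocketAddress from the host name and port of a
    // possibly stale one. The constructor attempts a new lookup, whereas
    // the stale instance keeps whatever it resolved when it was created.
    static InetSocketAddress reResolve(InetSocketAddress stale) {
        // getHostString() returns the host name (or the IP literal if no
        // host name is known) without triggering a reverse lookup.
        return new InetSocketAddress(stale.getHostString(), stale.getPort());
    }

    public static void main(String[] args) {
        // "localhost" and 2888 are placeholders for a quorum peer entry.
        InetSocketAddress cached = new InetSocketAddress("localhost", 2888);
        InetSocketAddress fresh = reResolve(cached);
        System.out.println(cached + " -> " + fresh);
    }
}
```

Note that the JVM keeps its own DNS cache (the networkaddress.cache.ttl security property), so even a fresh lookup may return the old IP until that cache entry expires.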

I saw that ZOOKEEPER-107 would possibly also solve that problem, but in a more generic way than is actually needed (though perhaps I underestimate the impact of joining as a server with an empty data directory to replace a server that previously had one).

Given that, judging from ZOOKEEPER-107, it will still take some time for the proposed fix to make it into a release, would it make sense to invest time in a smaller, targeted fix just for this "replacing a dropped server without rolling restarts" use case? Would there be a chance that such a fix makes it into the 3.4.x branch?

Are there perhaps other ways to support this use case without rolling restarts whenever we need to replace one of the ZooKeeper servers?
