I have a use case where I dynamically grow a ZooKeeper ensemble on the same fixed set of machines multiple times. In each iteration, the ensemble is grown incrementally until it consists of "n" servers. I will refer to the machines hosting the servers as zk-1, zk-2, ..., zk-n.
At the beginning of each iteration, I wipe the ZooKeeper data directories on zk-1 and zk-2, then statically configure the servers on both machines to form a 2-way ensemble. After that, I grow the ensemble incrementally: for each i = 3, ..., n, I reconfigure the ensemble to include zk-i, then clear, configure, and start the ZooKeeper server on zk-i. I was not shutting down or cleaning up the previous ensemble's servers at the end of each iteration.

After initializing the 2-way ensemble on zk-1 and zk-2, I observed that the servers from the old deployment were contacting the servers of the new ensemble and triggering an ensemble reconfiguration. A quick look at the code suggests that this happens simply because the config version of the old deployment's server is higher than the one found on the new ensemble's servers. Can anyone confirm my understanding of this behaviour?

I also noticed that this reconfiguration occurs even for n=3. For example, let's say the ZooKeeper servers on zk-1 and zk-2 are freshly configured to form a 2-way ensemble, and zk-3 hosts a leftover server that was part of an older 3-way ensemble (which included two now-obsolete servers on zk-1 and zk-2). It seems counterintuitive to me for one server (on zk-3) to drive the configuration of two other servers (on zk-1 and zk-2), because the majority of the servers in the ensemble agree on a different config version. Can somebody explain how ZooKeeper handles this situation?

One final note: it would be really useful if a ZooKeeper ensemble had a unique identifier that could be set in the "zoo.cfg" file. Whenever servers communicate with each other, they would verify that they are talking to peers of the same ensemble before taking any further action. Does that sound like a reasonable request?

Thanks,
-- Mohammad Shamma
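
P.S. To make the last request more concrete, here is a rough Python sketch of the two behaviours I have in mind. It is only a toy model of my understanding, not ZooKeeper's actual code: the Config class, both functions, the "ensemble_id" field, and the version numbers are all made up for illustration, and "ensemble_id" is not an existing zoo.cfg setting.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Config:
    """Toy stand-in for one server's view of the ensemble (hypothetical)."""
    version: int
    members: frozenset
    ensemble_id: Optional[str] = None  # proposed identifier, not a real setting


def adopt_by_version(local: Config, remote: Config) -> Config:
    """What I believe I am observing: the higher config version wins."""
    return remote if remote.version > local.version else local


def adopt_with_id_check(local: Config, remote: Config) -> Config:
    """Proposed: ignore any peer whose ensemble identifier differs."""
    if local.ensemble_id != remote.ensemble_id:
        return local
    return adopt_by_version(local, remote)


# Fresh 2-way ensemble on zk-1 and zk-2 (illustrative version number).
fresh = Config(version=0,
               members=frozenset({"zk-1", "zk-2"}),
               ensemble_id="iteration-2")

# Leftover server on zk-3 from the previous iteration's 3-way ensemble.
stale = Config(version=7,
               members=frozenset({"zk-1", "zk-2", "zk-3"}),
               ensemble_id="iteration-1")

# Version-only rule: the fresh servers end up adopting the stale config.
after_version_rule = adopt_by_version(fresh, stale)

# With the identifier check: the fresh servers keep their own config.
after_id_check = adopt_with_id_check(fresh, stale)
```

In this model the version-only rule hands the stale 3-way membership to the fresh servers, while the identifier check leaves the fresh 2-way config untouched, which is the behaviour I would like to be able to opt into.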
