Hi Christian, ZK-107 would indeed allow you to add/remove servers and change their addresses.
> We could ensure that we always have a more or less fixed quorum of
> zookeeper servers with a fixed set of host names.

You should probably also ensure that a majority of the old ensemble intersects with a majority of the new one. Otherwise you have to run a reconfiguration protocol similar to ZK-107.

For example, if you have 3 servers A, B and C, and you are now adding D and E to replace B and C, how would this work? It is probable that D and E don't have the latest state (as you mention), and A may be down or may not have the latest state either (a minority might not have the latest state). Also, how do you prevent split brain in this case, meaning B and C thinking that they are still operational?

Perhaps I'm missing something, but I suspect that the change you propose won't be enough...

Best Regards,
Alex

On Wed, Mar 14, 2012 at 10:01 AM, Christian Ziech <[email protected]> wrote:

> Just a small addition: in my opinion the patch could really boil down to
> adding a
>
>     quorumServer.electionAddr = new InetSocketAddress(electionAddr.getHostName(), electionAddr.getPort());
>
> in the catch(IOException e) clause of the connectOne() method of the
> QuorumCnxManager. In addition, one should perhaps make the electionAddr
> field in the QuorumPeer.QuorumServer class volatile to prevent races.
>
> I haven't fully checked this change for implications yet, but a quick
> test on some machines at least showed it would solve our use case. What do
> the more expert users / maintainers think: is it even worthwhile to go
> that route?
>
> On 14.03.2012 17:04, ext Christian Ziech wrote:
>
>> Let me describe our upcoming use case in a few words: we are planning to
>> use zookeeper in a cloud where nodes typically come and go unpredictably.
>> We could ensure that we always have a more or less fixed quorum of
>> zookeeper servers with a fixed set of host names. However, the IPs
>> associated with the host names would change every time a new server comes
>> up. I browsed the code a little, and it seems that right now the only
>> problem is that the zookeeper server remembers the resolved
>> InetSocketAddress in its QuorumPeer hash map.
>>
>> I saw that ZOOKEEPER-107 would possibly also solve that problem, but
>> possibly in a more generic way than actually needed (perhaps here I
>> underestimate the impact of joining as a server with an empty data
>> directory to replace a server that previously had one).
>>
>> Given that, from looking at ZOOKEEPER-107, it seems it will still take
>> some time for the proposed fix to make it into a release, would it make
>> sense to invest time in a smaller fix just for this "replacing a dropped
>> server without rolling restarts" use case? Would there be a chance that a
>> fix for this makes it into the 3.4.x branch?
>>
>> Are there perhaps other ways to get this use case supported without the
>> need for rolling restarts whenever we need to replace one of the
>> zookeeper servers?
>
> --
> Christian Ziech
> Senior Software Developer
> Nokia gate5 GmbH, Berlin, Germany
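
For readers following the thread, the one-line change proposed above can be sketched in isolation roughly as follows. This is only an illustration under assumptions, not the actual ZooKeeper code: the nested QuorumServer class is a hypothetical stand-in for QuorumPeer.QuorumServer, the reResolve helper is invented for this sketch, and "localhost" and port 3888 are placeholder values. It relies only on the documented behaviour that constructing an InetSocketAddress from a host name and port attempts a fresh name lookup.

```java
import java.net.InetSocketAddress;

public class ReResolveSketch {

    /** Hypothetical stand-in for QuorumPeer.QuorumServer; only the field under discussion. */
    static class QuorumServer {
        // volatile, as suggested in the thread, so that an update made by a
        // reconnecting thread becomes visible to other threads.
        volatile InetSocketAddress electionAddr;

        QuorumServer(InetSocketAddress electionAddr) {
            this.electionAddr = electionAddr;
        }
    }

    /**
     * Builds a new address from the same host name and port.
     * The constructor attempts to resolve the host name again, so an IP
     * change behind an unchanged host name can be picked up.
     */
    static InetSocketAddress reResolve(InetSocketAddress old) {
        return new InetSocketAddress(old.getHostName(), old.getPort());
    }

    public static void main(String[] args) {
        // Placeholder host and port, chosen only for this example.
        QuorumServer server = new QuorumServer(new InetSocketAddress("localhost", 3888));

        // In the proposal, this assignment would sit in the
        // catch (IOException e) clause after a failed connection attempt.
        server.electionAddr = reResolve(server.electionAddr);

        System.out.println("election address now: " + server.electionAddr);
    }
}
```

Note that this covers only the stale-IP part of the problem. It does nothing about the state and split-brain questions raised in the reply.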
