[ https://issues.apache.org/jira/browse/ZOOKEEPER-107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012561#comment-13012561 ]
Vishal K commented on ZOOKEEPER-107:
------------------------------------

{quote}
We were discussing this point with Flavio a bit, and there are some initial ideas. In any case we need some sort of DNS as a fallback, for clients that were disconnected during the reconfiguration - when they wake up there might no longer be anyone from M alive.
{quote}

Servers will need it too. A peer may be part of M' but may not know about it. The mechanism for this should be internal to the cluster. If this requires making entries in DNS, then in practice no administrator will allow that. To run something internal to the cluster, you will need a fail-safe way to access that central resource. As we discussed, using a cluster IP (ZOOKEEPER-1031) and running the resource on the node that holds the cluster IP will help in this scenario. We shouldn't rely on the cluster IP being available to bootstrap the cluster (to form a quorum), for obvious reasons.

{quote}
process operations in the new configuration M' (otherwise we may get split-brain).
{quote}

Earlier, when I implemented a membership change algorithm, I rejected reconfig(M') if a majority of M did not belong to a majority of M'. In short, change at most (majority - 1) nodes in one reconfig. It is not a strict requirement here, but I think it is worth a thought.

{quote}
Because of FIFO and since M' are connected as followers from the beginning of the reconfiguration,
{quote}

Ah! This was the missing link. So M' are considered followers. I missed that in our description on the twiki. From "New operations (received by leader(M) during phase-1) will be sent to both M and M'" it looked like the leader is just sending messages to M', but not waiting for a majority to ack.

{quote}
Having said that, we might want to commit the transactions in M' if we want to transfer clients to M' gradually, as suggested by Flavio.
{quote}

That's a good point. One simple way to do this is to use the incremental API if there are too many clients.
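The overlap rule for reconfig(M') and the one-server-at-a-time approach discussed in this comment can be sketched roughly as follows. This is only an illustration, not ZooKeeper code: the names `majority`, `reconfig_allowed` and `incremental_steps` are hypothetical, and the rule is read here as "the servers common to M and M' must form a majority of M and of M'", which is one possible interpretation of the description.

```python
from typing import FrozenSet, Iterable, List


def majority(n: int) -> int:
    """Smallest number of servers that forms a strict majority of n."""
    return n // 2 + 1


def reconfig_allowed(old: Iterable[str], new: Iterable[str]) -> bool:
    """Assumed reading of the rule: accept reconfig(M') only if the servers
    common to M and M' form a majority of M and a majority of M'."""
    m, m_new = frozenset(old), frozenset(new)
    if not m or not m_new:
        return False
    common = len(m & m_new)
    return common >= majority(len(m)) and common >= majority(len(m_new))


def incremental_steps(old: Iterable[str], new: Iterable[str]) -> List[FrozenSet[str]]:
    """Plan a chain of single-server changes leading from M to M'.

    Additions and removals are interleaved (add first) so the ensemble
    never shrinks below its starting size along the way. Raises ValueError
    if some single-server step would itself violate the overlap rule
    (this can happen for very small ensembles).
    """
    current = frozenset(old)
    target = frozenset(new)
    to_add = sorted(target - current)
    to_remove = sorted(current - target)

    # Interleave: add one new server, then remove one old server, and so on.
    changes = []
    for i in range(max(len(to_add), len(to_remove))):
        if i < len(to_add):
            changes.append(("add", to_add[i]))
        if i < len(to_remove):
            changes.append(("remove", to_remove[i]))

    steps: List[FrozenSet[str]] = []
    for op, server in changes:
        nxt = current | {server} if op == "add" else current - {server}
        if not reconfig_allowed(current, nxt):
            raise ValueError(f"step {op} {server} breaks the overlap rule")
        steps.append(nxt)
        current = nxt
    return steps
```

Under this reading, a direct change from {a, b, c} to {c, d, e} is rejected because only one server is shared, while `incremental_steps(["a", "b", "c"], ["c", "d", "e"])` reaches the same target through four single-server changes, each of which passes the check.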
Replacing servers one by one will result in a gradual transfer of clients. Also, after step 7 of the algorithm (before phase 2), how about we ask leader(M) to do a transaction using Zab (or consider reconfig-start as that transaction) to write out M' in the ZooKeeper tree? Each server can then send a notification to its clients to indicate a change in progress. Clients can disconnect from M and start connecting to M'.

{quote}
Here, I don't see the difference from a normal execution without reconfigurations.
{quote}

Agreed. Thanks.

> Allow dynamic changes to server cluster membership
> --------------------------------------------------
>
> Key: ZOOKEEPER-107
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-107
> Project: ZooKeeper
> Issue Type: Improvement
> Components: server
> Reporter: Patrick Hunt
> Assignee: Henry Robinson
> Attachments: SimpleAddition.rtf
>
>
> Currently cluster membership is statically defined, adding/removing hosts
> to/from the server cluster dynamically needs to be supported.