The following page has been changed by Flavio Junqueira:
http://wiki.apache.org/hadoop/ZooKeeper/ZooKeeperRecipes

The comment on the change is: .

------------------------------------------------------------------------------
  To solve the first problem, we can have only the coordinator notified of changes to the transaction nodes; the coordinator then notifies the sites once it reaches a decision. Note that this approach, although more scalable, is slower, as it requires all communication to go through the coordinator. For the second problem, we can have the coordinator propagate the transaction to the sites, and have each site create its own ephemeral node.

+ == Leader election ==
+ A simple way of doing leader election with ZooKeeper is to use the SEQUENCE|EPHEMERAL flags when creating znodes that represent "proposals" of clients. The idea is to have a znode, say "''/election''", such that each client creates a child znode "''/election/n_''" with both flags SEQUENCE|EPHEMERAL. With the sequence flag, ZooKeeper automatically appends a sequence number that is greater than any one previously appended to a child of "''/election''". The process that created the znode with the smallest appended sequence number is the leader.
+
+ That's not all, though. It is important to watch for failures of the leader, so that a new client takes over as leader if the current leader fails. A trivial solution is to have all application processes watch the current smallest znode and check whether they are the new leader when that znode goes away (note that the smallest znode will go away if the leader fails, because the node is ephemeral). This causes what we call "the herd effect": upon failure of the current leader, all other processes receive a notification and execute ''getChildren'' on "''/election''" to obtain the current list of children of "''/election''".
If the number of application clients is large, this causes a spike in the number of operations that ZooKeeper servers have to process. To avoid the herd effect, it is sufficient for each client to watch only the next znode down in the sequence of znodes. If a client receives a notification that the znode it is watching is gone, it becomes the new leader if there is no smaller znode; otherwise it watches the next znode down that still exists. Note that this avoids the herd effect because the clients do not all watch the same znode.
+
+ Let ''ELECTION'' be a path of choice of the application. To volunteer to be a leader:
+  1. Create znode ''z'' with path "''ELECTION/n_''" with both '''SEQUENCE''' and '''EPHEMERAL''' flags;
+  1. Let ''C'' be the children of "''ELECTION''", and ''i'' be the sequence number of ''z'';
+  1. Watch for changes on "''ELECTION/n_j''", where ''j'' is the largest sequence number such that ''j < i'' and ''n_j'' is a znode in ''C'' (if no such ''j'' exists, ''z'' has the smallest sequence number and this client is the leader).
+
+ Upon receiving a notification of znode deletion:
+  1. Let ''C'' be the new set of children of ''ELECTION'';
+  1. If ''z'' is the smallest node in ''C'', then execute leader procedure;
+  1. Otherwise, watch for changes on "''ELECTION/n_j''", where ''j'' is the largest sequence number such that ''j < i'' and ''n_j'' is a znode in ''C''.
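
The decision each client makes after reading the children of ''ELECTION'' can be sketched as a small pure function. This is only an illustration, not part of the recipe: it makes no ZooKeeper calls, the names `sequence_number` and `next_step` are hypothetical, and the child names are assumed to be "n_" followed by the digits ZooKeeper appended. A real client would obtain the children with ''getChildren'' and set the watch through its client library.

```python
import re
from typing import List, Optional, Tuple

# Assumption: election children are named "n_" plus the appended sequence digits.
_SEQ = re.compile(r"^n_(\d+)$")


def sequence_number(name: str) -> Optional[int]:
    """Return the appended sequence number, or None for unrelated children."""
    match = _SEQ.match(name)
    return int(match.group(1)) if match else None


def next_step(children: List[str], own: str) -> Tuple[str, Optional[str]]:
    """Decide what the client that created znode `own` should do.

    Returns ("lead", None) if no child has a smaller sequence number,
    otherwise ("watch", name), where name is the child with the largest
    sequence number that is still smaller than our own.
    """
    own_seq = sequence_number(own)
    if own_seq is None:
        raise ValueError(f"not an election znode: {own!r}")
    smaller = [
        c for c in children
        if sequence_number(c) is not None and sequence_number(c) < own_seq
    ]
    if not smaller:
        return ("lead", None)
    return ("watch", max(smaller, key=sequence_number))


children = ["n_0000000003", "n_0000000001", "n_0000000002"]
print(next_step(children, "n_0000000001"))  # ('lead', None)
print(next_step(children, "n_0000000003"))  # ('watch', 'n_0000000002')

# The leader fails: only the owner of n_0000000002 was watching its znode.
children.remove("n_0000000001")
print(next_step(children, "n_0000000002"))  # ('lead', None)
```

The same function serves both the volunteering step and the notification handler, since both reduce to "read ''C'', then lead or watch the next znode down". Each client watches at most one znode, so a single failure wakes a single client.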
