Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change 
notification.

The following page has been changed by Flavio Junqueira:
http://wiki.apache.org/hadoop/ZooKeeper/ZooKeeperRecipes

The comment on the change is:
. 

------------------------------------------------------------------------------
  
  To solve the first problem, we can have only the coordinator notified of 
changes to the transaction nodes; the coordinator then notifies the sites once 
it reaches a decision. Note that this approach, although more scalable, is 
slower, as it requires all communication to go through the coordinator. For the 
second problem, we can have the coordinator propagate the transaction to the 
sites and have each site create its own ephemeral node.
  
+ == Leader election ==
+ A simple way of doing leader election with ZooKeeper is to use the 
SEQUENCE|EPHEMERAL flags when creating znodes that represent "proposals" of 
clients. The idea is to have a znode, say "''/election''", such that each 
client creates a child znode "''/election/n_''" with both flags 
SEQUENCE|EPHEMERAL. With the sequence flag, ZooKeeper automatically appends a 
sequence number that is greater than any one previously appended to a child of 
"''/election''". The process that created the znode with the smallest appended 
sequence number is the leader. 
+ 
+ That's not all, though. It is important to watch for failures of the leader, 
so that a new client takes over as leader if the current leader fails. A 
trivial solution is to have all application processes watch the current 
smallest znode and check whether they are the new leader when that znode goes 
away (the smallest znode goes away if the leader fails because the node is 
ephemeral). This causes what we call "the herd effect": upon failure of the 
current leader, all other processes receive a notification and execute 
''getChildren'' on "''/election''" to obtain the current list of children of 
"''/election''". If the number of application clients is large, this causes a 
spike in the number of operations that ZooKeeper servers have to process. To 
avoid the herd effect, it is sufficient for each client to watch the next znode 
down in the sequence of znodes, that is, the one with the largest sequence 
number smaller than its own. If a client receives a notification that the znode 
it is watching is gone, then it becomes the new leader if there is no smaller 
znode. Note that this avoids the herd effect by not having all clients watch 
the same znode.
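
The difference between the two watch strategies can be illustrated with a 
small simulation. This is only a sketch in plain Python with no ZooKeeper 
client involved; the function names, the strategy labels and the zero-padded 
"n_<number>" child names are assumptions made for the illustration.

```python
def sequence_number(name):
    """Extract the numeric suffix from a child name such as 'n_0000000003'."""
    return int(name.rsplit("_", 1)[1])


def notified_on_leader_failure(children, strategy):
    """Return the children whose watch fires when the current leader's znode
    (the one with the smallest sequence number) is deleted.

    strategy is a label made up for this sketch:
      'watch_leader'      - every non-leader watches the smallest znode
      'watch_predecessor' - every non-leader watches the znode just below it
    """
    ordered = sorted(children, key=sequence_number)
    leader = ordered[0]
    if strategy == "watch_leader":
        watched = {node: leader for node in ordered[1:]}
    elif strategy == "watch_predecessor":
        watched = {ordered[k]: ordered[k - 1] for k in range(1, len(ordered))}
    else:
        raise ValueError("unknown strategy: " + strategy)
    # Only the clients watching the deleted znode receive a notification.
    return [node for node, target in watched.items() if target == leader]


children = ["n_%010d" % k for k in range(5)]
herd = notified_on_leader_failure(children, "watch_leader")
chain = notified_on_leader_failure(children, "watch_predecessor")
print(len(herd), len(chain))
```

With five children, the first strategy notifies all four remaining clients, 
each of which then calls ''getChildren''; the second notifies only the client 
holding the next znode in the sequence.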
+ 
+ Let ''ELECTION'' be a path chosen by the application. To volunteer to be a 
leader:
+  1. Create znode ''z'' with path "''ELECTION/n_''" with both '''SEQUENCE''' 
and '''EPHEMERAL''' flags;
+  1. Let ''C'' be the children of "''ELECTION''", and ''i'' be the sequence 
number of ''z'';
+  1. Watch for changes on "''ELECTION/n_j''", where ''j'' is the largest 
sequence number such that ''j < i'' and ''n_j'' is a znode in ''C'';
+ 
+ Upon receiving a notification of znode deletion:
+  1. Let ''C'' be the new set of children of ''ELECTION'';
+  1. If ''z'' is the smallest node in ''C'', then execute leader procedure;
+  1. Otherwise, watch for changes on "''ELECTION/n_j''", where ''j'' is the 
largest sequence number such that ''j < i'' and ''n_j'' is a znode in ''C'';
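
The decision a client makes after listing the children of ''ELECTION'' can be 
sketched as a pure function. This is a minimal illustration rather than a 
ZooKeeper client: the name `next_step` and its return values are hypothetical, 
and child names are assumed to have the form "n_<number>". It picks the znode 
with the largest sequence number below the client's own, which is the "next 
znode down" described earlier.

```python
def sequence_number(name):
    """Extract the numeric suffix from a child name such as 'n_0000000003'."""
    return int(name.rsplit("_", 1)[1])


def next_step(children, own):
    """Decide what the client owning znode `own` should do.

    Returns ('lead', None) if no child has a smaller sequence number,
    otherwise ('watch', name), where name is the child with the largest
    sequence number that is still smaller than the client's own.
    """
    i = sequence_number(own)
    smaller = [c for c in children if sequence_number(c) < i]
    if not smaller:
        return ("lead", None)
    return ("watch", max(smaller, key=sequence_number))


# Initial volunteering: three clients have created znodes.
children = ["n_0000000003", "n_0000000001", "n_0000000007"]
first = next_step(children, "n_0000000007")

# The watched znode n_0000000003 is deleted; the client lists children again.
after_failure = next_step(["n_0000000001", "n_0000000007"], "n_0000000007")

# Later the leader n_0000000001 also goes away.
alone = next_step(["n_0000000007"], "n_0000000007")
print(first, after_failure, alone)
```

A real client would still have to set the watch itself and handle the case 
where the chosen znode disappears between listing the children and setting the 
watch, for example by running the same decision again.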
    
+ 
