Build failed in Hudson: ZooKeeper-trunk #25
See http://hudson.zones.apache.org/hudson/job/ZooKeeper-trunk/25/changes

Changes:

[phunt] Moved the cobertura taskdef into the cobertura target - the regular targets always work now and the cobertura targets will work if the cobertura jars are available

--
[...truncated 32248 lines...]
    [junit] 2008-07-17 12:20:05,526 - INFO [SendThread:[EMAIL PROTECTED] - Attempting connection to server /127.0.0.1:33299
    [junit] 2008-07-17 12:20:05,527 - INFO [SendThread:[EMAIL PROTECTED] - Priming connection to java.nio.channels.SocketChannel[connected local=/127.0.0.1:48957 remote=/127.0.0.1:33299]
    [junit] 2008-07-17 12:20:05,528 - WARN [NIOServerCxn.Factory:[EMAIL PROTECTED] - Connected to /127.0.0.1:48957 lastZxid 0
    [junit] 2008-07-17 12:20:05,528 - WARN [NIOServerCxn.Factory:[EMAIL PROTECTED] - Finished init of 11b30f66928: false
    [junit] 2008-07-17 12:20:05,529 - WARN [NIOServerCxn.Factory:[EMAIL PROTECTED] - Renewing session 11b30f66928
    [junit] 2008-07-17 12:20:05,530 - WARN [SendThread:[EMAIL PROTECTED] - Closing:
    [junit] java.io.IOException: Session Expired
    [junit]     at org.apache.zookeeper.ClientCnxn$SendThread.readConnectResult(ClientCnxn.java:414)
    [junit]     at org.apache.zookeeper.ClientCnxn$SendThread.doIO(ClientCnxn.java:499)
    [junit]     at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:712)
    [junit] 2008-07-17 12:20:06,237 - INFO [SendThread:[EMAIL PROTECTED] - Attempting connection to server /127.0.0.1:33299
    [junit] 2008-07-17 12:20:06,238 - INFO [SendThread:[EMAIL PROTECTED] - Priming connection to java.nio.channels.SocketChannel[connected local=/127.0.0.1:48958 remote=/127.0.0.1:33299]
    [junit] 2008-07-17 12:20:06,240 - WARN [NIOServerCxn.Factory:[EMAIL PROTECTED] - Connected to /127.0.0.1:48958 lastZxid 0
    [junit] 2008-07-17 12:20:06,241 - WARN [NIOServerCxn.Factory:[EMAIL PROTECTED] - Creating new session 11b30f67a06
    [junit] 2008-07-17 12:20:06,267 - WARN [SyncThread:[EMAIL PROTECTED] - Finished init of 11b30f67a06: true
    [junit] zk with session id 79711253576417280 was destroyed!
    [junit] zk with session id 79711253576417280 was created!
    [junit] 2008-07-17 12:20:06,286 - INFO [SendThread:[EMAIL PROTECTED] - Attempting connection to server /127.0.0.1:33299
    [junit] 2008-07-17 12:20:06,287 - INFO [SendThread:[EMAIL PROTECTED] - Priming connection to java.nio.channels.SocketChannel[connected local=/127.0.0.1:48959 remote=/127.0.0.1:33299]
    [junit] 2008-07-17 12:20:06,289 - WARN [NIOServerCxn.Factory:[EMAIL PROTECTED] - Connected to /127.0.0.1:48959 lastZxid 0
    [junit] 2008-07-17 12:20:06,289 - WARN [NIOServerCxn.Factory:[EMAIL PROTECTED] - Finished init of 11b30f67a06: true
    [junit] 2008-07-17 12:20:06,290 - WARN [NIOServerCxn.Factory:[EMAIL PROTECTED] - Renewing session 11b30f67a06
    [junit] After get data /e
    [junit] 2008-07-17 12:20:06,298 - INFO [ProcessThread:[EMAIL PROTECTED] - Processed session termination request for id: 11b30f67a06
    [junit] 2008-07-17 12:20:06,311 - WARN [SendThread:[EMAIL PROTECTED] - Closing:
    [junit] java.io.IOException: Read error rc = -1 java.nio.DirectByteBuffer[pos=0 lim=4 cap=4]
    [junit]     at org.apache.zookeeper.ClientCnxn$SendThread.doIO(ClientCnxn.java:491)
    [junit]     at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:712)
    [junit] 2008-07-17 12:20:16,257 - ERROR [NIOServerCxn.Factory:[EMAIL PROTECTED] - FIXMSG
    [junit] java.io.IOException: Missing session 11b30f67a06
    [junit]     at org.apache.zookeeper.server.ZooKeeperServer.touch(ZooKeeperServer.java:697)
    [junit]     at org.apache.zookeeper.server.ZooKeeperServer.submitRequest(ZooKeeperServer.java:867)
    [junit]     at org.apache.zookeeper.server.NIOServerCnxn.readRequest(NIOServerCnxn.java:438)
    [junit]     at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:293)
    [junit]     at org.apache.zookeeper.server.NIOServerCnxn$Factory.run(NIOServerCnxn.java:149)
    [junit] 2008-07-17 12:20:16,427 - INFO [SendThread:[EMAIL PROTECTED] - Attempting connection to server /127.0.0.1:33299
    [junit] 2008-07-17 12:20:16,427 - INFO [SendThread:[EMAIL PROTECTED] - Priming connection to java.nio.channels.SocketChannel[connected local=/127.0.0.1:48960 remote=/127.0.0.1:33299]
    [junit] 2008-07-17 12:20:16,428 - WARN [NIOServerCxn.Factory:[EMAIL PROTECTED] - Connected to /127.0.0.1:48960 lastZxid 0
    [junit] 2008-07-17 12:20:16,429 - WARN [NIOServerCxn.Factory:[EMAIL PROTECTED] - Creating new session 11b30f67a060001
    [junit] 2008-07-17 12:20:16,447 - WARN [SyncThread:[EMAIL PROTECTED] - Finished init of 11b30f67a060001: true
    [junit] before close zk with session id 79711253576417281!
    [junit] 2008-07-17 12:20:16,449 - INFO [ProcessThread:[EMAIL PROTECTED] - Processed session termination request for id:
[jira] Updated: (ZOOKEEPER-78) added a high level protocol/feature - for easy Leader Election or exclusive Write Lock creation
[ https://issues.apache.org/jira/browse/ZOOKEEPER-78?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

james strachan updated ZOOKEEPER-78:

    Attachment: writeLock_protocol.patch

added a high level protocol/feature - for easy Leader Election or exclusive Write Lock creation
---

    Key: ZOOKEEPER-78
    URL: https://issues.apache.org/jira/browse/ZOOKEEPER-78
    Project: Zookeeper
    Issue Type: New Feature
    Components: java client
    Affects Versions: 3.0.0
    Reporter: james strachan
    Attachments: writeLock_protocol.patch

Here's a patch which adds a little WriteLock helper class for performing leader elections or creating exclusive locks in some directory znode. Note it's an early cut; I'm sure we can improve it over time. The aim is to spare folks from using the low-level ZK API directly by providing a simpler high-level abstraction.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (ZOOKEEPER-79) Document jacob's leader election on the wiki recipes page
[ https://issues.apache.org/jira/browse/ZOOKEEPER-79?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12614448#action_12614448 ]

james strachan commented on ZOOKEEPER-79:
-

Ah cool :) Was just checking we were not about to do the same thing separately :). I've basically followed the same algorithm from the wiki recipe - and the same one described in the ZooKeeper tutorial...

http://developer.yahoo.com/blogs/hadoop/2008/03/intro-to-zookeeper-video.html

So AFAIK yes, it's the same.

Document jacob's leader election on the wiki recipes page
-

    Key: ZOOKEEPER-79
    URL: https://issues.apache.org/jira/browse/ZOOKEEPER-79
    Project: Zookeeper
    Issue Type: New Feature
    Components: documentation
    Reporter: Patrick Hunt
    Assignee: Patrick Hunt

The following discussion occurred on the zookeeper-user list. We need to formalize this recipe and document on the wiki recipes page:

-from jacob

Avinash,

The following protocol will help you fix the observed misbehavior. As Flavio points out, you cannot rely on the order of nodes in getChildren; you must use an intrinsic property of each node to determine who is the leader. The protocol devised by Runping Qi and described here will do that.

First of all, when you create child nodes of the node that holds the leadership bids, you must create them with the EPHEMERAL and SEQUENCE flags. ZooKeeper guarantees to give you an ephemeral node named uniquely and with a sequence number larger by at least one than any previously created node in the sequence. You provide a prefix, like L_ or your own choice, and ZooKeeper creates nodes named L_23, L_24, etc. The sequence number starts at 0 and increases monotonically.

Once you've placed your leadership bid, you search backwards from the sequence number of *your* node to see if there are any preceding (in terms of the sequence number) nodes. When you find one, you place a watch on it and wait for it to disappear. When you get the watch notification, you search again, until you do not find a preceding node; then you know you're the leader. This protocol guarantees that there is at any time only one node that thinks it is the leader. But it does not disseminate information about who is the leader. If you want everyone to know who is the leader, you can have an additional znode whose value is the name of the current leader (or some identifying information on how to contact the leader, etc.). Note that this cannot be done atomically, so by the time other nodes find out who the leader is, the leadership may already have passed on to a different node.

Flavio, might it make sense to provide a standardized implementation of leader election in the library code in Java?

--Jacob

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Flavio Junqueira
Sent: Friday, July 11, 2008 1:02 AM
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Subject: Re: [Zookeeper-user] Leader election

Hi Avinash, getChildren returns a list in lexicographic order, so if you are updating the children of the election node concurrently, then you may get a different first node with different clients. If you are using the sequence flag to create nodes, then you may consider stripping the prefix of the node name and using the suffix value to determine order.

Hope it helps.

-Flavio

----- Original Message -----
From: Avinash Lakshman [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Friday, July 11, 2008 7:20:06 AM
Subject: [Zookeeper-user] Leader election

Hi

I am trying to elect a leader among 50 nodes. There is always one odd guy who seems to think that the leader is someone else, distinct from what the other nodes see as leader. Could someone please tell me what is wrong with the following code for leader election:

    public void electLeader()
    {
        ZooKeeper zk = StorageService.instance().getZooKeeperHandle();
        String path = "/Leader";
        try
        {
            String createPath = path + "/L-";
            LeaderElector.createLock_.lock();
            while( true )
            {
                /* Get all znodes under the Leader znode */
                List<String> values = zk.getChildren(path, false);
                /*
                 * Get the first znode and if it is the
                 * pathCreated created above then the data
                 * in that znode is the leader's identity.
                 */
                if ( leader_ == null )
                {
                    leader_ = new AtomicReference<EndPoint>( EndPoint.fromBytes( zk.getData(path + "/" +
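
The ordering step that Jacob and Flavio describe above can be sketched in plain Java. This is only an illustration under stated assumptions, not code from the thread or from the ZooKeeper library: the class name LeaderBids and its methods are hypothetical, the children list stands in for whatever getChildren returned, and "nearest preceding bid" is one reading of "search backwards from your own sequence number". No ZooKeeper calls are made; the point is that bids are compared by their numeric suffix rather than by list position.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

/**
 * Hypothetical helper: picks the bid to watch from a list of child names.
 * Pure string handling; it does not talk to ZooKeeper.
 */
final class LeaderBids {
    private LeaderBids() {}

    /** Strips the prefix and parses the sequence suffix, e.g. "L_23" gives 23. */
    static long sequenceOf(String node, String prefix) {
        if (!node.startsWith(prefix)) {
            throw new IllegalArgumentException("not a bid node: " + node);
        }
        return Long.parseLong(node.substring(prefix.length()));
    }

    /**
     * Returns the bid with the largest sequence number below ours, which is
     * the node to watch. An empty result means no bid precedes ours.
     */
    static Optional<String> predecessor(List<String> children, String myNode, String prefix) {
        final long mine = sequenceOf(myNode, prefix);
        return children.stream()
                .filter(c -> c.startsWith(prefix))          // ignore unrelated children
                .filter(c -> sequenceOf(c, prefix) < mine)  // keep only earlier bids
                .max(Comparator.comparingLong((String c) -> sequenceOf(c, prefix)));
    }

    /** We lead exactly when no earlier bid exists. */
    static boolean isLeader(List<String> children, String myNode, String prefix) {
        return !predecessor(children, myNode, prefix).isPresent();
    }

    public static void main(String[] args) {
        // Deliberately unsorted: the result must not depend on list order.
        List<String> children = Arrays.asList("L_10", "L_9", "L_2");
        System.out.println(predecessor(children, "L_10", "L_"));
        System.out.println(isLeader(children, "L_2", "L_"));
    }
}
```

In a real client, the node returned by predecessor would be the one to place a watch on, and the check would be repeated after each watch notification, as the protocol text says.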
[jira] Commented: (ZOOKEEPER-80) Document process for client recipe contributions
[ https://issues.apache.org/jira/browse/ZOOKEEPER-80?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12614466#action_12614466 ]

Doug Cutting commented on ZOOKEEPER-80:
---

Whoa! Big issue description! Perhaps you could have gone with a link to the mail archive? Descriptions are included in every message about the issue...

In any case, I think perhaps each recipe deserves its own code-tree, and should hence be a separate contrib module. Perhaps, instead of 'contrib/' these should just be under 'recipes/', with a separate src/, lib/, doc/, build.xml, README.txt, etc. for each? Multiple language implementations would go in different src/ subdirectories. Does that work?

Also, I am -1 for making these subversion-only. Only released software is fully covered by the Apache license. Subversion is for internal exchange by Apache of works-in-progress, not for end users.

Document process for client recipe contributions

    Key: ZOOKEEPER-80
    URL: https://issues.apache.org/jira/browse/ZOOKEEPER-80
    Project: Zookeeper
    Issue Type: Task
    Components: documentation
    Reporter: Patrick Hunt
    Assignee: Patrick Hunt

How do we accept zk client recipe contributions? Initiated by the following discussion on the mailing list:

--
ben reed wrote

Excellent proposal. The only thing I would add is that there should be an English description of the recipe in subversion. That way if someone wanted to do a compatible binding they can do it. If the recipe is on the wiki it would be hard to keep it in sync, so it is important that it is in subversion. My preference would be that the doc would be in the same contrib subdirectory as the source for ease of maintenance.

ben

Patrick Hunt wrote:

James, thanks for the contribution! Tests and everything. :-)

Jacob sent some mail to the list recently (attached) that details a protocol that he's used successfully (and picked up by some zk users). I have a todo item to document this protocol on the recipes wiki page, haven't gotten to it yet. Not sure how/if this matches what you've done but we should sync up (also see below).
https://issues.apache.org/jira/browse/ZOOKEEPER-79

There has been some discussion on client side helper code in the past; however, this is the first contribution. We need to make some decisions and outline what/how we will accept.

1) I think we should have a
   contrib/recipes/{java/{main,test}/org/apache/zookeeper/... ,c/,...}
   hierarchy for contributions that implement recipes, including any helper code

2) We should first document recipes on the wiki, then implement them in the code
   http://wiki.apache.org/hadoop/ZooKeeper/ZooKeeperRecipes
   The code should fully document the api/implementation, and refer to wiki page for protocol specifics.

3) What should we do relative to ZK releases? Are recipes included in a release? Will bugs in recipes hold up a release? My initial thought is that contrib is available through svn, but not included in the release. If users want to access/use this code they will be required to checkout/build themselves. (at least initially)

4) We will not require parity between the various client languages. Currently we support Java/C clients, we will be adding various scripting languages soon. Contributions will be submitted for various clients (James' submission is for java), that will be placed into contrib, if someone else contributes C bindings (etc...) we will add those to contrib/recipes as well.

5) Implementations should strive to implement similar recipe protocols (see 2 above, a good reason to document before implement). There may be multiple, different, protocols each with their own implementation, but for a particular protocol the implementations should be the same.

We may want to stress 5 even more - if multiple client implementations (c/java/...) are participating in a single instance of leader election it will be CRITICAL for them to be inter-operable.

Comments, questions, suggestions?

Patrick

James Strachan wrote:

So having recently discovered ZooKeeper, I'm really liking it - good job folks!

I've seen discussions of building high level features from the core ZK library and had not seen any available on the interweb so figured I'd have a try creating a simple one. Feel free to ignore it if a ZK ninja can think of a neater way of doing it - I've basically followed the protocol defined in the recent ZK presentation...
http://developer.yahoo.com/blogs/hadoop/2008/03/intro-to-zookeeper-video.html

I've submitted the code as a patch here...
https://issues.apache.org/jira/browse/ZOOKEEPER-78

I
Re: Recipe contrib -- was Re: [PATCH] a simple Leader Election or exclusive Write Lock protocol/policy
I was thinking all the recipe implementations would be in a single jar, zookeeper-recipes.jar. Otherwise it's going to be a pain to carry around the right jars - esp. if one recipe depends on another. We saw a similar issue with Pig UDF contribs (originally there were multiple jars).

Patrick

Doug Cutting wrote:

Patrick Hunt wrote:

should we have separate contrib subdirs, one for each recipe or all recipes together? What about shared code, common code for implementing a recipe? Seems a little too much to separate them all at the top level, rather than separating them in packages

Do you want to package them together, in a single jar, or separately, each in their own jar? A tree per jar is standard.

Doug
Re: Recipe contrib -- was Re: [PATCH] a simple Leader Election or exclusive Write Lock protocol/policy
Doug Cutting wrote:

Patrick Hunt wrote:

my original intent was to have it as zookeeper/trunk/src/contrib/... where this directory will have not only recipes but also other contributions.

+1 That's what I meant too. And since you've said you want to package these all as a single jar, then there should just be a single tree per language, right?

Yes, that's right.

    zookeeper/trunk/src/contrib/recipes/src/java/org/apache/zookeeper/recipes/locking/
    zookeeper/trunk/src/contrib/recipes/src/c++/locking/

Right.

I don't see the strong connection between releases and new features. Am I missing something?

I think that the issue is implied connotation of contrib. We were thinking of contrib as interesting stuff that users might want to re-use/share but not core to zookeeper, we'll carry it around so that users know where to find it, if it gets broken and no one fixes it it won't hold up the release.

If I understand correctly you are saying there is no middle ground.

Patrick
Re: Recipe contrib -- was Re: [PATCH] a simple Leader Election or exclusive Write Lock protocol/policy
Patrick Hunt wrote:

We were thinking of contrib as interesting stuff that users might want to re-use/share but not core to zookeeper, we'll carry it around so that users know where to find it, if it gets broken and no one fixes it it won't hold up the release

That's a fine policy for contrib, roughly what Lucene, Hadoop etc. have. The QA, documentation, etc. requirements for contrib are frequently lower. The IP requirements however are equal, since Apache's distributing the code.

Doug
[jira] Updated: (ZOOKEEPER-80) Document process for client recipe contributions
[ https://issues.apache.org/jira/browse/ZOOKEEPER-80?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Hunt updated ZOOKEEPER-80:
--

    Description:
Per Doug's suggestion I'll use a link instead of copy/paste:
http://mail-archives.apache.org/mod_mbox/hadoop-zookeeper-dev/200807.mbox/[EMAIL PROTECTED]

  was:
How do we accept zk client recipe contributions? Initiated by the following discussion on the mailing list:

--
ben reed wrote

Excellent proposal. The only thing I would add is that there should be an English description of the recipe in subversion. That way if someone wanted to do a compatible binding they can do it. If the recipe is on the wiki it would be hard to keep it in sync, so it is important that it is in subversion. My preference would be that the doc would be in the same contrib subdirectory as the source for ease of maintenance.

ben

Patrick Hunt wrote:

James, thanks for the contribution! Tests and everything. :-)

Jacob sent some mail to the list recently (attached) that details a protocol that he's used successfully (and picked up by some zk users). I have a todo item to document this protocol on the recipes wiki page, haven't gotten to it yet. Not sure how/if this matches what you've done but we should sync up (also see below).
https://issues.apache.org/jira/browse/ZOOKEEPER-79

There has been some discussion on client side helper code in the past; however, this is the first contribution. We need to make some decisions and outline what/how we will accept.

1) I think we should have a
   contrib/recipes/{java/{main,test}/org/apache/zookeeper/... ,c/,...}
   hierarchy for contributions that implement recipes, including any helper code

2) We should first document recipes on the wiki, then implement them in the code
   http://wiki.apache.org/hadoop/ZooKeeper/ZooKeeperRecipes
   The code should fully document the api/implementation, and refer to wiki page for protocol specifics.

3) What should we do relative to ZK releases? Are recipes included in a release? Will bugs in recipes hold up a release? My initial thought is that contrib is available through svn, but not included in the release. If users want to access/use this code they will be required to checkout/build themselves. (at least initially)

4) We will not require parity between the various client languages. Currently we support Java/C clients, we will be adding various scripting languages soon. Contributions will be submitted for various clients (James' submission is for java), that will be placed into contrib, if someone else contributes C bindings (etc...) we will add those to contrib/recipes as well.

5) Implementations should strive to implement similar recipe protocols (see 2 above, a good reason to document before implement). There may be multiple, different, protocols each with their own implementation, but for a particular protocol the implementations should be the same.

We may want to stress 5 even more - if multiple client implementations (c/java/...) are participating in a single instance of leader election it will be CRITICAL for them to be inter-operable.

Comments, questions, suggestions?

Patrick

James Strachan wrote:

So having recently discovered ZooKeeper, I'm really liking it - good job folks!

I've seen discussions of building high level features from the core ZK library and had not seen any available on the interweb so figured I'd have a try creating a simple one. Feel free to ignore it if a ZK ninja can think of a neater way of doing it - I've basically followed the protocol defined in the recent ZK presentation...
http://developer.yahoo.com/blogs/hadoop/2008/03/intro-to-zookeeper-video.html

I've submitted the code as a patch here...
https://issues.apache.org/jira/browse/ZOOKEEPER-78

I figured the Java Client might as well come with some helper code to make doing things like exclusive locks or leader elections easier; we could always spin them out into a separate library if and when required etc.
Right now it's one fairly simple class :)

Currently it's a simple class where you can register a Runnable to be invoked when you have the lock; or you can just keep asking if you have the lock now and again as you see fit etc.

    WriteLock locker = new WriteLock(zookeeper, "/foo/bar");
    locker.setWhenOwner(new Runnable() {...}); // fire this code when owner...

    // lets try own it
    locker.acquire();

    // I may or may not have the lock now
    if (locker.isOwner()) {}

    // time passes
    locker.close();

Thoughts?

Subject: Re: [Zookeeper-user] Leader election
From: Jacob Levy [EMAIL PROTECTED]
Date: Fri, 11 Jul 2008 10:42:33 -0700
To: Flavio Junqueira [EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL
[jira] Commented: (ZOOKEEPER-38) headers (version+) in log/snap files
[ https://issues.apache.org/jira/browse/ZOOKEEPER-38?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12614540#action_12614540 ]

Benjamin Reed commented on ZOOKEEPER-38:

I really like how you pulled the code out of the scattered locations around the source tree. I wish you would have pulled them into a couple of classes rather than scattering them into new classes. It seems like it would be really nice to have a Snapshot class and a TxnLog class. I think classes like SerializeUtils, AsyncSnapshotPolicy, Util, FileLogWriter, FileLogProvider, FileDBInfo (perhaps others) could be pulled into these classes.

I agree with Mahadev that the extra interfaces seem overkill for the simple requirements of this Jira. The provider classes are also overkill for this Jira. Perhaps in the future we may need something like that, but I'd rather wait for the need than try and foresee a solution now.

Mahadev and I will take a crack at collapsing these classes down.

headers (version+) in log/snap files

    Key: ZOOKEEPER-38
    URL: https://issues.apache.org/jira/browse/ZOOKEEPER-38
    Project: Zookeeper
    Issue Type: New Feature
    Components: server
    Reporter: Patrick Hunt
    Assignee: Andrew Kornev
    Fix For: 3.0.0
    Attachments: ZOOKEEPER-38.patch, ZOOKEEPER-38.patch, ZOOKEEPER-38.patch

Moved from SourceForge to Apache.
http://sourceforge.net/tracker/index.php?func=detail&aid=1961767&group_id=209147&atid=1008547
[jira] Commented: (ZOOKEEPER-78) added a high level protocol/feature - for easy Leader Election or exclusive Write Lock creation
[ https://issues.apache.org/jira/browse/ZOOKEEPER-78?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12614548#action_12614548 ]

Flavio Paiva Junqueira commented on ZOOKEEPER-78:
-

This is a nice implementation, James. Good job! My two comments are:

1- It might be a good idea to throw the exceptions instead of trying to catch them and retry. You will end up with cleaner code;
2- I'm not sure if it is necessary to add the log4j configuration. Is there a particular reason for including it, or is it there by accident?

added a high level protocol/feature - for easy Leader Election or exclusive Write Lock creation
---

    Key: ZOOKEEPER-78
    URL: https://issues.apache.org/jira/browse/ZOOKEEPER-78
    Project: Zookeeper
    Issue Type: New Feature
    Components: java client
    Affects Versions: 3.0.0
    Reporter: james strachan
    Assignee: james strachan
    Attachments: writeLock_protocol.patch

Here's a patch which adds a little WriteLock helper class for performing leader elections or creating exclusive locks in some directory znode. Note it's an early cut; I'm sure we can improve it over time. The aim is to spare folks from using the low-level ZK API directly by providing a simpler high-level abstraction.
[jira] Commented: (ZOOKEEPER-79) Document jacob's leader election on the wiki recipes page
[ https://issues.apache.org/jira/browse/ZOOKEEPER-79?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12614556#action_12614556 ]

Flavio Paiva Junqueira commented on ZOOKEEPER-79:
-

This idea of having clients watch different nodes when waiting for application events is crucial to avoid the herd effect, and it has applications other than leader election. Ben and I, for example, used a similar idea to implement barriers. With a barrier, clients need to wait until all other clients are done with their part of a computation, and we can use the existence of a znode to express the fact that a particular client hasn't finished.

Document jacob's leader election on the wiki recipes page
-

    Key: ZOOKEEPER-79
    URL: https://issues.apache.org/jira/browse/ZOOKEEPER-79
    Project: Zookeeper
    Issue Type: New Feature
    Components: documentation
    Reporter: Patrick Hunt
    Assignee: Patrick Hunt

The following discussion occurred on the zookeeper-user list. We need to formalize this recipe and document on the wiki recipes page:

-from jacob

Avinash,

The following protocol will help you fix the observed misbehavior. As Flavio points out, you cannot rely on the order of nodes in getChildren; you must use an intrinsic property of each node to determine who is the leader. The protocol devised by Runping Qi and described here will do that.

First of all, when you create child nodes of the node that holds the leadership bids, you must create them with the EPHEMERAL and SEQUENCE flags. ZooKeeper guarantees to give you an ephemeral node named uniquely and with a sequence number larger by at least one than any previously created node in the sequence. You provide a prefix, like L_ or your own choice, and ZooKeeper creates nodes named L_23, L_24, etc. The sequence number starts at 0 and increases monotonically.
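
Flavio's remark on ZOOKEEPER-79 about the herd effect can be made concrete with a small in-memory comparison. This is a sketch under stated assumptions, not ZooKeeper code: the class HerdEffectSketch and its methods are hypothetical, bids are modelled as bare sequence numbers, and a "notification" is simply a client that would have been woken by the deletion of one bid. It contrasts every client watching the shared parent with each client watching only the nearest lower bid.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical model of who gets woken when one bid node is deleted. */
final class HerdEffectSketch {
    private HerdEffectSketch() {}

    /** All clients watch the parent: every surviving client is notified. */
    static List<Integer> notifiedWatchingParent(List<Integer> bids, int deleted) {
        List<Integer> notified = new ArrayList<>();
        for (int bid : bids) {
            if (bid != deleted) {
                notified.add(bid);
            }
        }
        return notified;
    }

    /**
     * Each client watches only the nearest lower bid, so the deleted bid has
     * at most one watcher: the smallest bid greater than it.
     */
    static List<Integer> notifiedWatchingPredecessor(List<Integer> bids, int deleted) {
        Integer watcher = null;
        for (int bid : bids) {
            if (bid > deleted && (watcher == null || bid < watcher)) {
                watcher = bid;
            }
        }
        List<Integer> notified = new ArrayList<>();
        if (watcher != null) {
            notified.add(watcher);
        }
        return notified;
    }

    public static void main(String[] args) {
        List<Integer> bids = Arrays.asList(0, 1, 2, 3);
        // The current leader (bid 0) goes away.
        System.out.println(notifiedWatchingParent(bids, 0).size() + " clients woken via parent watch");
        System.out.println(notifiedWatchingPredecessor(bids, 0).size() + " client woken via predecessor watch");
    }
}
```

With 50 bidders, as in Avinash's setup, the first strategy would wake 49 clients for a single deletion in this model, while the second wakes at most one.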