javadoc for the Write Lock / Leader Election
The other thread was already quite big and covered a wide range of issues, so I thought I'd spin up a little separate thread :)

I've just updated the patch to include better javadoc, which links to embedded HTML documentation describing the protocol. The documentation includes the pseudocode from the online ZooKeeper presentation (that I used), and I've also included the text from ZOOKEEPER-79, which I'm glad to say seems to match up perfectly with the pseudocode I'd used :)

https://issues.apache.org/jira/browse/ZOOKEEPER-78

One thing confused me though; the last paragraph says...

This protocol guarantees that there is at any time only one node that thinks it is the leader. But it does not disseminate information about who is the leader. If you want everyone to know who is the leader, you can have an additional Znode whose value is the name of the current leader (or some identifying information on how to contact the leader, etc.). Note that this cannot be done atomically, so by the time other nodes find out who the leader is, the leadership may already have passed on to a different node.

In the current implementation, WriteLock, each znode can know, whenever it attempts to acquire the lock and doesn't get it, who the owner is. I guess this is only true momentarily, in the split second that the acquire() method is called (i.e. the exact moment getChildren() is called and the lowest value is found). Or is there some other subtle issue I'm not seeing?

I guess we could add a method to WriteLock, if folks wanted: a kinda queryLeader() method where we just use the same algorithm to find who the current leader is. Though I'm not sure how useful knowing who the leader is :). I guess writing the leader's identity to some canonical znode that any other znode can read whenever it wishes is less risky and maybe simpler.

--
James
---
http://macstrac.blogspot.com/

Open Source Integration
http://open.iona.com
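For anyone following along, the "call getChildren() and find the lowest value" step can be sketched without a ZooKeeper connection. This is only an illustration, not the code in the patch: the class name LockOrder and its methods are made up here, and it assumes child names end in the zero-padded sequence number that ZooKeeper appends to sequential nodes, after a '-' separator. Watching the node immediately ahead of your own is what the usual lock recipe does; whether WriteLock does exactly that is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper, not part of the WriteLock patch.
final class LockOrder {
    /** Parses the numeric suffix after the last '-' (assumed naming scheme). */
    static long sequenceOf(String child) {
        return Long.parseLong(child.substring(child.lastIndexOf('-') + 1));
    }

    /** Returns a copy of the children ordered by sequence number, lowest first. */
    static List<String> sorted(List<String> children) {
        List<String> copy = new ArrayList<>(children);
        copy.sort(Comparator.comparingLong(LockOrder::sequenceOf));
        return copy;
    }

    /** The child with the lowest sequence number, or null if there are none. */
    static String owner(List<String> children) {
        return children.isEmpty() ? null : sorted(children).get(0);
    }

    /** The node immediately ahead of mine, or null if mine is the lowest. */
    static String nodeToWatch(List<String> children, String mine) {
        List<String> ordered = sorted(children);
        int i = ordered.indexOf(mine);
        if (i < 0) {
            throw new IllegalArgumentException("own node missing: " + mine);
        }
        return i == 0 ? null : ordered.get(i - 1);
    }

    public static void main(String[] args) {
        // getChildren() returns names in no particular order.
        List<String> children =
                Arrays.asList("x-0000000007", "x-0000000003", "x-0000000005");
        System.out.println("owner = " + owner(children));
        System.out.println("x-0000000007 watches "
                + nodeToWatch(children, "x-0000000007"));
    }
}
```

Whatever owner() returns is only as fresh as the child list it was given, which is the "only true momentarily" caveat raised above.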
RE: javadoc for the Write Lock / Leader Election
Hi James,

The fact that the client's node has another node n ahead of it in the sequence order doesn't mean that the owner of n is aware that it is the lock holder or the leader. This is because operations are propagated asynchronously.

Also, a getChildren() doesn't guarantee that you have the latest list, and it is possible that another node is at the head of the ordered list of nodes at the moment you read the response of getChildren(). This is because getChildren() returns the local state of one server, while the ensemble of servers may be processing, or may even have already decided upon, a change to the list.

The way I understand Jacob's suggestion, a leader client creates a separate node to acknowledge that it is actually aware that it is the leader, and so is ready to perform the role of a leader.

-Flavio

-----Original Message-----

One thing confused me though; the last paragraph says...

This protocol guarantees that there is at any time only one node that thinks it is the leader. But it does not disseminate information about who is the leader. If you want everyone to know who is the leader, you can have an additional Znode whose value is the name of the current leader (or some identifying information on how to contact the leader, etc.). Note that this cannot be done atomically, so by the time other nodes find out who the leader is, the leadership may already have passed on to a different node.

In the current implementation, WriteLock - each znode can know, whenever it attempts to acquire the lock - if it didn't get the lock, who the owner is. I guess this is only true momentarily the split second that the acquire() method is called (i.e. the exact moment the getChildren() is called and the lowest value is found). Or is there some other subtle issue I'm not seeing?

I guess we could add a method to WriteLock - if folks wanted - a kinda queryLeader() method where we just use the same algorithm to find who the current leader is - if folks cared. Though am not sure how useful knowing who the leader is :). Though I guess writing the leader's identity to some canonical znode that any other znode can read whenever it wishes is less risky and maybe simpler.

--
James
---
http://macstrac.blogspot.com/

Open Source Integration
http://open.iona.com
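The acknowledgement idea in Flavio's reply might look roughly like the sketch below. It is an in-memory stand-in only: an AtomicReference plays the part of the separate znode, and the class LeaderAdvert, its methods, and the contact string are all hypothetical. The point it tries to show is that the leader writes the advert itself, only after its own view says it is at the head, and that readers get a hint rather than a guarantee.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical in-memory stand-in; a real version would write to a znode.
final class LeaderAdvert {
    // Plays the role of the data stored in the separate "who is leader" node.
    private final AtomicReference<String> advertised = new AtomicReference<>();

    /** Parses the numeric suffix after the last '-' (assumed naming scheme). */
    static long seq(String node) {
        return Long.parseLong(node.substring(node.lastIndexOf('-') + 1));
    }

    /** True if 'mine' is present and no child has a lower sequence number. */
    static boolean isLowest(List<String> children, String mine) {
        if (!children.contains(mine)) {
            return false;
        }
        for (String child : children) {
            if (seq(child) < seq(mine)) {
                return false;
            }
        }
        return true;
    }

    /**
     * Called by a client on its own behalf: it advertises itself only when
     * the child list it has seen puts its own node at the head.
     */
    boolean acknowledge(List<String> childrenSeenByMe, String myNode, String contactInfo) {
        if (!isLowest(childrenSeenByMe, myNode)) {
            return false;
        }
        advertised.set(contactInfo);
        return true;
    }

    /** A hint only: leadership may have moved on by the time it is used. */
    Optional<String> hint() {
        return Optional.ofNullable(advertised.get());
    }

    public static void main(String[] args) {
        LeaderAdvert advert = new LeaderAdvert();
        List<String> children = Arrays.asList("x-0000000004", "x-0000000002");
        System.out.println(advert.acknowledge(children, "x-0000000004", "host-b:9000"));
        System.out.println(advert.acknowledge(children, "x-0000000002", "host-a:9000"));
        System.out.println(advert.hint().orElse("nobody has acknowledged yet"));
    }
}
```

Nothing here makes the advert atomic with the election itself, so a reader still has to cope with contacting a node that has since lost leadership.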
Re: javadoc for the Write Lock / Leader Election
Thanks for the clarification. I think it makes lots of sense for the leader to write to some canonical place to advertise itself, if others are interested in knowing whether it is the leader.

2008/7/18 Flavio Junqueira [EMAIL PROTECTED]:

Hi James,

The fact that the client's node has another node n ahead of it in the sequence order doesn't mean that the owner of n is aware that it is the lock holder or the leader. This is because operations are propagated asynchronously.

Also, a getChildren() doesn't guarantee that you have the latest list, and it is possible that another node is at the head of the ordered list of nodes at the moment you read the response of getChildren(). This is because getChildren() returns the local state of one server, while the ensemble of servers may be processing, or may even have already decided upon, a change to the list.

The way I understand Jacob's suggestion, a leader client creates a separate node to acknowledge that it is actually aware that it is the leader, and so is ready to perform the role of a leader.

-Flavio

-----Original Message-----

One thing confused me though; the last paragraph says...

This protocol guarantees that there is at any time only one node that thinks it is the leader. But it does not disseminate information about who is the leader. If you want everyone to know who is the leader, you can have an additional Znode whose value is the name of the current leader (or some identifying information on how to contact the leader, etc.). Note that this cannot be done atomically, so by the time other nodes find out who the leader is, the leadership may already have passed on to a different node.

In the current implementation, WriteLock - each znode can know, whenever it attempts to acquire the lock - if it didn't get the lock, who the owner is. I guess this is only true momentarily the split second that the acquire() method is called (i.e. the exact moment the getChildren() is called and the lowest value is found). Or is there some other subtle issue I'm not seeing?

I guess we could add a method to WriteLock - if folks wanted - a kinda queryLeader() method where we just use the same algorithm to find who the current leader is - if folks cared. Though am not sure how useful knowing who the leader is :). Though I guess writing the leader's identity to some canonical znode that any other znode can read whenever it wishes is less risky and maybe simpler.

--
James
---
http://macstrac.blogspot.com/

Open Source Integration
http://open.iona.com

--
James
---
http://macstrac.blogspot.com/

Open Source Integration
http://open.iona.com
[jira] Updated: (ZOOKEEPER-78) added a high level protocol/feature - for easy Leader Election or exclusive Write Lock creation
[ https://issues.apache.org/jira/browse/ZOOKEEPER-78?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

james strachan updated ZOOKEEPER-78:

    Attachment: (was: writeLock_protocol_with_documentation-version2.patch)

added a high level protocol/feature - for easy Leader Election or exclusive Write Lock creation
---
Key: ZOOKEEPER-78
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-78
Project: Zookeeper
Issue Type: New Feature
Components: java client
Affects Versions: 3.0.0
Reporter: james strachan
Assignee: james strachan
Attachments: writeLock_protocol_version3.patch

Here's a patch which adds a little WriteLock helper class for performing leader elections or creating exclusive locks in some directory znode. Note it's an early cut; am sure we can improve it over time. The aim is to avoid folks having to use the low level ZK stuff but provide a simpler high level abstraction.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-78) added a high level protocol/feature - for easy Leader Election or exclusive Write Lock creation
[ https://issues.apache.org/jira/browse/ZOOKEEPER-78?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

james strachan updated ZOOKEEPER-78:

    Attachment: (was: writeLock_protocol.patch)

added a high level protocol/feature - for easy Leader Election or exclusive Write Lock creation
---
Key: ZOOKEEPER-78
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-78
Project: Zookeeper
Issue Type: New Feature
Components: java client
Affects Versions: 3.0.0
Reporter: james strachan
Assignee: james strachan
Attachments: writeLock_protocol_version3.patch

Here's a patch which adds a little WriteLock helper class for performing leader elections or creating exclusive locks in some directory znode. Note it's an early cut; am sure we can improve it over time. The aim is to avoid folks having to use the low level ZK stuff but provide a simpler high level abstraction.
[jira] Commented: (ZOOKEEPER-78) added a high level protocol/feature - for easy Leader Election or exclusive Write Lock creation
[ https://issues.apache.org/jira/browse/ZOOKEEPER-78?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12614748#action_12614748 ]

james strachan commented on ZOOKEEPER-78:

BTW I just deleted the other 2 patches to avoid confusion; the latest patch includes the previous changes etc.

added a high level protocol/feature - for easy Leader Election or exclusive Write Lock creation
---
Key: ZOOKEEPER-78
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-78
Project: Zookeeper
Issue Type: New Feature
Components: java client
Affects Versions: 3.0.0
Reporter: james strachan
Assignee: james strachan
Attachments: writeLock_protocol_version3.patch

Here's a patch which adds a little WriteLock helper class for performing leader elections or creating exclusive locks in some directory znode. Note it's an early cut; am sure we can improve it over time. The aim is to avoid folks having to use the low level ZK stuff but provide a simpler high level abstraction.
Re: An interest in increasing the DI'ness of ZooKeeper?
+1 :)

I'm a fellow ActiveMQ hacker too and would love to see ZK included with ActiveMQ. Dependency Injection can really help keep your code simple while leaving it flexible, so it can be used in many different ways.

Here are some links on DI:

http://martinfowler.com/articles/injection.html
http://www.theserverside.com/tt/articles/article.tss?l=SpringFramework

2008/7/18 Hiram Chirino [EMAIL PROTECTED]:

Hi Guys,

First off, great project! I think ZooKeeper is a fabulous idea. I can see folks wanting to embed ZK servers in their products too. I could see the ActiveMQ project embedding it for several reasons. And with that in mind, I think it would be awesome if ZK tried to use more dependency injection (DI) to configure its objects. That way an embedding project could directly configure it with java code, or use Spring or Guice etc. etc.

If you guys are interested in supporting this use case, I'd be happy to start contributing patches to make that happen.

--
Regards,
Hiram

Blog: http://hiramchirino.com

Open Source SOA
http://open.iona.com

--
James
---
http://macstrac.blogspot.com/

Open Source Integration
http://open.iona.com
[jira] Updated: (ZOOKEEPER-81) JMX module is using 1 java 6 method that has a java 5 equivalent
[ https://issues.apache.org/jira/browse/ZOOKEEPER-81?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hiram Chirino updated ZOOKEEPER-81:

    Attachment: ZOOKEEPER-81.patch

attaching patch for the fix.

JMX module is using 1 java 6 method that has a java 5 equivalent
---
Key: ZOOKEEPER-81
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-81
Project: Zookeeper
Issue Type: Improvement
Components: server
Reporter: Hiram Chirino
Attachments: ZOOKEEPER-81.patch

It would be nice if the jmx module compiled and ran on java 5 too.
[jira] Created: (ZOOKEEPER-81) JMX module is using 1 java 6 method that has a java 5 equivalent
JMX module is using 1 java 6 method that has a java 5 equivalent
---
Key: ZOOKEEPER-81
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-81
Project: Zookeeper
Issue Type: Improvement
Components: server
Reporter: Hiram Chirino
Attachments: ZOOKEEPER-81.patch

It would be nice if the jmx module compiled and ran on java 5 too.
[jira] Updated: (ZOOKEEPER-81) JMX module is using 1 java 6 method that has a java 5 equivalent
[ https://issues.apache.org/jira/browse/ZOOKEEPER-81?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hiram Chirino updated ZOOKEEPER-81:

    Status: Patch Available (was: Open)

{code}
Index: src/java/jmx/org/apache/zookeeper/jmx/MBeanRegistry.java
===================================================================
--- src/java/jmx/org/apache/zookeeper/jmx/MBeanRegistry.java (revision 677957)
+++ src/java/jmx/org/apache/zookeeper/jmx/MBeanRegistry.java (working copy)
@@ -143,7 +143,7 @@
     private int tokenize(StringBuilder sb, String path, int index){
         String[] tokens = path.split("/");
         for (String s: tokens) {
-            if (s.isEmpty())
+            if (s.length()==0)
                 continue;
             sb.append("name").append(index++)
                     .append("=").append(s).append(",");
{code}

JMX module is using 1 java 6 method that has a java 5 equivalent
---
Key: ZOOKEEPER-81
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-81
Project: Zookeeper
Issue Type: Improvement
Components: server
Reporter: Hiram Chirino
Attachments: ZOOKEEPER-81.patch

It would be nice if the jmx module compiled and ran on java 5 too.
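To make the one-line change easier to try out, here is a standalone sketch of the patched method. String.isEmpty() only exists since Java 6, while length() == 0 behaves the same and is available on Java 5. The wrapper class TokenizeDemo is made up for illustration, and the trailing "return index;" is an assumption, since the diff hunk ends before the method does.

```java
// Hypothetical wrapper class; only the method body mirrors the patch.
final class TokenizeDemo {
    /**
     * Appends "name<i>=<token>," for every non-empty path segment and
     * returns the next free index (the return value is assumed, see above).
     */
    static int tokenize(StringBuilder sb, String path, int index) {
        String[] tokens = path.split("/");
        for (String s : tokens) {
            // length() == 0 is the Java 5 friendly spelling of isEmpty()
            if (s.length() == 0) {
                continue;
            }
            sb.append("name").append(index++)
              .append("=").append(s).append(",");
        }
        return index;
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        // Leading and doubled slashes produce empty tokens, which are skipped.
        int next = tokenize(sb, "/a//b", 0);
        System.out.println(sb + " next=" + next);
    }
}
```

As the later comments on this issue point out, this change alone may not be enough if the module also depends on Java 6 only features.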
[jira] Commented: (ZOOKEEPER-81) JMX module is using 1 java 6 method that has a java 5 equivalent
[ https://issues.apache.org/jira/browse/ZOOKEEPER-81?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12614780#action_12614780 ]

Mahadev konar commented on ZOOKEEPER-81:

I think the build.xml only allows building of jmx with 1.6, doesn't it?

JMX module is using 1 java 6 method that has a java 5 equivalent
---
Key: ZOOKEEPER-81
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-81
Project: Zookeeper
Issue Type: Improvement
Components: server
Reporter: Hiram Chirino
Attachments: ZOOKEEPER-81.patch

It would be nice if the jmx module compiled and ran on java 5 too.
auto-reconnection ZooKeeper proxy?
<background>

I work on the ActiveMQ project, which implements the JMS API. That's a kinda complex thing, but it involves a number of objects (Connections, Sessions, Producers, Consumers). In some JMS providers it's the end user's responsibility to detect a connection failure (as opposed to any other kind of error) and then recreate all the dependent objects. We added support for auto-reconnection, which greatly simplifies the developer's life; it lets the JMS client automatically deal with any socket failures, reconnecting to a broker for you and re-establishing all of those in-flight operations (subscriptions, in-progress sends and so forth).

http://activemq.apache.org/how-can-i-support-auto-reconnection.html

Having seen the value of wrapping up the auto-reconnection within a proxy, am thinking it's also got merits on ZK.

</background>

As we start creating protocols/recipes that implement higher order features like locks, leader elections and so forth, we could probably do with some kinda auto-reconnecting facade to ZooKeeper, just to simplify the implementation code of protocols/recipes.

It's a kinda complex area though, and I'm sure different protocols will want different things; but even for something as simple as a lock I can see the value in an auto-reconnecting proxy. e.g. there are already 5 different method calls in the current WriteLock implementation which all really need a custom try/catch around them to detect loss of the connection, which then should be wrapped in reconnect-retry logic.

What to do about watches is interesting; for now the current behaviour seems fine (fire them all, forcing a re-watch), though in the future we could re-enable watches in the new server connection as an option.

All I'm thinking about for now is a kinda ReconnectingZooKeeper which looks like a ZooKeeper object but which internally catches dead connections and then tries to reconnect to one of the ZK servers under the covers, retrying the current read/write operation until the ReconnectPolicy says to fail. e.g. some folks might wanna retry connecting forever; others for a certain amount of time or a certain number of attempts etc.

So something like...

    public class ReconnectingZooKeeper extends ZooKeeper {
        ...
        // for each method that reads/writes synchronously
        public Stat exists(String path) {
            ...
            boolean retry = true;
            for (int count = 0; retry; count++) {
                try {
                    // really do the method call!
                    return super.exists(path);
                } catch (ConnectionClosedException e) {
                    // lets let any watches or listeners respond to
                    // connection loss first before we retry
                    fireAnyWatchesAndStuff();
                    if (!shouldRetry(count)) {
                        throw e;
                    }
                }
            }
        }
    }

Any watches should fire when a connection is lost, and all writes should be replicated to the new server we connect to, right?

So I'm thinking, if we had a ReconnectingZooKeeper implementation, we could use it with the current WriteLock implementation so that the protocol could survive ZK server loss and reconnection while still working. e.g. on connection loss the leader/lock owner needs to lose the lock until it gets it back, just in case; but other than that I think it should work.

Am sure there are some gremlins somewhere in automatically reconnecting; though provided the watch mechanism works, clients will be able to do the right thing I think.

Thoughts?

--
James
---
http://macstrac.blogspot.com/

Open Source Integration
http://open.iona.com
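To make the ReconnectPolicy idea a bit more concrete, here is a self-contained sketch of just the retry loop, with no ZooKeeper dependency. Everything in it is hypothetical: ConnectionLostException stands in for whatever exception the client raises on a dead connection, and Retry, ReconnectPolicy and maxAttempts are names made up for this example. It does not reconnect, back off, or fire watches; it only shows where the policy decision sits.

```java
import java.util.concurrent.Callable;

// Hypothetical sketch of the retry loop only; not a ZooKeeper API.
final class Retry {
    /** Stand-in for the exception that signals a lost connection. */
    static final class ConnectionLostException extends Exception {
        ConnectionLostException(String message) {
            super(message);
        }
    }

    /** Decides whether to try again after the given number of failures. */
    interface ReconnectPolicy {
        boolean shouldRetry(int failedAttempts);
    }

    /** Gives up once 'max' attempts have failed. */
    static ReconnectPolicy maxAttempts(int max) {
        return failed -> failed < max;
    }

    /** Runs the operation, retrying on connection loss while the policy allows. */
    static <T> T call(Callable<T> op, ReconnectPolicy policy) throws Exception {
        int failed = 0;
        while (true) {
            try {
                return op.call();
            } catch (ConnectionLostException e) {
                failed++;
                if (!policy.shouldRetry(failed)) {
                    throw e;
                }
                // A real client would reconnect (and probably back off) here.
            }
        }
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated operation: the first two calls lose the connection.
        String result = call(() -> {
            if (++calls[0] < 3) {
                throw new ConnectionLostException("connection lost");
            }
            return "ok";
        }, maxAttempts(5));
        System.out.println(result + " after " + calls[0] + " calls");
    }
}
```

A "retry forever" policy would simply return true every time, and a time-based one could compare against a deadline instead of a count.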
[jira] Commented: (ZOOKEEPER-81) JMX module is using 1 java 6 method that has a java 5 equivalent
[ https://issues.apache.org/jira/browse/ZOOKEEPER-81?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12614787#action_12614787 ]

Andrew Kornev commented on ZOOKEEPER-81:

The JMX instrumentation code relies on the MXBean feature, which is available only since Java 6. The build.xml conditionally includes the JMX code only when compiled under Java 6.

JMX module is using 1 java 6 method that has a java 5 equivalent
---
Key: ZOOKEEPER-81
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-81
Project: Zookeeper
Issue Type: Improvement
Components: server
Reporter: Hiram Chirino
Attachments: ZOOKEEPER-81.patch

It would be nice if the jmx module compiled and ran on java 5 too.
Re: An interest in increasing the DI'ness of ZooKeeper?
Yep, I've looked at the test cases. In short, to make that public API more DI friendly, we should:

* Decouple the current configuration system from the public API. I see stuff like ZooKeeperServer being coupled to ServerConfig a bit.
* Allow the use of setter injection in addition to constructor injection. This is the most important thing needed to let Spring more easily configure the objects.

Regards,
Hiram

On Fri, Jul 18, 2008 at 12:53 PM, Mahadev Konar [EMAIL PROTECTED] wrote:

Hi Hiram,

Thanks for your feedback. It's great to hear from our users. About your question regarding injecting zookeeper servers in applications, we do have public APIs that support creating zookeeper servers in an embedding application. Take a look at our test cases, where we create zookeeper servers via the public api. Is this what you were looking for, or did I misunderstand the reference?

Mahadev

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Hiram Chirino
Sent: Friday, July 18, 2008 9:07 AM
To: zookeeper-dev@hadoop.apache.org
Subject: An interest in increasing the DI'ness of ZooKeeper?

Hi Guys,

First off, great project! I think ZooKeeper is a fabulous idea. I can see folks wanting to embed ZK servers in their products too. I could see the ActiveMQ project embedding it for several reasons. And with that in mind, I think it would be awesome if ZK tried to use more dependency injection (DI) to configure its objects. That way an embedding project could directly configure it with java code, or use Spring or Guice etc. etc.

If you guys are interested in supporting this use case, I'd be happy to start contributing patches to make that happen.

--
Regards,
Hiram

Blog: http://hiramchirino.com

Open Source SOA
http://open.iona.com

--
Regards,
Hiram

Blog: http://hiramchirino.com

Open Source SOA
http://open.iona.com
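As a rough picture of what "setter injection in addition to constructor injection" means, here is a small sketch. ServerBean is a made-up class, not ZooKeeperServer, and its properties (dataDir, clientPort) are only illustrative; a container such as Spring would typically want a public class with a no-arg constructor and public setters, which is simplified here.

```java
// Hypothetical bean for illustration; not the ZooKeeper server API.
final class ServerBean {
    private String dataDir;
    private int clientPort = 2181; // illustrative default
    private boolean running;

    /** No-arg constructor so a container can create it and call setters. */
    ServerBean() {
    }

    /** Constructor injection: everything supplied up front. */
    ServerBean(String dataDir, int clientPort) {
        this.dataDir = dataDir;
        this.clientPort = clientPort;
    }

    public void setDataDir(String dataDir) {
        this.dataDir = dataDir;
    }

    public void setClientPort(int clientPort) {
        this.clientPort = clientPort;
    }

    public String getDataDir() {
        return dataDir;
    }

    public int getClientPort() {
        return clientPort;
    }

    public boolean isRunning() {
        return running;
    }

    /** Validates at start time, since setters may be called in any order. */
    public void start() {
        if (dataDir == null) {
            throw new IllegalStateException("dataDir must be set before start()");
        }
        running = true;
    }

    public static void main(String[] args) {
        ServerBean viaConstructor = new ServerBean("/tmp/zk-a", 2181);
        viaConstructor.start();

        ServerBean viaSetters = new ServerBean();
        viaSetters.setDataDir("/tmp/zk-b");
        viaSetters.setClientPort(2182);
        viaSetters.start();

        System.out.println(viaConstructor.isRunning() && viaSetters.isRunning());
    }
}
```

Neither path touches a config file parser, which is the decoupling the first bullet asks for.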
Re: An interest in increasing the DI'ness of ZooKeeper?
Yeah, mainly. Ideally only your main class deals with configuration parsing, and it constructs all the zk server objects using the public api, which is DI friendly. For example, I think we should move the main() method out of the ZooKeeperServer class.

On Fri, Jul 18, 2008 at 1:11 PM, Benjamin Reed [EMAIL PROTECTED] wrote:

Can you be a bit more specific about what kind of injection you are talking about? Are you just talking about the server configuration?

thanx
ben

Hiram Chirino wrote:

Hi Guys,

First off, great project! I think ZooKeeper is a fabulous idea. I can see folks wanting to embed ZK servers in their products too. I could see the ActiveMQ project embedding it for several reasons. And with that in mind, I think it would be awesome if ZK tried to use more dependency injection (DI) to configure its objects. That way an embedding project could directly configure it with java code, or use Spring or Guice etc. etc.

If you guys are interested in supporting this use case, I'd be happy to start contributing patches to make that happen.

--
Regards,
Hiram

Blog: http://hiramchirino.com

Open Source SOA
http://open.iona.com
Re: An interest in increasing the DI'ness of ZooKeeper?
This sounds great. I would suggest opening a Jira to work out the proposal and track the patch.

ben

Hiram Chirino wrote:

Yep, I've looked at the test cases. In short, to make that public API more DI friendly, we should:

* Decouple the current configuration system from the public API. I see stuff like ZooKeeperServer being coupled to ServerConfig a bit.
* Allow the use of setter injection in addition to constructor injection. This is the most important thing needed to let Spring more easily configure the objects.

Regards,
Hiram

On Fri, Jul 18, 2008 at 12:53 PM, Mahadev Konar [EMAIL PROTECTED] wrote:

Hi Hiram,

Thanks for your feedback. It's great to hear from our users. About your question regarding injecting zookeeper servers in applications, we do have public APIs that support creating zookeeper servers in an embedding application. Take a look at our test cases, where we create zookeeper servers via the public api. Is this what you were looking for, or did I misunderstand the reference?

Mahadev

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Hiram Chirino
Sent: Friday, July 18, 2008 9:07 AM
To: zookeeper-dev@hadoop.apache.org
Subject: An interest in increasing the DI'ness of ZooKeeper?

Hi Guys,

First off, great project! I think ZooKeeper is a fabulous idea. I can see folks wanting to embed ZK servers in their products too. I could see the ActiveMQ project embedding it for several reasons. And with that in mind, I think it would be awesome if ZK tried to use more dependency injection (DI) to configure its objects. That way an embedding project could directly configure it with java code, or use Spring or Guice etc. etc.

If you guys are interested in supporting this use case, I'd be happy to start contributing patches to make that happen.

--
Regards,
Hiram

Blog: http://hiramchirino.com

Open Source SOA
http://open.iona.com
Re: Recipe contrib -- was Re: [PATCH] a simple Leader Election or exclusive Write Lock protocol/policy
Some initial implementations of a recipe may only be in C, so it would be nice to have a standard way of finding the recipe that isn't dependent on the language that implements it.

ben

James Strachan wrote:

2008/7/17 Benjamin Reed [EMAIL PROTECTED]:

Excellent proposal. The only thing I would add is that there should be an English description of the recipe in subversion. That way, if someone wanted to do a compatible binding, they can do it. If the recipe is on the wiki it would be hard to keep it in sync, so it is important that it is in subversion. My preference would be that the doc would be in the same contrib subdirectory as the source for ease of maintenance.

Good idea. How about for Java recipes we include the documentation as HTML with the javadoc, so we can link to it easily and so that the recipe is kept with the code and versioned nicely (so as the recipe/algorithm changes we version it with the source code etc.)?