[jira] Commented: (ZOOKEEPER-22) Automatic request retries on connect failover

2008-07-24 Thread james strachan (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-22?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12616374#action_12616374
 ] 

james strachan commented on ZOOKEEPER-22:
-

BTW this discussion came up recently on the dev lists too...

http://mail-archives.apache.org/mod_mbox/hadoop-zookeeper-dev/200807.mbox/[EMAIL
 PROTECTED]


To be able to retry operations on connection close (or due to session 
expiration) there is a patch in 
https://issues.apache.org/jira/browse/ZOOKEEPER-78

which adds a ZooKeeperFacade for dealing with reconnecting on session 
expiration, and some helper methods in ProtocolSupport for retrying synchronous 
operations or blocks of code in light of connection failures.
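
To make the idea of retrying a synchronous operation concrete, here is a 
minimal sketch. It is not the ProtocolSupport code from the patch: 
ConnectionLostException, RetrySketch and the retry() signature are all 
hypothetical names invented for illustration, and the fixed delay between 
attempts is just an assumed policy.

```java
import java.util.concurrent.Callable;

/** Hypothetical stand-in for a connection-loss error; not a ZooKeeper class. */
class ConnectionLostException extends Exception {
    ConnectionLostException(String message) {
        super(message);
    }
}

/** Sketch of a retry helper; names and policy are illustrative only. */
final class RetrySketch {
    private RetrySketch() {}

    /**
     * Runs the operation, retrying up to maxAttempts times when it fails
     * with ConnectionLostException, sleeping delayMillis between attempts.
     * Any other exception is propagated immediately.
     */
    static <T> T retry(Callable<T> operation, int maxAttempts, long delayMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        ConnectionLostException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return operation.call();
            } catch (ConnectionLostException e) {
                last = e; // remember the failure and try again
                if (attempt < maxAttempts) {
                    Thread.sleep(delayMillis); // give the client time to fail over
                }
            }
        }
        throw last; // every attempt failed
    }
}
```

One caveat with any helper like this: blindly retrying is only safe for 
operations that can be repeated without harm, since the first attempt may have 
reached the server before the connection dropped.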

 Automatic request retries on connect failover
 -

 Key: ZOOKEEPER-22
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-22
 Project: Zookeeper
  Issue Type: New Feature
  Components: c client, java client
Reporter: Patrick Hunt

 Moved from SourceForge to Apache.
 http://sourceforge.net/tracker/index.php?func=detail&aid=1831412&group_id=209147&atid=1008547

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (ZOOKEEPER-88) implement java.util.concurrent.locks.Lock

2008-07-24 Thread james strachan (JIRA)
implement java.util.concurrent.locks.Lock
-

 Key: ZOOKEEPER-88
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-88
 Project: Zookeeper
  Issue Type: Sub-task
  Components: java client
Reporter: james strachan


We should implement 
[java.util.concurrent.locks.Lock|http://java.sun.com/j2se/1.5.0/docs/api/java/util/concurrent/locks/Lock.html]
 to make it easier for folks to reuse the Lock and to make the API more 
natural to end users.
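
As a rough illustration of what implementing the interface involves, here is a 
sketch of an adapter that fills in all six Lock methods. It is not the patch 
for this issue: LockAdapterSketch is a hypothetical name, and an in-memory 
Semaphore stands in for whatever znode-based protocol the real WriteLock uses.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.Lock;

/** Sketch: a Semaphore stands in for the znode-based lock protocol. */
final class LockAdapterSketch implements Lock {
    // One permit means at most one holder. Unlike ReentrantLock this is not
    // reentrant and does not check that the caller of unlock() is the holder.
    private final Semaphore permit = new Semaphore(1);

    @Override
    public void lock() {
        permit.acquireUninterruptibly();
    }

    @Override
    public void lockInterruptibly() throws InterruptedException {
        permit.acquire();
    }

    @Override
    public boolean tryLock() {
        return permit.tryAcquire();
    }

    @Override
    public boolean tryLock(long time, TimeUnit unit) throws InterruptedException {
        return permit.tryAcquire(time, unit);
    }

    @Override
    public void unlock() {
        permit.release();
    }

    @Override
    public Condition newCondition() {
        // The Lock contract allows this when conditions are not supported.
        throw new UnsupportedOperationException("conditions not supported in this sketch");
    }
}
```

Callers can then use the usual lock(); try { ... } finally { unlock(); } idiom 
without knowing what sits behind the interface.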




[jira] Commented: (ZOOKEEPER-78) added a high level protocol/feature - for easy Leader Election or exclusive Write Lock creation

2008-07-24 Thread james strachan (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-78?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12616416#action_12616416
 ] 

james strachan commented on ZOOKEEPER-78:
-

Just added the WhenOwnerListener interface: 
http://svn.apache.org/viewvc?view=rev&revision=679325
I just need to figure out how to add notifications of loss of owner/leader 
status when the connection fails or the session expires etc.
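
A minimal sketch of how such notifications might be wired up follows. Only the 
names WhenOwnerListener and whenNotOwner() come from this thread; the 
whenOwner() method, the OwnershipTracker class and its acquired()/lost() hooks 
are assumptions made for illustration and are not taken from the committed 
code.

```java
/** Sketch only: whenNotOwner() is named in the thread; whenOwner() is assumed. */
interface WhenOwnerListener {
    void whenOwner();

    void whenNotOwner();
}

/** Hypothetical tracker that notifies the listener only on real transitions. */
final class OwnershipTracker {
    private final WhenOwnerListener listener;
    private boolean owner;

    OwnershipTracker(WhenOwnerListener listener) {
        this.listener = listener;
    }

    /** Call when the lock or leadership has been acquired. */
    synchronized void acquired() {
        if (!owner) {
            owner = true;
            listener.whenOwner();
        }
    }

    /** Call on unlock, connection failure or session expiry. */
    synchronized void lost() {
        if (owner) {
            owner = false;
            listener.whenNotOwner();
        }
    }

    synchronized boolean isOwner() {
        return owner;
    }
}
```

Funnelling every "no longer owner" cause through one lost() method means the 
listener sees a single whenNotOwner() call even if a connection failure and a 
session expiry are reported back to back.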

 added a high level protocol/feature - for easy Leader Election or exclusive 
 Write Lock creation
 ---

 Key: ZOOKEEPER-78
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-78
 Project: Zookeeper
  Issue Type: New Feature
  Components: java client
Affects Versions: 3.0.0
Reporter: james strachan
 Attachments: patch_with_including_Benjamin's_fix.patch, 
 using_zookeeper_facade.patch


 Here's a patch which adds a little WriteLock helper class for performing 
 leader elections or creating exclusive locks in some directory znode. Note 
 it's an early cut; I'm sure we can improve it over time. The aim is to avoid 
 folks having to use the low level ZK stuff but provide a simpler high level 
 abstraction.




[jira] Created: (ZOOKEEPER-89) invoke WhenOwnerListener.whenNotOwner() when the ZK connection fails

2008-07-24 Thread james strachan (JIRA)
invoke WhenOwnerListener.whenNotOwner() when the ZK connection fails


 Key: ZOOKEEPER-89
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-89
 Project: Zookeeper
  Issue Type: Sub-task
Reporter: james strachan







[jira] Created: (ZOOKEEPER-90) invoke WhenOwnerListener.whenNotOwner() when the ZK session expires and the znode is the leader

2008-07-24 Thread james strachan (JIRA)
invoke WhenOwnerListener.whenNotOwner() when the ZK session expires and the 
znode is the leader
---

 Key: ZOOKEEPER-90
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-90
 Project: Zookeeper
  Issue Type: Sub-task
Reporter: james strachan







[jira] Commented: (ZOOKEEPER-84) provide a mechanism to reconnect a ZooKeeper if a client receives a SessionExpiredException

2008-07-24 Thread james strachan (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-84?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12616420#action_12616420
 ] 

james strachan commented on ZOOKEEPER-84:
-

BTW here is the code for 
[ZooKeeperFacade|http://svn.apache.org/viewvc/activemq/sandbox/zookeeper/zookeeper-protocols/src/main/java/org/apache/zookeeper/protocols/ZooKeeperFacade.java?view=markup]
 as I've checked in the patch for ZOOKEEPER-78 into a temporary sandbox area, 
[details 
here|https://issues.apache.org/jira/browse/ZOOKEEPER-78?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12616391#action_12616391]

 provide a mechanism to reconnect a ZooKeeper if a client receives a 
 SessionExpiredException
 ---

 Key: ZOOKEEPER-84
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-84
 Project: Zookeeper
  Issue Type: Improvement
  Components: java client
Reporter: james strachan
Assignee: Benjamin Reed
 Attachments: reconnect_patch.patch


 am about to attach a patch which adds a reconnect() method to easily 
 re-establish a connection if a session expires - along with a toString() 
 implementation for easier debugging




[jira] Commented: (ZOOKEEPER-63) Race condition in client close() operation

2008-07-24 Thread james strachan (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-63?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12616426#action_12616426
 ] 

james strachan commented on ZOOKEEPER-63:
-

So this patch does not attempt to fix the race condition problem; apologies if 
I gave that impression :)

What it does do, though, is act as a workaround: if a client is not able to 
properly send a disconnect packet to the server for *any reason at all*, such as

* a hung socket (which can be quite common)
* no servers available
* a race condition in the ZK client code of some kind (which we definitely have 
now)

then the client application does not hang forever - it's trying to close and 
shut down anyway :). So it's a side benefit that it acts as a band-aid until 
someone fixes all the possible race conditions and potential socket hangs.

Let me put it another way. Given that the client is closing, is it really 
correct to leave it potentially hanging around forever just because it cannot 
be sure if the disconnect packet was received and properly processed by the 
server? If the socket is dead before the call to close(), is it really correct 
to block until a connection can be re-established, just so it can be properly 
closed - when the code will effectively close the hung socket without sending a 
disconnect packet anyway :) ?

The server has to detect and time out failed sessions, whether it receives an 
explicit disconnect packet or not (as a process could just hang). So do we 
really need to be super strict on the client side, forcing clients to block 
when trying to shut down if they can't do so cleanly within some time period?

I totally agree that we should fix the race condition though :). I just wanted 
a workaround to avoid my ZK test cases hanging forever due to the race 
condition :)

 Race condition in client close() operation
 --

 Key: ZOOKEEPER-63
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-63
 Project: Zookeeper
  Issue Type: Bug
  Components: java client
Reporter: Patrick Hunt
Assignee: Benjamin Reed
 Attachments: patch_ZOOKEEPER-63.patch


 There is a race condition in the java close operation on ZooKeeper.java.
 Client is sending a disconnect request to the server. Server will close any 
 open connections with the client when it receives this. If the client has not 
 yet shut down its subthreads (event/send threads for example) these threads 
 may consider the condition an error. We see this a lot in the tests where the 
 clients output error logs because they are unaware that a disconnection has 
 been requested by the client.
 Ben mentioned: perhaps we just have to change state to closed (on client) 
 before sending disconnect request.
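
 The ordering Ben suggests can be sketched roughly as below. This is not the 
 ZooKeeper client's real state machine: CloseOrderingSketch, its State enum 
 and the sendDisconnect hook are hypothetical names used only to show why 
 flipping the state first lets the other threads tell an expected disconnect 
 from an error.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Sketch of the ordering idea only; not the ZooKeeper client's real code. */
final class CloseOrderingSketch {
    enum State { CONNECTED, CLOSING, CLOSED }

    private final AtomicReference<State> state =
            new AtomicReference<>(State.CONNECTED);

    /**
     * Marks the client as closing first, then sends the disconnect request.
     * sendDisconnect is a hypothetical hook standing in for the network call.
     */
    void close(Runnable sendDisconnect) {
        if (!state.compareAndSet(State.CONNECTED, State.CLOSING)) {
            return; // already closing or closed
        }
        try {
            sendDisconnect.run();
        } finally {
            state.set(State.CLOSED);
        }
    }

    /**
     * Called by a (hypothetical) send/event thread when the socket drops.
     * Returns true only if the drop was unexpected and worth logging as an error.
     */
    boolean isUnexpectedDisconnect() {
        return state.get() == State.CONNECTED;
    }

    State state() {
        return state.get();
    }
}
```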




[jira] Resolved: (ZOOKEEPER-90) invoke WhenOwnerListener.whenNotOwner() when the ZK session expires and the znode is the leader

2008-07-24 Thread james strachan (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-90?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

james strachan resolved ZOOKEEPER-90.
-

Resolution: Fixed

this is now fixed in [this 
patch|http://svn.apache.org/viewvc?rev=679341&view=rev] to ZOOKEEPER-78

 invoke WhenOwnerListener.whenNotOwner() when the ZK session expires and the 
 znode is the leader
 ---

 Key: ZOOKEEPER-90
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-90
 Project: Zookeeper
  Issue Type: Sub-task
Reporter: james strachan






[jira] Created: (ZOOKEEPER-91) provide an option for the WriteLock to also watch the locks own znode, so that if someone else deletes it then it is equivalent to calling WriteLock.unlock()

2008-07-24 Thread james strachan (JIRA)
provide an option for the WriteLock to also watch the locks own znode, so that 
if someone else deletes it then it is equivalent to calling WriteLock.unlock()
-

 Key: ZOOKEEPER-91
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-91
 Project: Zookeeper
  Issue Type: Sub-task
  Components: java client
Reporter: james strachan


Most clients probably won't need this, but it could be a handy system management 
feature to allow the WriteLock to watch its own znode so that if someone else 
deletes it, it then relinquishes the lock and tries to get it back again.




[jira] Updated: (ZOOKEEPER-88) implement java.util.concurrent.locks.Lock

2008-07-24 Thread james strachan (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-88?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

james strachan updated ZOOKEEPER-88:


Status: Patch Available  (was: Open)

I've just submitted an [initial patch implementing 
this|http://svn.apache.org/viewvc?rev=679435&view=rev] which could use some 
more tests and code review

 implement java.util.concurrent.locks.Lock
 -

 Key: ZOOKEEPER-88
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-88
 Project: Zookeeper
  Issue Type: Sub-task
  Components: java client
Reporter: james strachan

 we should implement the 
 [java.util.concurrent.locks.Lock|http://java.sun.com/j2se/1.5.0/docs/api/java/util/concurrent/locks/Lock.html]
  to make it easier for folks to reuse the Lock and to help make the API be 
 more natural to end users




[jira] Commented: (ZOOKEEPER-74) Cleaning/restructuring up Zookeeper server code

2008-07-24 Thread james strachan (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-74?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12616518#action_12616518
 ] 

james strachan commented on ZOOKEEPER-74:
-

It's a very minor thing, but the contributing guide says to use Sun's coding 
conventions...

http://wiki.apache.org/hadoop/ZooKeeper/HowToContribute

yet lots of the code tends to litter fields in classes in between methods; it's 
sometimes a bit hard looking at the source to grok what fields are owned by 
what object. I prefer the Sun standards where all the fields are at the top, as 
most java folks and apache java projects do. 

Good IDEs can kinda work around this, though, so I tend to rely on 
the outline view in my IDE rather than the source to grok what state a class 
has :)

 Cleaning/restructuring up Zookeeper server code
 ---

 Key: ZOOKEEPER-74
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-74
 Project: Zookeeper
  Issue Type: Improvement
  Components: server
Reporter: Mahadev konar
Assignee: Mahadev konar
 Fix For: 3.0.0


 I have been thinking about this for a while and find that the zookeeper server 
 code needs some cleaning up. The server code is a little tricky/confusing to 
 read sometimes given that there is no clarity on ownership of objects. I will 
 put down a proposal for restructuring/cleaning the code up with javadocs so 
 that the code is easier to understand and develop on. Comments on what you 
 find confusing are welcome on this jira. 




[jira] Commented: (ZOOKEEPER-83) Switch to using maven to build ZooKeeper

2008-07-23 Thread james strachan (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-83?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12615954#action_12615954
 ] 

james strachan commented on ZOOKEEPER-83:
-

Note it's pretty trivial to maintain an Ant build as well as a Maven build if 
folks really have an aversion to Maven. There's really no reason at all to 
disallow a maven build from being created; they can both happily coexist - and 
it's also a pretty trivial bit of work - not some huge bit of R&D that's gonna 
slow down development on other things. Also note that a non committer has 
already contributed the patch for you - so no more work is required 
other than committing it :)

It's worth remembering that pretty much all popular Java software at Apache is 
now released into a Maven repository...
http://people.apache.org/repo/m2-ibiblio-rsync-repository/org/apache/

so if folks stick with Ant and refuse to allow a maven build to coexist with 
the Ant build, someone should volunteer to figure out how to hack the Ant build 
to release ZooKeeper into the Apache maven repository - otherwise it's pretty 
hard for folks who do use maven to reuse your code (and see from the repo how 
many Apache projects we're talking about not being able to easily reuse 
ZooKeeper).

i.e. as part of the move to the ASF and being a good Apache citizen, I'd 
recommend hugely that as a minimum ZooKeeper releases its jars into the Apache 
Maven repository like most other projects do.

The easiest way to do this is just to reuse a Maven build to do releases (there 
doesn't yet seem to be anything in the Ant build to do actual signed releases 
or deploy builds anywhere) - and let folks who prefer Ant stick to that for 
day to day development.

The easier it is to reuse a project, the more likely it'll get used and the 
more likely you'll get contributions; at least that's my experience at Apache. 
That doesn't mean you have to force your Ant-loving developers to switch build 
tools!

 Switch to using maven to build ZooKeeper
 

 Key: ZOOKEEPER-83
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-83
 Project: Zookeeper
  Issue Type: Improvement
  Components: build
Reporter: Hiram Chirino
Assignee: Hiram Chirino
 Attachments: zookeeper-mavened.tgz


 Maven is a great tool for building java projects at the ASF.  It helps 
 standardize the build a bit since it's convention oriented.
 Its dependency auto downloading would remove the need to store the 
 dependencies in svn, and it will handle many of the suggested ASF policies 
 like gpg signing of the releases and such.
 The ZooKeeper build is almost vanilla except for the jute compiler bits.  
 Things that would need to change are:
  * re-organize the source tree a little so that it uses the maven directory 
 conventions
  * separate the jute bits out into separate modules so that a maven plugin 
 can be with it
  




mailing lists archived on nabble.com?

2008-07-23 Thread James Strachan
Many apache projects including Hadoop register with nabble to host
online forums & great online archives of the mailing lists...
http://www.nabble.com/Hadoop-f17066.html
Currently there's hadoop-core, hbase and lucene on there.

I often refer to mailing list posts by nabble link; they're really
handy. Plus end users often prefer the forum style nabble approach to
getting every single email sent to a mailing list.

Does a committer on zookeeper fancy registering the ZK mailing lists
too (as a child of the Hadoop list)? I'd do it myself but then I'd be
the owner of the forum which doesn't feel right - a committer should
probably do it.

It's a pretty quick process, click here
http://n2.nabble.com/more/MailingListRequest.jtp

and fill out the details. The hardest part is knowing the mailing list
software which AFAIK is ezmlm :)

-- 
James
---
http://macstrac.blogspot.com/

Open Source Integration
http://open.iona.com


Re: things lock up when the client reconnects?

2008-07-23 Thread James Strachan
BTW one other observation; when I use 3 clients in the same JVM (i.e.
3 separate instances of ZooKeeper to try to simulate a set of different
processes) I find that each client receives an initial WatchEvent on
startup; then from that point on, only the first 2 clients receive
further watch events for the connection starting/stopping, despite me
closing the server down, waiting a while, restarting the server then
stopping it again etc.

I'm wondering if this is related to why the 3rd client seems to kinda
lock up; that it's losing connection watch events. There's nothing
hard coded somewhere that only allows 2 ZooKeeper clients per JVM or
anything is there? :)

I'm gonna have a look around and see if there's any nasty static
variables around or something... We could maybe do with some more
tests for multiple clients with failover etc.

Anyone else seen something like this?

2008/7/23 James Strachan [EMAIL PROTECTED]:
 2008/7/22 Flavio Junqueira [EMAIL PROTECTED]:
 James, I'd like to clarify what exactly is the issue you're looking at. If
 you provide a list of ZooKeeper servers, then a client will try to reconnect
 to another ZooKeeper server upon a disconnection. Reconnecting to another
 server does not guarantee maintaining the same session, though. So, are you
 trying to guarantee that the session is still the same upon a reconnection?
 If so, I don't think you can do it by just changing the client, since the
 servers might have expired the old session.

 I'm trying to test the WriteLock implementation in the case where the
 server dies and the client reconnects to another server.
 In the test case I'm just running one server, killing it, restarting
 it and trying to get the client to reconnect.

 The test case is WriteLockTest in this patch...
 https://issues.apache.org/jira/browse/ZOOKEEPER-78

 (unfortunately it's not been committed yet so I can't easily point you
 at the code). It's very easy to run the test with different numbers of
 clients and see lockups at various places.

 The bizarre thing I've seen is that things do reconnect mostly fine
 (apart from the SessionExpiredException issue in one of the clients)
 https://issues.apache.org/jira/browse/ZOOKEEPER-84

 but a lockup often happens when trying to close down the ZooKeeper instance.

 When running the test case with 3 independent clients and one server,
 I tend to see the last client having a session expired and it's often
 the one that locks up; but when running the test with more clients I
 see more lockups elsewhere.

 I just wondered if folks had seen similar lockups when you try
 restarting ZK servers?

 (I'm testing on OS X; this lockup could be timing related maybe).







[jira] Commented: (ZOOKEEPER-84) provide a mechanism to reconnect a ZooKeeper if a client receives a SessionExpiredException

2008-07-23 Thread james strachan (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-84?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12616065#action_12616065
 ] 

james strachan commented on ZOOKEEPER-84:
-

I hear you :)

So an Elect Leader or Write Lock protocol has to deal with expired sessions and 
create new sessions; at some point someone has to recreate something. You can 
pass the buck and say we're not gonna allow the ZooKeeper to reconnect. Then 
say we're not allowed to have the WriteLock reconnect, then the next and next 
layer of the onion. But eventually there's gonna be something somewhere that 
recreates a session :)

For now I'll work on the assumption we're gonna have to have an object which is 
a wrapper around a ZooKeeper so that it can handle reconnections by just 
discarding one ZooKeeper instance and creating another. This object could be 
shared across Protocols (we might wanna reuse one connection with ZK to make 
multiple locks for example).
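
Roughly, the wrapper object described above could look like the sketch below. 
It is not the ZooKeeperFacade from the ZOOKEEPER-78 patch: ReconnectingFacade 
is a hypothetical name, the type parameter C stands in for a client type such 
as ZooKeeper, and the Supplier-based factory is an assumption made so the 
example needs nothing beyond the JDK.

```java
import java.util.function.Supplier;

/** Hypothetical wrapper; C stands in for a client type such as ZooKeeper. */
final class ReconnectingFacade<C extends AutoCloseable> {
    private final Supplier<C> factory;
    private C current;

    ReconnectingFacade(Supplier<C> factory) {
        this.factory = factory;
        this.current = factory.get();
    }

    /** Returns the client currently in use; shared by all protocols. */
    synchronized C client() {
        return current;
    }

    /** Discards the old client (best effort) and creates a fresh one. */
    synchronized C reconnectWithNewSession() {
        try {
            current.close();
        } catch (Exception ignored) {
            // The old session is already gone; nothing useful to do here.
        }
        current = factory.get();
        return current;
    }
}
```

Because protocols ask the facade for the client each time rather than holding 
their own reference, several locks can share one connection and all pick up 
the replacement after a session expires.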


 provide a mechanism to reconnect a ZooKeeper if a client receives a 
 SessionExpiredException
 ---

 Key: ZOOKEEPER-84
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-84
 Project: Zookeeper
  Issue Type: Improvement
  Components: java client
Reporter: james strachan
Assignee: james strachan
 Attachments: reconnect_patch.patch


 am about to attach a patch which adds a reconnect() method to easily 
 re-establish a connection if a session expires - along with a toString() 
 implementation for easier debugging




[jira] Commented: (ZOOKEEPER-84) provide a mechanism to reconnect a ZooKeeper if a client receives a SessionExpiredException

2008-07-23 Thread james strachan (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-84?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12616114#action_12616114
 ] 

james strachan commented on ZOOKEEPER-84:
-

You can mark this issue as RESOLVED/WILL_NOT_FIX if you like now - I've 
implemented a ZooKeeperFacade to wrap up the reconnectWithNewSession() logic 
for ZOOKEEPER-78

 provide a mechanism to reconnect a ZooKeeper if a client receives a 
 SessionExpiredException
 ---

 Key: ZOOKEEPER-84
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-84
 Project: Zookeeper
  Issue Type: Improvement
  Components: java client
Reporter: james strachan
Assignee: james strachan
 Attachments: reconnect_patch.patch


 am about to attach a patch which adds a reconnect() method to easily 
 re-establish a connection if a session expires - along with a toString() 
 implementation for easier debugging




Re: when should a SessionExpiredException occur?

2008-07-23 Thread James Strachan
2008/7/23 James Strachan [EMAIL PROTECTED]:
 2008/7/23 Benjamin Reed [EMAIL PROTECTED]:
 SessionExpiredExceptions should be extremely rare. Basically they should only
 happen if a machine goes down (of course that would mean no exception would
 actually get generated since the client is dead :) or a network partition
 occurs.

 Having said that we seem to have a bug that cause SessionExpiredExceptions
 when nothing bad has happened. The bug must be in the heart beat code (we do
 them automatically, so the client shouldn't have to worry about it). If you
 can reproduce it well, it would greatly help to track down the bug! Can you
 send me the code to reproduce the problem?

 It's the test case WriteLockTest in the patch for ZOOKEEPER-78 which is
 currently dependent on the ZOOKEEPER-84 patch as well (though given
 your recent comment I'm gonna refactor the code to not require a
 ZooKeeper change :)

 I'll ping the list when I've refactored the test case to not require
 the ZOOKEEPER-84 change.

I've just updated the patch on ZOOKEEPER-78 to avoid the dependency on
ZOOKEEPER-84. It now uses a ZooKeeperFacade class which wraps up the
creation of the ZooKeeper - and recreation of it if a
SessionExpiredException is received.

The test case currently hangs there...

[junit] main prio=5 tid=0x01001710 nid=0xb0801000 in Object.wait() [0xb07ff000..0xb0800148]
[junit] at java.lang.Object.wait(Native Method)
[junit] - waiting on 0x096105e0 (a org.apache.zookeeper.ClientCnxn$Packet)
[junit] at java.lang.Object.wait(Object.java:474)
[junit] at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:822)
[junit] - locked 0x096105e0 (a org.apache.zookeeper.ClientCnxn$Packet)
[junit] at org.apache.zookeeper.ZooKeeper.close(ZooKeeper.java:329)
[junit] - locked 0x0bd54108 (a org.apache.zookeeper.ZooKeeper)
[junit] at org.apache.zookeeper.protocols.ZooKeeperFacade.close(ZooKeeperFacade.java:99)
[junit] at org.apache.zookeeper.protocols.WriteLockTest.tearDown(WriteLockTest.java:146)
[junit] at junit.framework.TestCase.runBare(TestCase.java:140)
[junit] at junit.framework.TestResult$1.protect(TestResult.java:110)
[junit] at junit.framework.TestResult.runProtected(TestResult.java:128)
[junit] at junit.framework.TestResult.run(TestResult.java:113)
[junit] at junit.framework.TestCase.run(TestCase.java:124)
[junit] at junit.framework.TestSuite.runTest(TestSuite.java:232)
[junit] at junit.framework.TestSuite.run(TestSuite.java:227)
[junit] at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:81)
[junit] at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:36)
[junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:421)
[junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:912)
[junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:766)


basically the 3rd ZooKeeper client cannot close down; it just hangs in
the close() method.

(BTW it might be nice to avoid the close() method waiting forever - it
might as well wait, say, 10 seconds then just close anyway).

Though now I've refactored the code to avoid the patch on ZooKeeper to
deal with reconnecting when a SessionExpiredException occurs, I don't
seem to get any session expired exceptions :). I'm starting to wonder
if it's maybe related to old persistent data on disk causing the
exception?

I still get the strange lack of Watch Events on the 3rd client though
and the hang on closing (if
WriteLockTest.workAroundClosingLastZNodeFails is set to false - I've
hacked the test to pass by default).



do the test cases work for anyone else?

2008-07-23 Thread James Strachan
I've always had some tests failing on most boxes I try; I wasn't sure
if everyone else got those or if they do work on some platforms?

On OS X I get these failures
[junit] Test org.apache.zookeeper.test.AsyncTest FAILED
[junit] Test org.apache.zookeeper.test.WatcherFuncTest FAILED

On a linux box (an EC2 box) I get these failures

[junit] Test org.apache.zookeeper.test.ClientTest FAILED
[junit] Test org.apache.zookeeper.test.QuorumTest FAILED

Maybe they all work on windows? :)

I tried adding an explicit forkmode=perTest to the junit and I
still get the same results. Can anyone else get the tests to work on
linux or OS X?



[jira] Updated: (ZOOKEEPER-78) added a high level protocol/feature - for easy Leader Election or exclusive Write Lock creation

2008-07-23 Thread james strachan (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-78?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

james strachan updated ZOOKEEPER-78:


Assignee: (was: james strachan)
  Status: Patch Available  (was: Open)

Patch is now attached

 added a high level protocol/feature - for easy Leader Election or exclusive 
 Write Lock creation
 ---

 Key: ZOOKEEPER-78
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-78
 Project: Zookeeper
  Issue Type: New Feature
  Components: java client
Affects Versions: 3.0.0
Reporter: james strachan
 Attachments: patch_with_including_Benjamin's_fix.patch, 
 using_zookeeper_facade.patch


 Here's a patch which adds a little WriteLock helper class for performing 
 leader elections or creating exclusive locks in some directory znode. Note 
 it's an early cut; I'm sure we can improve it over time. The aim is to avoid 
 folks having to use the low level ZK stuff but provide a simpler high level 
 abstraction.




[jira] Commented: (ZOOKEEPER-63) Race condition in client close() operation

2008-07-23 Thread james strachan (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-63?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12616136#action_12616136
 ] 

james strachan commented on ZOOKEEPER-63:
-

I wonder if I've seen this too - I can reliably get a hung test when trying to 
close a client (though the server is still up at the point of the hang).

I'm thinking the close() method should not wait() forever on the disconnect 
packet, just a closeTimeout length - say a few seconds. After all, blocking and 
forcing a reconnect just to redeliver the disconnect packet seems a bit silly - 
when the server has to deal with clients which just have their sockets fail 
anyway :)
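
Something along these lines is what I have in mind for a bounded close(). This 
is only a sketch, not the ZooKeeper client code or the attached patch: 
BoundedClose, onDisconnectAcked() and the closeTimeoutMillis parameter are 
hypothetical names, and a CountDownLatch stands in for waiting on the 
disconnect packet.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Sketch of a bounded close(); not the ZooKeeper client's actual code. */
final class BoundedClose {
    private final CountDownLatch disconnectAcked = new CountDownLatch(1);
    private volatile boolean closed;

    /** Called by the (hypothetical) I/O thread when the server acknowledges. */
    void onDisconnectAcked() {
        disconnectAcked.countDown();
    }

    /**
     * Waits at most closeTimeoutMillis for the acknowledgement, then releases
     * local resources regardless. Returns true if the close was clean.
     */
    boolean close(long closeTimeoutMillis) throws InterruptedException {
        boolean clean = disconnectAcked.await(closeTimeoutMillis, TimeUnit.MILLISECONDS);
        closed = true; // local teardown happens either way
        return clean;
    }

    boolean isClosed() {
        return closed;
    }
}
```

Returning a boolean rather than throwing lets the caller log an unclean close 
without treating it as a failure, since the server will time the session out 
anyway.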

 Race condition in client close() operation
 --

 Key: ZOOKEEPER-63
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-63
 Project: Zookeeper
  Issue Type: Bug
  Components: java client
Reporter: Patrick Hunt
Assignee: Benjamin Reed

 There is a race condition in the java close operation on ZooKeeper.java.
 Client is sending a disconnect request to the server. Server will close any 
 open connections with the client when it receives this. If the client has not 
 yet shut down its subthreads (event/send threads for example) these threads 
 may consider the condition an error. We see this a lot in the tests where the 
 clients output error logs because they are unaware that a disconnection has 
 been requested by the client.
 Ben mentioned: perhaps we just have to change state to closed (on client) 
 before sending disconnect request.




Re: do the test cases work for anyone else?

2008-07-23 Thread James Strachan
FWIW I've run the tests a few times; I think all 4 of these tests have
timing failures in them. I've seen all of them fail on OS X at some
point. Sometimes only 2 will fail. On Linux I've seen just ClientTest
fail.


2008/7/23 Patrick Hunt [EMAIL PROTECTED]:
 I'm on ubuntu (hardy heron) and they work. Our CI machine has intermittent
 failures (solaris x86):
 http://hudson.zones.apache.org/hudson/view/ZooKeeper/job/ZooKeeper-trunk/

 there's some timing issue, what you're seeing is probably related to:
 https://issues.apache.org/jira/browse/ZOOKEEPER-61

 Frankly tests and docs are both areas that ZooKeeper could use _a lot_ of
 care and feeding. Tests in particular could use some refactoring and a
 better implementation for launching/testing/stopping client/server tests.

 As you're able to reproduce the issue reliably would you like to take on 61?
 Feel free to assign to yourself if so.

As a newbie on the project it's hard enough grokking ZK itself and
attempting to contribute patches, but fixing bad ZK test cases is
even harder :) I was hoping the folks who know ZK really well could fix
the tests they've written :). But I'll take a look and see if there's
anything obvious I can do to help with my limited knowledge of the
history of the code and internals.

How about we raise a JIRA for all tests that fail?

-- 
James
---
http://macstrac.blogspot.com/

Open Source Integration
http://open.iona.com


[jira] Commented: (ZOOKEEPER-84) provide a mechanism to reconnect a ZooKeeper if a client receives a SessionExpiredException

2008-07-23 Thread james strachan (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-84?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12616144#action_12616144
 ] 

james strachan commented on ZOOKEEPER-84:
-

If ZOOKEEPER-78 ever gets committed (hint, hint :) we can just refer folks to 
the ZooKeeperFacade whenever they hit the SessionExpiredException.

 provide a mechanism to reconnect a ZooKeeper if a client receives a 
 SessionExpiredException
 ---

 Key: ZOOKEEPER-84
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-84
 Project: Zookeeper
  Issue Type: Improvement
  Components: java client
Reporter: james strachan
Assignee: Benjamin Reed
 Attachments: reconnect_patch.patch


 am about to attach a patch which adds a reconnect() method to easily 
 re-establish a connection if a session expires - along with a toString() 
 implementation for easier debugging




[jira] Created: (ZOOKEEPER-85) register the ZooKeeper mailing lists with nabble.com

2008-07-23 Thread james strachan (JIRA)
register the ZooKeeper mailing lists with nabble.com


 Key: ZOOKEEPER-85
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-85
 Project: Zookeeper
  Issue Type: Task
Reporter: james strachan
Assignee: Patrick Hunt


Many Apache projects including Hadoop register with nabble to host
online forums and great online archives of the mailing lists...
http://www.nabble.com/Hadoop-f17066.html
Currently there's hadoop-core, hbase and lucene on there.

I often refer to mailing list posts by nabble link; they're really
handy. Plus end users often prefer the forum style nabble approach to
getting every single email sent to a mailing list.

Does a committer on zookeeper fancy registering the ZK mailing lists
too (as a child of the Hadoop list)? I'd do it myself but then I'd be
the owner of the forum which doesn't feel right - a committer should
probably do it.

It's a pretty quick process; click here
http://n2.nabble.com/more/MailingListRequest.jtp

and fill out the details. The hardest part is knowing the mailing list
software which AFAIK is ezmlm :)




[jira] Updated: (ZOOKEEPER-63) Race condition in client close() operation

2008-07-23 Thread james strachan (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-63?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

james strachan updated ZOOKEEPER-63:


Attachment: patch_ZOOKEEPER-63.patch

This patch avoids the close() method blocking forever. It waits just once, up 
to the closeTimeout so if the socket is blocked or some other strangeness is 
going on, the calling thread will only wait up to the timeout (which defaults 
to 2 seconds).
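
A rough sketch of the idea, for illustration only. This is not the attached patch: BoundedClose, onDisconnectAcknowledged() and the latch are made-up names for the example, and actually sending the disconnect packet is left as a comment. The sketch also flips the state to closed first, as Ben suggested in the issue description.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the client; not the real ZooKeeper class.
class BoundedClose {
    private final CountDownLatch disconnectAcked = new CountDownLatch(1);
    private final long closeTimeoutMillis;
    private volatile boolean closed;

    BoundedClose(long closeTimeoutMillis) {
        this.closeTimeoutMillis = closeTimeoutMillis;
    }

    // Called by the (imaginary) send thread once the server has answered.
    void onDisconnectAcknowledged() {
        disconnectAcked.countDown();
    }

    boolean isClosed() {
        return closed;
    }

    // Waits once, for at most closeTimeoutMillis. Returns true if the
    // disconnect was acknowledged in time, false if we gave up waiting.
    boolean close() throws InterruptedException {
        closed = true; // state changes before the disconnect is sent
        // sendDisconnectRequest() would go here in a real client
        return disconnectAcked.await(closeTimeoutMillis, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        BoundedClose client = new BoundedClose(2000); // 2 seconds, the default mentioned above
        client.onDisconnectAcknowledged();
        System.out.println("acknowledged in time: " + client.close());
    }
}
```

The caller gets a boolean back rather than an exception, so a slow or dead socket simply means close() returns after the timeout.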

BTW this patch fixes the hang I was having in the test case for ZOOKEEPER-78

 Race condition in client close() operation
 --

 Key: ZOOKEEPER-63
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-63
 Project: Zookeeper
  Issue Type: Bug
  Components: java client
Reporter: Patrick Hunt
Assignee: Benjamin Reed
 Attachments: patch_ZOOKEEPER-63.patch


 There is a race condition in the Java close operation in ZooKeeper.java.
 The client sends a disconnect request to the server. The server will close any 
 open connections with the client when it receives this. If the client has not 
 yet shut down its subthreads (event/send threads for example) these threads 
 may consider the condition an error. We see this a lot in the tests where the 
 clients output error logs because they are unaware that a disconnection has 
 been requested by the client.
 Ben mentioned: perhaps we just have to change state to closed (on client) 
 before sending disconnect request.




[jira] Updated: (ZOOKEEPER-63) Race condition in client close() operation

2008-07-23 Thread james strachan (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-63?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

james strachan updated ZOOKEEPER-63:


Status: Patch Available  (was: Open)

about to attach a patch

 Race condition in client close() operation
 --

 Key: ZOOKEEPER-63
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-63
 Project: Zookeeper
  Issue Type: Bug
  Components: java client
Reporter: Patrick Hunt
Assignee: Benjamin Reed
 Attachments: patch_ZOOKEEPER-63.patch


 There is a race condition in the Java close operation in ZooKeeper.java.
 The client sends a disconnect request to the server. The server will close any 
 open connections with the client when it receives this. If the client has not 
 yet shut down its subthreads (event/send threads for example) these threads 
 may consider the condition an error. We see this a lot in the tests where the 
 clients output error logs because they are unaware that a disconnection has 
 been requested by the client.
 Ben mentioned: perhaps we just have to change state to closed (on client) 
 before sending disconnect request.




Re: assigning JIRAs to non committers

2008-07-23 Thread James Strachan
Cool, thanks for the heads up! You live and learn :) It's funny how
totally different all the various Apache projects are and how they get
things done.

My bad for not reading the contributing section of the wiki yet :)

2008/7/23 Patrick Hunt [EMAIL PROTECTED]:
 James Strachan wrote:

 Just an idle observation as I'd never seen this workflow before on
 JIRA so thought I'd ask :)

 I'm new to JIRA as well...

 I've been watching some of the recent JIRA activity with interest.
 I've seen a few JIRAs arrive, someone submits a test case who's not a
 committer, then the issue gets assigned to the person who submitted
 the patch. In some cases; when there may be many patches to assign
 over time, I can understand it (e.g. ZOOKEEPER-78 could take a zillion
 iterations before the feature is complete) - but in general if one
 JIRA gets one patch from a non-committer, should the JIRA be left
 unassigned - or assigned to a committer to review and apply or
 reject-with-reason the patch?

 I believe the workflow is that the jira is assigned to the person resolving
 the issue (i.e. submitting the patch). You/Hiram have been added as
 contributors to jira, this means that jiras can be assigned to you. We
 typically add people to the contributor list as soon as they submit a patch.

 After that point you do the back/forth in the comments trying to get
 everyone to agree to a resolution. If this is a patch you then change the
 status to patch available and ask for review/voting, after which if you
 get a +1 it's then up to a committer to commit to svn.

 full details here:
 http://wiki.apache.org/hadoop/ZooKeeper/HowToContribute

 i.e. let's say I raise a JIRA and attach a patch; once we're at that
 stage I can't actually do anything else, not being a committer - other
 than add another version of the patch :) So am not sure if it's worth
 assigning the issue to me. I guess the person who raised the issue and
 submitted the patch can always mark it as unassigned :)

 It's assigned to the person who resolved the issue. If accepted it's up to
 the committers to get it into svn, but you (the resolver) are still
 responsible. This information is also important for reporting purposes.

 No biggie I just thought I'd ask if this was an intentional way you
 guys had worked together in the past?

 This is generally how Hadoop core/hbase do things.

 Patrick






Re: Website

2008-07-23 Thread James Strachan
2008/7/23 Doug Cutting [EMAIL PROTECTED]:
 James Strachan wrote:

 Tools like wikis are personal things; and folks tend to prefer to use
 the tool they know.

 That's a key point.

 To make a switch you'd need:
  1. Someone familiar with Confluence to lead the transition, convert the
 existing website and wiki content, set up static export etc.  Are you
 volunteering?

I would yes, but only if 2) gets approval.

  2. Buy in from Zookeeper's primary contributors, who will end up writing
 and maintaining the documentation (Pat, Ben, etc.).  I don't really count,
 since I'm mostly a kibitzer here.

 Also, with Confluence export, how does one deal with versioning?  A
 convenience of keeping documentation in subversion is that it gets versioned
 with releases.  By maintaining the trunk documentation to match the trunk
 implementation, we automatically get documentation that matches each
 version, but we can still maintain the documentation in release branches.  I
 don't see how this would not add overhead with Confluence exports.  If
 Confluence always represented trunk, and we exported at release branch
 points, then it would be hard to patch branched documentation.  Maintaining
 multiple branches in Confluence would add management overhead, since these
 would need to be synchronized with subversion branching, tagging, etc.  How
 have other projects dealt with this issue?

BTW MoinMoin has the same issue; when documentation is in the wiki you
need to grab a snapshot of it to include in releases (or add it to
svn) to support versioned documentation.

What we've done in the past is copy the static HTML from the wiki with
releases; or in some projects we turn the HTML from Confluence into a
proper manual in PDF or HTML format. e.g.

if you download 1.4.0 of Camel..
http://activemq.apache.org/camel/camel-140-release.html

and look in the docs directory; you'll see a manual in PDF and HTML
format. That's generated from the wiki whenever there is a release, from
these pages
http://activemq.apache.org/camel/book.html
which pull various wiki pages together to form a user manual,
and which are then included together in this page
http://activemq.apache.org/camel/book-in-one-page.html


Maybe moving away from Forrest is a step too far right now; but it's
certainly worth thinking about whether the wiki content is gonna be in
MoinMoin or Confluence. Only if you choose Confluence can you
consider generating a user manual or the static website from it
(neither, AFAIK, is possible with MoinMoin).


Incidentally, a totally different thought: what's gonna be the split
between what's in the static website (e.g. Forrest) versus stuff that's in
the wiki versus documentation that goes inside each release? It's often
a kinda slippery slope figuring out which bit does what, and it's a PITA
converting content into different formats to move it between them; so while
no tool is perfect, I kinda like that with Confluence there's just one
place to put docs and you can then slice and dice as you see fit (and
make multiple spaces if you want and share content across spaces) to
deal with different version issues etc.



[jira] Updated: (ZOOKEEPER-86) intermittent test failure of org.apache.zookeeper.test.AsyncTest

2008-07-23 Thread james strachan (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-86?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

james strachan updated ZOOKEEPER-86:


Attachment: patch_for_ZOOKEEPER-86.patch

this patch seems to fix the test case on OS X at least; I've split the test 
case into 2 parts (so they are forked separately) and added more delays before 
trying to rebind to the server socket which seems to fix the error

 intermittent test failure of org.apache.zookeeper.test.AsyncTest
 

 Key: ZOOKEEPER-86
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-86
 Project: Zookeeper
  Issue Type: Bug
  Components: tests
 Environment: OS X and linux. It sometimes passes; but mostly seems to 
 fail on OS X each time
Reporter: james strachan
 Attachments: patch_for_ZOOKEEPER-86.patch, 
 TEST-org.apache.zookeeper.test.AsyncTest.txt


 Will attach the test output in an attachment...




[jira] Commented: (ZOOKEEPER-83) Switch to using maven to build ZooKeeper

2008-07-22 Thread james strachan (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-83?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12615546#action_12615546
 ] 

james strachan commented on ZOOKEEPER-83:
-

I just took a look at the patch; basically the maven conventions are to put 
source in ${module-name}/src/main/java and tests in 
${module-name}/src/test/java and resources in src/main/resources etc.

Plus it looks like Hiram's split the project into multiple maven modules (e.g. 
so that the Java 6 JMX code is a separate module and the core of zookeeper 
can be used on Java 5 - which is a good thing IMHO - plus separating the jute 
stuff so it can be used at development time to generate code etc). It's also 
easy to generate an uber-jar if folks want one later on.

This patch looks good to me - assuming folks are happy to go the maven route 
(which many other apache projects do btw - it certainly makes it much easier 
for zookeeper to be reused by other maven projects).

If this patch gets applied I'll happily volunteer to refactor my 
recipes/protocols patch to create a zookeeper-protocols module to create a 
separate jar for higher level stuff

 Switch to using maven to build ZooKeeper
 

 Key: ZOOKEEPER-83
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-83
 Project: Zookeeper
  Issue Type: Improvement
  Components: build
Reporter: Hiram Chirino
Assignee: Hiram Chirino
 Attachments: zookeeper-mavened.tgz


 Maven is a great tool for building java projects at the ASF.  It helps 
 standardize the build a bit since it's convention oriented.
 Its dependency auto downloading would remove the need to store the 
 dependencies in svn, and it will handle many of the suggested ASF policies 
 like gpg signing of the releases and such.
 The ZooKeeper build is almost vanilla except for the jute compiler bits.  
 Things that would need to change are:
  * re-organize the source tree a little so that it uses the maven directory 
 conventions
  * separate the jute bits out into separate modules so that a maven plugin 
 can be used with it
  




[jira] Commented: (ZOOKEEPER-78) added a high level protocol/feature - for easy Leader Election or exclusive Write Lock creation

2008-07-22 Thread james strachan (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-78?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12615618#action_12615618
 ] 

james strachan commented on ZOOKEEPER-78:
-

Great catch Benjamin! I've a working patch using your algorithm; am using 
x-sessionId-sequenceNumber and it's working a treat (though it's a tad hard to 
force ZK to fail mid-create :). 
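
To illustrate why a name of the form x-sessionId-sequenceNumber helps: after a failure part way through a create, a client can look through the children for a node carrying its own session id before creating another one. The sketch below only shows the name handling; LockNodeNames and its methods are hypothetical, the node names are invented, and this reading of the algorithm is an assumption rather than a copy of the patch.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical helper for names of the form x-<sessionId>-<sequenceNumber>.
class LockNodeNames {
    static String prefixFor(long sessionId) {
        return "x-" + sessionId + "-";
    }

    // Finds a child that this session created earlier, if there is one.
    static Optional<String> findOwnNode(List<String> children, long sessionId) {
        String prefix = prefixFor(sessionId);
        return children.stream().filter(name -> name.startsWith(prefix)).findFirst();
    }

    // The sequence number is whatever follows the last dash.
    static long sequenceOf(String nodeName) {
        return Long.parseLong(nodeName.substring(nodeName.lastIndexOf('-') + 1));
    }

    public static void main(String[] args) {
        List<String> children = Arrays.asList("x-11-0000000004", "x-42-0000000007");
        String own = findOwnNode(children, 42L).orElseThrow(IllegalStateException::new);
        System.out.println(own + " has sequence " + sequenceOf(own));
    }
}
```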

Am working on some unit tests to try out the server stopping/starting which 
I'll attach shortly once they're working a bit better...

 added a high level protocol/feature - for easy Leader Election or exclusive 
 Write Lock creation
 ---

 Key: ZOOKEEPER-78
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-78
 Project: Zookeeper
  Issue Type: New Feature
  Components: java client
Affects Versions: 3.0.0
Reporter: james strachan
Assignee: james strachan
 Attachments: writeLock_protocol_version3.patch


 Here's a patch which adds a little WriteLock helper class for performing 
 leader elections or creating exclusive locks in some directory znode. Note 
 it's an early cut; am sure we can improve it over time. The aim is to avoid 
 folks having to use the low level ZK stuff but provide a simpler high level 
 abstraction.




[jira] Created: (ZOOKEEPER-84) provide a mechanism to reconnect a ZooKeeper if a client receives a SessionExpiredException

2008-07-22 Thread james strachan (JIRA)
provide a mechanism to reconnect a ZooKeeper if a client receives a 
SessionExpiredException
---

 Key: ZOOKEEPER-84
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-84
 Project: Zookeeper
  Issue Type: Improvement
  Components: java client
Reporter: james strachan


am about to attach a patch which adds a reconnect() method to easily 
re-establish a connection if a session expires - along with a toString() 
implementation for easier debugging




Re: auto-reconnection ZooKeeper proxy?

2008-07-22 Thread James Strachan
I've been experimenting with the WriteLock implementation to deal with
server failure; I've found that creating a reconnecting ZooKeeper proxy
is maybe too simplistic; instead I'm just making it easy to retry
operations (or arbitrary ZK code blocks) using a helper class
(currently called ProtocolSupport, but am open to suggestions for a
better name for a base class for higher level protocol
implementations).

Using the WriteLock as an example, it seems you often want the retry
logic to span a number of calls to ZooKeeper (e.g. check if a
znode exists and, if it doesn't, try to create it - retrying the whole
thing when ZK exceptions like connection loss occur).
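
A minimal sketch of what retrying a whole block could look like. It is not the ProtocolSupport code from the patch: RetrySketch, retryOperation() and the stand-in ConnectionLossException are assumptions for illustration, and the delay between attempts is an arbitrary choice.

```java
import java.util.concurrent.Callable;

class RetrySketch {
    // Stand-in for a ZooKeeper connection-loss exception (hypothetical).
    static class ConnectionLossException extends Exception {
    }

    // Runs the whole block again whenever it fails with a connection loss,
    // up to maxAttempts times, pausing a little longer after each failure.
    static <T> T retryOperation(Callable<T> operation, int maxAttempts, long delayMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        ConnectionLossException lastFailure = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return operation.call();
            } catch (ConnectionLossException e) {
                lastFailure = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(delayMillis * attempt);
                }
            }
        }
        throw lastFailure;
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // The block could hold several ZK calls (exists, then create, ...).
        String result = retryOperation(() -> {
            if (++calls[0] < 3) {
                throw new ConnectionLossException();
            }
            return "created";
        }, 5, 10L);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

Because the whole block is re-run, it has to be safe to repeat (an exists check followed by a create is, as long as "already exists" is handled).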

I'll submit the patch soon to ZOOKEEPER-78 including this...
https://issues.apache.org/jira/browse/ZOOKEEPER-78

One thing I have found is I've managed to get a
SessionExpiredException in my test case (not sure why though; I
thought ZooKeeper automatically kept sending keep alive pings?). I
just wondered what a client should do if that happens; I didn't see
any easy way to effectively disconnect and reconnect a ZooKeeper
client in this case.

I'm assuming that the SessionExpiredException is always gonna be
possible; so I've patched ZooKeeper to allow clients to handle a
SessionExpiredException and force a reconnection (to get a new
session).

So I've created a small patch to add a reconnect() method to ZooKeeper
which just closes and recreates the cnxn object...
https://issues.apache.org/jira/browse/ZOOKEEPER-84
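
Roughly, the shape of the change is the one below. This is only a sketch under assumptions: Cnxn here is a hypothetical stand-in for the client's connection object, and the real patch attached to ZOOKEEPER-84 may well differ.

```java
// Hypothetical sketch of "close and recreate the cnxn object".
class ReconnectSketch {
    // Stand-in for the client's connection object.
    static class Cnxn {
        final int generation;
        private boolean open = true;

        Cnxn(int generation) {
            this.generation = generation;
        }

        void close() {
            open = false;
        }

        boolean isOpen() {
            return open;
        }
    }

    private int generations = 0;
    private Cnxn cnxn = new Cnxn(++generations);

    synchronized Cnxn connection() {
        return cnxn;
    }

    // Throws away the old connection and starts a fresh one (new session).
    synchronized void reconnect() {
        cnxn.close();
        cnxn = new Cnxn(++generations);
    }

    @Override
    public synchronized String toString() {
        return "ReconnectSketch[generation=" + cnxn.generation + ", open=" + cnxn.isOpen() + "]";
    }

    public static void main(String[] args) {
        ReconnectSketch client = new ReconnectSketch();
        client.reconnect();
        System.out.println(client);
    }
}
```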

(I also added a toString() method for easier debugging when running
test cases with multiple clients in the same jvm).

There's maybe a less drastic way to force the re-connection of a
ZooKeeper client; but I figured trashing and recreating the cnxn
object at least is lowest risk and a simple patch :) and the code
should only be executed rarely so performance isn't such an issue.

Thoughts?

2008/7/18 James Strachan [EMAIL PROTECTED]:
 <background>
 I work on the ActiveMQ project which implements the JMS API - which is
 a kinda complex thing but it involves a number of objects
 (Connections, Sessions, Producers, Consumers). In some JMS providers
 it's the end user's responsibility to deal with detecting a connection
 failure (from any other kind of error) and then automatically
 recreating all the dependent objects.

 We added support for auto-reconnection which greatly simplifies the
 developer's life; it lets the JMS client automatically deal with any
 socket failures, reconnecting to a broker for you and re-establishing
 all of those in-flight operations (subscriptions, in progress sends
 and so forth).
 http://activemq.apache.org/how-can-i-support-auto-reconnection.html

 Having seen the value of wrapping up the auto-reconnection within a
 proxy, am thinking it's also got merits for ZK
 </background>


 As we start creating protocols/recipes that implement higher order
 features like locks, leader elections and so forth we could probably
 do with some kinda auto-reconnecting facade to ZooKeeper just to
 simplify the implementation code of protocols/recipes. It's a kinda
 complex area though and I'm sure different protocols will want
 different things; but even for something as simple as a lock - I can
 see the value in an auto-reconnecting proxy.

 e.g. there's already 5 different method calls in the current WriteLock
 implementation which all really need a custom try/catch around them to
 detect loss of the connection which then should be wrapped in a
 reconnect-retry logic.

 What to do about watches is interesting; for now the current
 behaviour seems fine (fire them all, forcing a re-watch), though in
 the future we could re-enable watches on the new server
 connection as an option.

 All I'm thinking about for now is a kinda ReconnectingZooKeeper which
 looks like a ZooKeeper object but which internally catches dead
 connections and then internally tries to reconnect to one of the ZK
 servers under the covers - retrying the current read/write operation
 until the ReconnectPolicy says to fail. e.g. some folks might wanna
 retry connecting forever; others for a certain amount of time or
 certain number of attempts etc.

 So something like...

 public class ReconnectingZooKeeper extends ZooKeeper {
   ...
   // for each method that reads/writes synchronously
   public Stat exists(String path) {...
     for (int count = 0; ; count++) {
       try {
         // really do the method call!
         return super.exists(path);
       } catch (ConnectionClosedException e) {
         // let any watches or listeners respond to the connection
         // loss first, before we retry
         fireAnyWatchesAndStuff();

         // give up once the policy says so; otherwise loop and retry
         if (!shouldRetry(count)) {
           throw e;
         }
       }
     }
   }
 }


 Any watches should fire when a connection is lost - and all writes
 should be replicated to the new server we connect to right? So I'm
 thinking, if we had a ReconnectingZooKeeper implementation, we could
 use

[jira] Updated: (ZOOKEEPER-84) provide a mechanism to reconnect a ZooKeeper if a client receives a SessionExpiredException

2008-07-22 Thread james strachan (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-84?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

james strachan updated ZOOKEEPER-84:


Status: Patch Available  (was: Open)

about to submit a patch - whoops forgot to add it :)

 provide a mechanism to reconnect a ZooKeeper if a client receives a 
 SessionExpiredException
 ---

 Key: ZOOKEEPER-84
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-84
 Project: Zookeeper
  Issue Type: Improvement
  Components: java client
Reporter: james strachan

 am about to attach a patch which adds a reconnect() method to easily 
 re-establish a connection if a session expires - along with a toString() 
 implementation for easier debugging




[jira] Updated: (ZOOKEEPER-84) provide a mechanism to reconnect a ZooKeeper if a client receives a SessionExpiredException

2008-07-22 Thread james strachan (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-84?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

james strachan updated ZOOKEEPER-84:


Attachment: reconnect_patch.patch

sorry I forgot to add the patch before :) here it is now, hopefully this will 
make more sense now :)

 provide a mechanism to reconnect a ZooKeeper if a client receives a 
 SessionExpiredException
 ---

 Key: ZOOKEEPER-84
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-84
 Project: Zookeeper
  Issue Type: Improvement
  Components: java client
Reporter: james strachan
 Attachments: reconnect_patch.patch


 am about to attach a patch which adds a reconnect() method to easily 
 re-establish a connection if a session expires - along with a toString() 
 implementation for easier debugging




Re: Recipe contrib -- was Re: [PATCH] a simple Leader Election or exclusive Write Lock protocol/policy

2008-07-21 Thread James Strachan
It should be pretty easy to link together the various recipes from
the site/wiki and where possible share the recipe documentation across
languages.

2008/7/18 Benjamin Reed [EMAIL PROTECTED]:
 Some initial implementations of a recipe may only be in C, so it would
 be nice to have a standard way of finding the recipe that wasn't
 dependent on the language that implements the recipe.

 ben

 James Strachan wrote:
 2008/7/17 Benjamin Reed [EMAIL PROTECTED]:

 Excellent proposal. The only thing I would add is that there should be
 an english description of the recipe in subversion. That way if someone
 wanted to do a compatible binding they can do it. If the recipe is on
 the wiki it would be hard to keep it in sync, so it is important that it
 is in subversion. My preference would be that the doc would be in the
 same contrib subdirectory as the source for ease of maintenance.


 Good idea. How about for Java recipes we include the documentation as
 HTML with the javadoc so we can link to it easily and so that the
 recipe is kept with the code and versioned nicely (so as the
 recipe/algorithm changes we version it with the source code etc)









javadoc for the Write Lock / Leader Election

2008-07-18 Thread James Strachan
The other thread was already quite big and covering a large range of
issues so thought I'd spin up a little separate thread :)

I've just updated the patch to include better javadoc which is linked
to embedded HTML documentation describing the protocol. The
documentation includes the pseudocode from the online ZooKeeper
presentation (that I used) and I've also included the text from
ZOOKEEPER-79 which I'm glad to say seems to match up perfectly with
the pseudocode I'd used :)
https://issues.apache.org/jira/browse/ZOOKEEPER-78

One thing confused me though; the last paragraph says...

This protocol guarantees that there is at any time only one node that
thinks it is the leader. But it does not disseminate information about
who is the leader. If you want everyone to know who is the leader, you
can have an additional Znode whose value is the name of the current
leader (or some identifying information on how to contact the leader,
etc.). Note that this cannot be done atomically, so by the time other
nodes find out who the leader is, the leadership may already have
passed on to a different node.

In the current implementation, WriteLock - each znode can know,
whenever it attempts to acquire the lock - if it didn't get the lock,
who the owner is. I guess this is only true momentarily, in the split
second that the acquire() method is called (i.e. the exact moment the
getChildren() is called and the lowest value is found). Or is there
some other subtle issue I'm not seeing?

I guess we could add a method to WriteLock - if folks wanted - a kinda
queryLeader() method where we just use the same algorithm to find who
the current leader is. Though am not sure how useful knowing who the
leader is would be :). I guess writing the leader's
identity to some canonical znode that any other znode can read
whenever it wishes is less risky and maybe simpler.
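
For what it's worth, a queryLeader() along those lines would boil down to picking the child with the lowest sequence number. The sketch below assumes child names end in -<sequenceNumber>; LeaderQuery and its methods are hypothetical, and whatever it returns is only a snapshot of the child list the caller passed in.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical helper: the child with the lowest sequence number is the owner.
class LeaderQuery {
    static long sequenceOf(String nodeName) {
        return Long.parseLong(nodeName.substring(nodeName.lastIndexOf('-') + 1));
    }

    // children would come from a getChildren() call on the lock directory.
    static Optional<String> currentLeader(List<String> children) {
        return children.stream().min(Comparator.comparingLong(LeaderQuery::sequenceOf));
    }

    public static void main(String[] args) {
        List<String> children = Arrays.asList("x-42-0000000007", "x-11-0000000004");
        System.out.println("leader as of this snapshot: " + currentLeader(children).orElse("nobody"));
    }
}
```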



Re: javadoc for the Write Lock / Leader Election

2008-07-18 Thread James Strachan
Thanks for the clarification. I think it makes lots of sense for the
leader to write to some canonical place to advertise itself, if others
are interested in knowing whether it is the leader.

2008/7/18 Flavio Junqueira [EMAIL PROTECTED]:
 Hi James, the fact that the client's node has another node n ahead of it
 in the sequence order doesn't mean that the owner of n is aware that it is
 the lock holder or the leader. This is because operations are propagated
 asynchronously. Also, a getChildren() doesn't guarantee that you have the
 latest list, and it is possible that another node is at the head of the
 ordered list of nodes at the moment you read the response of getChildren().
 This is because getChildren() will return the local state of one server,
 while the ensemble of servers is processing or has even already decided
 upon a change to the list.

 In the way I understand Jacob's suggestion, a leader client creates a
 separate node to acknowledge that it is actually aware that it is the
 leader, and so it is ready to perform the role of a leader.

 -Flavio

 -Original Message-

 One thing confused me though; the last paragraph says...

 This protocol guarantees that there is at any time only one node that
 thinks it is the leader. But it does not disseminate information about
 who is the leader. If you want everyone to know who is the leader, you
 can have an additional Znode whose value is the name of the current
 leader (or some identifying information on how to contact the leader,
 etc.). Note that this cannot be done atomically, so by the time other
 nodes find out who the leader is, the leadership may already have
 passed on to a different node.

 In the current implementation, WriteLock - each znode can know,
 whenever it attempts to acquire the lock - if it didn't get the lock,
 who the owner is. I guess this is only true momentarily, for the split
 second that the acquire() method is called (i.e. the exact moment the
 getChildren() is called and the lowest value is found). Or is there
 some other subtle issue I'm not seeing?

 I guess we could add a method to WriteLock - if folks wanted - a kinda
 queryLeader() method where we just use the same algorithm to find who
 the current leader is - if folks cared. Though am not sure how useful
 knowing who the leader is would be :). Though I guess writing the leader's
 identity to some canonical znode that any other znode can read
 whenever it wishes is less risky and maybe simpler.


-- 
James
---
http://macstrac.blogspot.com/

Open Source Integration
http://open.iona.com


[jira] Updated: (ZOOKEEPER-78) added a high level protocol/feature - for easy Leader Election or exclusive Write Lock creation

2008-07-18 Thread james strachan (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-78?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

james strachan updated ZOOKEEPER-78:


Attachment: (was: writeLock_protocol_with_documentation-version2.patch)

 added a high level protocol/feature - for easy Leader Election or exclusive 
 Write Lock creation
 ---

 Key: ZOOKEEPER-78
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-78
 Project: Zookeeper
  Issue Type: New Feature
  Components: java client
Affects Versions: 3.0.0
Reporter: james strachan
Assignee: james strachan
 Attachments: writeLock_protocol_version3.patch


 Here's a patch which adds a little WriteLock helper class for performing 
 leader elections or creating exclusive locks in some directory znode. Note 
 its an early cut; am sure we can improve it over time. The aim is to avoid 
 folks having to use the low level ZK stuff but provide a simpler high level 
 abstraction.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (ZOOKEEPER-78) added a high level protocol/feature - for easy Leader Election or exclusive Write Lock creation

2008-07-18 Thread james strachan (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-78?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

james strachan updated ZOOKEEPER-78:


Attachment: (was: writeLock_protocol.patch)




[jira] Commented: (ZOOKEEPER-78) added a high level protocol/feature - for easy Leader Election or exclusive Write Lock creation

2008-07-18 Thread james strachan (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-78?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12614748#action_12614748
 ] 

james strachan commented on ZOOKEEPER-78:
-

BTW I just deleted the other 2 patches to avoid confusion; the latest patch 
includes the previous changes etc




Re: An interest in increasing the DI'ness of ZooKeeper?

2008-07-18 Thread James Strachan
+1 :)

I'm a fellow ActiveMQ hacker too and would love to see ZK included
with ActiveMQ. Dependency Injection can really help keep your code
simple while leaving it flexible so it can be used in many different
ways.

Here are some links on DI
http://martinfowler.com/articles/injection.html
http://www.theserverside.com/tt/articles/article.tss?l=SpringFramework

2008/7/18 Hiram Chirino [EMAIL PROTECTED]:
 Hi Guys,

 First off, great project!  I think ZooKeeper is a fabulous idea.  I
 can see folks wanting to embed ZK servers in their products too.  I
 could see the ActiveMQ project embedding it for several reasons.  And
 with that in mind,  I think it would be awesome if ZK tried to use
 more dependency injection (DI) to configure its objects.  That way
 an embedding project could directly configure it with java code, or
 use Spring or Guice etc. etc.

 If you guys are interested in supporting this use case, I'd be happy
 to start contributing patches to make that happen.

 --
 Regards,
 Hiram

 Blog: http://hiramchirino.com

 Open Source SOA
 http://open.iona.com




-- 
James
---
http://macstrac.blogspot.com/

Open Source Integration
http://open.iona.com


auto-reconnection ZooKeeper proxy?

2008-07-18 Thread James Strachan
<background>
I work on the ActiveMQ project which implements the JMS API - which is
a kinda complex thing but it involves a number of objects
(Connections, Sessions, Producers, Consumers). In some JMS providers
it's the end user's responsibility to deal with detecting a connection
failure (as opposed to any other kind of error) and then automatically
recreating all the dependent objects.

We added support for auto-reconnection which greatly simplifies the
developer's life; it lets the JMS client automatically deal with any
socket failures, reconnecting to a broker for you and re-establishing
all of those in-flight operations (subscriptions, in progress sends
and so forth).
http://activemq.apache.org/how-can-i-support-auto-reconnection.html

Having seen the value of wrapping up the auto-reconnection within a
proxy, am thinking it's also got merits on ZK
</background>


As we start creating protocols/recipes that implement higher order
features like locks, leader elections and so forth we could probably
do with some kinda auto-reconnecting facade to ZooKeeper just to
simplify the implementation code of protocols/recipes. It's a kinda
complex area though and I'm sure different protocols will want
different things; but even for something as simple as a lock - I can
see the value in an auto-reconnecting proxy.

e.g. there are already 5 different method calls in the current WriteLock
implementation which all really need a custom try/catch around them to
detect loss of the connection, which then should be wrapped in
reconnect-retry logic.

What to do about watches is interesting; for now the current
behaviour seems fine (fire them all, forcing a re-watch), though in
the future we could re-enable watches on the new server connection
as an option.

All I'm thinking about for now is a kinda ReconnectingZooKeeper which
looks like a ZooKeeper object but which internally catches dead
connections and then internally tries to reconnect to one of the ZK
servers under the covers - retrying the current read/write operation
until the ReconnectPolicy says to fail. e.g. some folks might wanna
retry connecting forever; others for a certain amount of time or
certain number of attempts etc.

So something like...

public class ReconnectingZooKeeper extends ZooKeeper {
  ...
  // for each method that reads/writes synchronously
  public Stat exists(String path) {...
    for (int count = 0; ; count++) {
      try {

        // really do the method call!
        return super.exists(path);

      } catch (ConnectionClosedException e) {

        // lets let any watches or listeners respond to connection
        // loss first before we retry
        fireAnyWatchesAndStuff();

        // give up once shouldRetry() says we have tried enough
        if (!shouldRetry(count)) {
          throw e;
        }
      }
    }
  }
}
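
To make the ReconnectPolicy idea a bit more concrete, here is a minimal, self-contained sketch in plain Java. It does not touch the ZooKeeper client at all: ConnectionLostException, ReconnectPolicy, RetrySupport and maxAttempts are hypothetical names made up for illustration, and the Supplier simply stands in for a synchronous call such as exists().

```java
import java.util.function.Supplier;

/** Hypothetical stand-in for whatever exception signals a dead connection. */
class ConnectionLostException extends RuntimeException {
    ConnectionLostException(String message) {
        super(message);
    }
}

/** Hypothetical policy: decides whether another attempt is allowed. */
interface ReconnectPolicy {
    boolean shouldRetry(int failedAttempts);
}

final class RetrySupport {
    private RetrySupport() {}

    /** Policy that allows at most maxAttempts calls in total. */
    static ReconnectPolicy maxAttempts(int maxAttempts) {
        return failedAttempts -> failedAttempts < maxAttempts;
    }

    /** Runs the operation, retrying on connection loss until the policy says stop. */
    static <T> T retry(Supplier<T> operation, ReconnectPolicy policy) {
        for (int failures = 1; ; failures++) {
            try {
                return operation.get();
            } catch (ConnectionLostException e) {
                // Rethrow the last failure once the policy gives up.
                if (!policy.shouldRetry(failures)) {
                    throw e;
                }
            }
        }
    }
}
```

A "retry forever" policy would just return true every time; a time-limited one could compare the current time against a deadline instead of counting attempts.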


Any watches should fire when a connection is lost - and all writes
should be replicated to the new server we connect to, right? So I'm
thinking, if we had a ReconnectingZooKeeper implementation, we could
use it with the current WriteLock implementation so that the protocol
could survive ZK server loss & reconnection while still working.

e.g. on connection loss the leader/lock owner needs to lose the lock
until it gets it back just in case; but other than that I think it
should work.

Am sure there's some gremlins somewhere in automatically reconnecting;
though provided the watch mechanism works, clients will be able to do
the right thing I think.

Thoughts?

-- 
James
---
http://macstrac.blogspot.com/

Open Source Integration
http://open.iona.com


[jira] Updated: (ZOOKEEPER-78) added a high level protocol/feature - for easy Leader Election or exclusive Write Lock creation

2008-07-17 Thread james strachan (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-78?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

james strachan updated ZOOKEEPER-78:


Attachment: writeLock_protocol.patch

 added a high level protocol/feature - for easy Leader Election or exclusive 
 Write Lock creation
 ---

 Key: ZOOKEEPER-78
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-78
 Project: Zookeeper
  Issue Type: New Feature
  Components: java client
Affects Versions: 3.0.0
Reporter: james strachan
 Attachments: writeLock_protocol.patch


 Here's a patch which adds a little WriteLock helper class for performing 
 leader elections or creating exclusive locks in some directory znode. Note 
 its an early cut; am sure we can improve it over time. The aim is to avoid 
 folks having to use the low level ZK stuff but provide a simpler high level 
 abstraction.




[jira] Commented: (ZOOKEEPER-79) Document jacob's leader election on the wiki recipes page

2008-07-17 Thread james strachan (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-79?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12614448#action_12614448
 ] 

james strachan commented on ZOOKEEPER-79:
-

Ah cool :) Was just checking we were not about to do the same thing separately :).

I've basically followed the same algorithm from the wiki recipe - and the same 
one described in the ZooKeeper tutorial...
http://developer.yahoo.com/blogs/hadoop/2008/03/intro-to-zookeeper-video.html

So AFAIK yes it's the same




 Document jacob's leader election on the wiki recipes page
 -

 Key: ZOOKEEPER-79
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-79
 Project: Zookeeper
  Issue Type: New Feature
  Components: documentation
Reporter: Patrick Hunt
Assignee: Patrick Hunt

 The following discussion occurred on the zookeeper-user list. We need to 
 formalize this recipe and document on the wiki recipes page:
 -from jacob 
 Avinash
  
 The following protocol will help you fix the observed misbehavior. As Flavio 
 points out, you cannot rely on the order of nodes in getChildren, you must 
 use an intrinsic property of each node to determine who is the leader. The 
 protocol devised by Runping Qi and described here will do that.
  
 First of all, when you create child nodes of the node that holds the 
 leadership bids, you must create them with the EPHEMERAL and SEQUENCE flag. 
 ZooKeeper guarantees to give you an ephemeral node named uniquely and with a 
 sequence number larger by at least one than any previously created node in 
 the sequence. You provide a prefix, like L_ or one of your own choice, and 
 ZooKeeper creates nodes named L_23, L_24, etc. The sequence number starts 
 at 0 and increases monotonically.
  
 Once you've placed your leadership bid, you search backwards from the 
 sequence number of *your* node to see if there are any preceding (in terms of 
 the sequence number) nodes. When you find one, you place a watch on it and 
 wait for it to disappear. When you get the watch notification, you search 
 again, until you do not find a preceding node, then you know you're the 
 leader. This protocol guarantees that there is at any time only one node that 
 thinks it is the leader. But it does not disseminate information about who is 
 the leader. If you want everyone to know who is the leader, you can have an 
 additional Znode whose value is the name of the current leader (or some 
 identifying information on how to contact the leader, etc.). Note that this 
 cannot be done atomically, so by the time other nodes find out who the leader 
 is, the leadership may already have passed on to a different node.
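
The "search backwards" step could be sketched roughly as below, assuming node names of the form prefix plus decimal sequence number (as in L_23, L_24). ElectionNodes, sequenceOf and predecessorOf are hypothetical helper names, not part of ZooKeeper, and the sketch assumes you watch the closest preceding node. It only picks which node to watch; setting the watch itself is left out.

```java
import java.util.List;

final class ElectionNodes {
    private ElectionNodes() {}

    /** Parses the numeric sequence suffix that follows the prefix, e.g. "L_24" gives 24. */
    static long sequenceOf(String nodeName, String prefix) {
        return Long.parseLong(nodeName.substring(prefix.length()));
    }

    /**
     * Returns the child with the largest sequence number below ours,
     * or null when no such child exists (meaning we are the leader).
     */
    static String predecessorOf(String myNode, List<String> children, String prefix) {
        long mine = sequenceOf(myNode, prefix);
        String best = null;
        long bestSeq = -1;
        for (String child : children) {
            if (!child.startsWith(prefix)) {
                continue; // ignore nodes that are not leadership bids
            }
            long seq = sequenceOf(child, prefix);
            if (seq < mine && seq > bestSeq) {
                best = child;
                bestSeq = seq;
            }
        }
        return best;
    }
}
```

A client would call predecessorOf() after placing its bid and again after each watch notification, treating a null result as "I am now the leader".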
  
 Flavio
  
 Might it make sense to provide a standardized implementation of leader 
 election in the library code in Java?
  
 --Jacob
  
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Flavio 
 Junqueira
 Sent: Friday, July 11, 2008 1:02 AM
 To: [EMAIL PROTECTED]
 Cc: [EMAIL PROTECTED]
 Subject: Re: [Zookeeper-user] Leader election
  
 Hi Avinash, getChildren returns a list in lexicographic order, so if you are 
 updating the children of the election node concurrently, then you may get a 
 different first node with different clients. If you are using the sequence 
 flag to create nodes, then you may consider stripping the prefix of the node 
 name and using the suffix value to determine order.
 Hope it helps.
 -Flavio
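
As a rough illustration of ordering by suffix rather than by position in the returned list, the sketch below strips the prefix and compares the remaining number. The node names are made up for the example (short, unpadded suffixes chosen to make the difference visible), and SuffixOrder, suffix and lowest are hypothetical helpers, not ZooKeeper API.

```java
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

final class SuffixOrder {
    private SuffixOrder() {}

    /** Strips the prefix and parses the remaining sequence suffix as a number. */
    static long suffix(String nodeName, String prefix) {
        return Long.parseLong(nodeName.substring(prefix.length()));
    }

    /**
     * Returns the child with the lowest sequence suffix, regardless of list order.
     * Throws NoSuchElementException if the list is empty.
     */
    static String lowest(List<String> children, String prefix) {
        return Collections.min(
                children,
                Comparator.comparingLong((String name) -> suffix(name, prefix)));
    }
}
```

With the children "L-12", "L-3" and "L-7", lowest() picks "L-3", whereas a plain string comparison would put "L-12" first.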
  
 - Original Message 
 From: Avinash Lakshman [EMAIL PROTECTED]
 To: [EMAIL PROTECTED]
 Sent: Friday, July 11, 2008 7:20:06 AM
 Subject: [Zookeeper-user] Leader election
 Hi
 I am trying to elect a leader among 50 nodes. There is always one odd node 
 that sees a different leader from the one the other nodes see. Could someone 
 please tell me what is wrong with the following code for leader election:
 public void electLeader()
 {   
 ZooKeeper zk = StorageService.instance().getZooKeeperHandle();
 String path = "/Leader";
 try
 {
 String createPath = path + "/L-";

 LeaderElector.createLock_.lock();
 while( true )
 {
 /* Get all znodes under the Leader znode */
 List<String> values = zk.getChildren(path, false);
 /*
  * Get the first znode and if it is the
  * pathCreated created above then the data
  * in that znode is the leader's identity.
 */
 if ( leader_ == null )
 {
 leader_ = new AtomicReference<EndPoint>(
 EndPoint.fromBytes( zk.getData(path + "/" + values.get