RE: Updated NodeWatcher...
yeah, i was thinking it should be in forrest, but i couldn't figure out where to put it. that is why i didn't close the issue.

ben

-----Original Message-----
From: Patrick Hunt [mailto:ph...@apache.org]
Sent: Friday, January 09, 2009 9:37 AM
To: zookeeper-user@hadoop.apache.org
Subject: Re: Updated NodeWatcher...

Ben this is great, thanks! Do you want to close out this one and point to the faq?
https://issues.apache.org/jira/browse/ZOOKEEPER-264

Although IMO this should be moved to the forrest docs.

Patrick
Re: A modest proposal for simplifying zookeeper :)
Well, if that were the direction and goal, I'd feel more comfortable about recommending ZK.

If a company were to implement some of these algorithms, I suspect they'd run into race conditions and the like with all that rope.

For my part I'd be willing to contribute the NodeWatcher/NodeListener I wrote. It would be nice to have a unit test for it that covers all the possible/unusual race conditions with ZK.

Kevin

On Fri, Jan 9, 2009 at 11:58 AM, Mahadev Konar wrote:
> Hi Kevin,
> It would be great to have such high level interfaces. It could be something that you could contribute :) . We haven't had the bandwidth to provide such interfaces for zookeeper. It would be great to have all such recipes as a part of contrib package of zookeeper.
>
> mahadev

--
Founder/CEO Spinn3r.com
Location: San Francisco, CA
AIM/YIM: sfburtonator
Skype: burtonator
Work: http://spinn3r.com
Re: A modest proposal for simplifying zookeeper :)
Hi Kevin,
It would be great to have such high level interfaces. It could be something that you could contribute :) . We haven't had the bandwidth to provide such interfaces for zookeeper. It would be great to have all such recipes as a part of contrib package of zookeeper.

mahadev

On 1/9/09 11:44 AM, "Kevin Burton" wrote:
> OK so it sounds from the group that there are still reasons to provide rope in ZK to enable algorithms like leader election.
>
> Couldn't ZK ship higher level interfaces for leader election, mutexes, semaphores, queues, barriers, etc instead of pushing this on developers?
>
> Then the remaining APIs, configuration, event notification, and discovery, can be used on a simpler, rope free API.
>
> The rope is what's killing me now :)
>
> Kevin
A modest proposal for simplifying zookeeper :)
OK so it sounds from the group that there are still reasons to provide rope in ZK to enable algorithms like leader election.

Couldn't ZK ship higher level interfaces for leader election, mutexes, semaphores, queues, barriers, etc instead of pushing this on developers?

Then the remaining APIs, configuration, event notification, and discovery, can be used on a simpler, rope free API.

The rope is what's killing me now :)

Kevin
Re: Updated NodeWatcher...
Thanks. And yes, your chart is really ugly! :)

These are the states of... what? The session? The ZooKeeper object? It would be nice to include the corresponding API references.

.. Adam

On Fri, Jan 9, 2009 at 5:09 AM, Benjamin Reed wrote:
> I'm really bad at creating figures, but i've put up something that should be informative. (i'm also really bad at apache wiki.) hopefully someone can make it more beautiful. i've added the state diagram to the FAQ:
> http://wiki.apache.org/hadoop/ZooKeeper/FAQ
>
> ben
Re: InterruptedException
I've wanted a @threadsafe annotation for a while, to mark methods that are thread-safe, so that calling a non-threadsafe API would be a compiler error.

Kevin

On Fri, Jan 9, 2009 at 8:08 AM, Benjamin Reed wrote:
> yes, that is a good article. it is actually the one we used to decide about the current way of handling InterruptedException. in retrospect it turns out to be a nice way to document that a call is blocking.
>
> ben
Re: Updated NodeWatcher...
Ben this is great, thanks! Do you want to close out this one and point to the faq?
https://issues.apache.org/jira/browse/ZOOKEEPER-264

Although IMO this should be moved to the forrest docs.

Patrick

Benjamin Reed wrote:
> I'm really bad at creating figures, but i've put up something that should be informative. (i'm also really bad at apache wiki.) hopefully someone can make it more beautiful. i've added the state diagram to the FAQ:
> http://wiki.apache.org/hadoop/ZooKeeper/FAQ
>
> ben
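As a rough illustration of the states this thread keeps circling around, here is a minimal sketch in plain Java. It models only the behavior Ben describes in this thread: a lost connection leaves the session alive and the client keeps trying to reconnect on its own, while an expired session is terminal and the handle never grabs a new session by itself. The names SessionState, SessionEvent, and SessionModel are hypothetical and are not part of the ZooKeeper client API.

```java
// Hypothetical model of the client session lifecycle; not ZooKeeper's API.
enum SessionState { CONNECTING, CONNECTED, EXPIRED, CLOSED }

enum SessionEvent { SERVER_REACHED, CONNECTION_LOST, SESSION_EXPIRED, CLOSE_CALLED }

class SessionModel {
    private SessionState state = SessionState.CONNECTING;

    SessionState state() {
        return state;
    }

    // Apply one event and return the resulting state.
    SessionState on(SessionEvent event) {
        // EXPIRED and CLOSED are terminal: the handle never acquires a new session.
        if (state == SessionState.EXPIRED || state == SessionState.CLOSED) {
            return state;
        }
        switch (event) {
            case SERVER_REACHED:
                state = SessionState.CONNECTED;
                break;
            case CONNECTION_LOST:
                state = SessionState.CONNECTING; // the client retries automatically
                break;
            case SESSION_EXPIRED:
                state = SessionState.EXPIRED;
                break;
            case CLOSE_CALLED:
                state = SessionState.CLOSED;
                break;
        }
        return state;
    }
}
```

In this model, an application that wants to continue after EXPIRED has to build a new handle itself, which matches the point that a leader-election user would not want that to happen automatically.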
Re: Simpler ZooKeeper event interface....
> In the case of an active leader, L continues to send commands (whatever) to the followers. However a new leader L' has since been elected and is also sending commands to the followers. In this case it seems like either a) L should not send commands if it's not sync'd to the ensemble (and holds the leader token) or b) followers should not accept commands from non-leader (only accept from the current leader). a) seems the right way to go; if L is disconnected it should stop sending commands to the followers, if it's resync'd in time it can

Seems to make sense in this particular case (I had some other cases in mind that I'm not so sure about though).

> Feel free to discuss...

The thought is not that well formed, so perhaps it does not warrant much discussion ... This is more a realization that as far as the leader election recipe goes, if *in general* one wants to guarantee not having multiple leaders at the same time, certain assumptions have to be made about timely reception and processing of events.

So naively, if I wanted to use the recipe to ensure that only one system owns an IP address at any given time, I think there would be no way to guarantee it without making some assumptions about timing. In retrospect, this should have been obvious.

In practice it may be simple enough to work around these problems (I actually think now that in my case an 'at least once' queue is more appropriate). Anyway, like I said, half-baked thoughts ..
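For reference, option b) above (followers refusing commands from a stale leader) could look roughly like the following. This is a minimal sketch under a stated assumption that is not in the thread: every newly elected leader is given a strictly larger epoch number than its predecessor. The Follower class and its methods are hypothetical names.

```java
// Hypothetical follower that only accepts commands from the newest leader it has seen.
// Assumption: each newly elected leader carries a strictly larger epoch number.
class Follower {
    private long highestEpochSeen = -1;

    // Returns true if the command is accepted, false if it comes from a stale leader.
    synchronized boolean accept(long leaderEpoch, String command) {
        if (leaderEpoch < highestEpochSeen) {
            return false; // old leader L is still sending after L' was elected
        }
        highestEpochSeen = leaderEpoch;
        apply(command);
        return true;
    }

    private void apply(String command) {
        // Application-specific work would go here.
    }
}
```

A check like this only protects resources that can inspect the epoch. For something like IP address ownership, where nothing on the receiving side can reject a stale owner, the timing assumptions discussed above still apply.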
RE: InterruptedException
yes, that is a good article. it is actually the one we used to decide about the current way of handling InterruptedException. in retrospect it turns out to be a nice way to document that a call is blocking.

ben

-----Original Message-----
From: thomas.john...@sun.com [mailto:thomas.john...@sun.com]
Sent: Friday, January 09, 2009 7:56 AM
To: zookeeper-user@hadoop.apache.org
Subject: Re: InterruptedException

See http://www-128.ibm.com/developerworks/java/library/j-jtp05236.html for the idiom associated with thread interrupt methods and InterruptedException. I will hasten to add that the appropriate use of this idiom in various libraries (including in JDK libraries) is inconsistent at best and the behavior is in some cases OS dependent. So being careful about the use (maybe even steering clear) of interrupt for 'cancelable' operations is probably wise.
Re: InterruptedException
Kevin Burton wrote:
> On Thu, Jan 8, 2009 at 3:21 PM, Benjamin Reed wrote:
>
>> InterruptedException is rather tricky because the semantics of Thread.isInterrupted() are rather vague. specifically, it is unclear why someone would interrupt a thread. usually Thread.interrupt() is used to shut things down, which requires special handling. thus we propagate it. for example, i'm not clear how you shut down your poll() method. an easy way to do it would be to use Thread.interrupt().
>
> so if you just don't have it throw InterruptedException then thread.interrupt can't be used. It's an API decision really... if you don't want people to interrupt then we don't have to throw InterruptedException.
>
> I don't know many applications that use this in practice... does anyone on this list? I always felt interrupt() was vestigial... notify/wait are somewhat in the same category IMO but at least they are useful.

See http://www-128.ibm.com/developerworks/java/library/j-jtp05236.html for the idiom associated with thread interrupt methods and InterruptedException. I will hasten to add that the appropriate use of this idiom in various libraries (including in JDK libraries) is inconsistent at best and the behavior is in some cases OS dependent. So being careful about the use (maybe even steering clear) of interrupt for 'cancelable' operations is probably wise.

> In my poll method I don't shut down; the goal was to have the developer do everything in an event API and code that way.
>
> Doing things in both sync and async operations is confusing.
>
> Kevin
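For readers who have not seen it, the interrupt idiom being discussed boils down to two choices: either let InterruptedException propagate, or, when the method signature cannot throw it, restore the thread's interrupt status before returning. The sketch below uses only the JDK; the InterruptIdiom class and the pollOnce and pollQuietly method names are made up for illustration and are not from NodeWatcher or ZooKeeper.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

class InterruptIdiom {
    // Option 1: propagate. The throws clause also documents that the call blocks.
    static String pollOnce(BlockingQueue<String> queue) throws InterruptedException {
        return queue.poll(100, TimeUnit.MILLISECONDS);
    }

    // Option 2: cannot propagate, so restore the interrupt flag for callers higher up.
    static String pollQuietly(BlockingQueue<String> queue) {
        try {
            return queue.poll(100, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // do not swallow the interrupt
            return null;
        }
    }

    public static void main(String[] args) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        Thread.currentThread().interrupt();   // simulate a shutdown request
        String result = pollQuietly(queue);   // the blocking call is cut short
        // Thread.interrupted() reports (and clears) the restored flag.
        System.out.println(result + " " + Thread.interrupted());
    }
}
```

Swallowing the exception without restoring the flag is the variant the article warns against, because code further up the stack then has no way to notice the shutdown request.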
RE: Updated NodeWatcher...
I'm really bad at creating figures, but i've put up something that should be informative. (i'm also really bad at apache wiki.) hopefully someone can make it more beautiful. i've added the state diagram to the FAQ:
http://wiki.apache.org/hadoop/ZooKeeper/FAQ

ben

-----Original Message-----
From: adam.ros...@gmail.com [mailto:adam.ros...@gmail.com] On Behalf Of Adam Rosien
Sent: Thursday, January 08, 2009 8:06 PM
To: zookeeper-user@hadoop.apache.org
Subject: Re: Updated NodeWatcher...

It feels like we need a flowchart, state-chart, or something, so we can all talk about the same thing. Then people could suggest abstractions that would essentially put a box around sections of the diagram. However I feel woefully inadequate at the former :(.

.. Adam

On Thu, Jan 8, 2009 at 4:20 PM, Benjamin Reed wrote:
> For your first issue, if an ensemble goes offline and comes back, everything should be fine. it will look to the client just like a server went down. if a session expires, you are correct that the client will not reconnect. this again is on purpose. for the node watcher the session is unimportant, but if the ZooKeeper object is also being used for leader election, for example, you do not want the object to grab a new session automatically.
>
> For 2) i think pat responded to that one. an async request will always return. if the server goes down after the request is issued, you will get a connection loss error in your callback.
>
> Your third issue is described with the first.
>
> ben
>
> -----Original Message-----
> From: burtona...@gmail.com [mailto:burtona...@gmail.com] On Behalf Of Kevin Burton
> Sent: Thursday, January 08, 2009 4:02 PM
> To: zookeeper-user@hadoop.apache.org
> Subject: Re: Updated NodeWatcher...
>
>> i just found that part of this thread went to my junk folder. can you send the URL for the NodeListener?
>
> Sure... here you go:
>
> http://pastebin.com/f1e9d3706
>
>> this NodeWatcher is a useful thing. i have a couple of suggestions to simplify it:
>>
>> 1) Construct the NodeWatcher with a ZooKeeper object rather than constructing one. Not only does it simplify NodeWatcher, but it also makes it so that the ZooKeeper object can be used for other things as well.
>
> I hear you. I was thinking that this might not be a good idea because NodeWatcher can reconnect you to the ensemble if it goes offline.
>
> I'm not sure if it's a bug or not, but once my session expired on the client it wouldn't reconnect, so I just implemented my own reconnect and session expiry.
>
>> 2) Use the async API in watchNodeData and watchNodeExists. it simplifies the code and the error handling.
>
> The problem was that, according to feedback here, an async request might never return if the server dies shortly after the request and before it has a chance to respond.
>
> I wanted NodeWatcher to hide as much rope as possible.
>
>> 3) You don't need to do a connect() in handleDisconnected(). ZooKeeper object will do it automatically for you.
>
> I can try again if you'd like, but this isn't my experience. Once the session expired and the whole ensemble was offline it wouldn't connect again.
>
> If it was a transient disconnect I'd see one disconnect event and then a quick reconnect. If it was a long disconnect (with nothing to attach to) then ZK won't ever reconnect me.
>
> I'd like this to be the behavior though...
>
>> There is an old example on sourceforge http://zookeeper.wiki.sourceforge.net/ZooKeeperJavaExample that may give you some more ideas on how to simplify your code.
>
> That would be nice, simple is good!
>
> Kevin
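Ben's answer to point 2) is that an async request always returns: if the connection goes away, the callback still fires, just with a connection loss error. The sketch below is a toy model of that guarantee in plain Java, not the ZooKeeper client itself. PendingRequests, ResultCode, and DataCallback are hypothetical names chosen for this illustration; real code would go through the ZooKeeper async API and its own result codes.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical result codes; stand-ins for the client's real error codes.
enum ResultCode { OK, CONNECTION_LOSS }

// Hypothetical callback shape: invoked exactly once per submitted request.
interface DataCallback {
    void done(ResultCode rc, String path, byte[] data);
}

class PendingRequests {
    private static final class Request {
        final String path;
        final DataCallback callback;

        Request(String path, DataCallback callback) {
            this.path = path;
            this.callback = callback;
        }
    }

    private final Queue<Request> pending = new ArrayDeque<>();

    // Queue a request; its callback fires later from respond() or connectionLost().
    void submit(String path, DataCallback callback) {
        pending.add(new Request(path, callback));
    }

    // The server answered the oldest outstanding request.
    void respond(byte[] data) {
        Request r = pending.poll();
        if (r != null) {
            r.callback.done(ResultCode.OK, r.path, data);
        }
    }

    // The connection dropped: fail every outstanding request instead of leaving it hanging.
    void connectionLost() {
        Request r;
        while ((r = pending.poll()) != null) {
            r.callback.done(ResultCode.CONNECTION_LOSS, r.path, null);
        }
    }

    int outstanding() {
        return pending.size();
    }
}
```

The design point is that the error handling lives in one place, the callback, which is why the async API was suggested as a way to simplify watchNodeData and watchNodeExists.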