Re: Simpler ZooKeeper event interface....

2009-01-09 Thread Thomas Vinod Johnson




In the case of an active leader, L continues to send commands 
(whatever) to the followers. However a new leader L' has since been 
elected and is also sending commands to the followers. In this case 
it seems like either a) L should not send commands if it's not 
sync'd to the ensemble (and holds the leader token) or b) followers 
should not accept commands from non-leader (only accept from the 
current leader). a) seems the right way to go; if L is disconnected 
it should stop sending commands to the followers, if it's resync'd 
in time it can


Seems to make sense in this particular case (I had some other cases 
in mind that I'm not so sure about though)


Feel free to discuss...


The thought is not that well formed, so perhaps it does not warrant much 
discussion ... This is more a realization that as far as the leader 
election recipe goes, if *in general* one wants to guarantee not having 
multiple leaders at the same time, certain assumptions have to be made 
about timely reception and processing of events. So naively, if I wanted 
to use the recipe to ensure that only one system owns an IP address at 
any given time, I think there would be no way to guarantee it without 
making some assumptions about timing. In retrospect, this should have 
been obvious. In practice it may be simple enough to work around these 
problems (I actually think now that in my case an 'at least once' queue 
is more appropriate). Anyway, like I said, half-baked thoughts ...




Re: Simpler ZooKeeper event interface....

2009-01-07 Thread Patrick Hunt

Vinod Johnson wrote:


 
I guess then I don't follow the leader election recipe. Is the 
following scenario possible in the leader election recipe:

1) Leader L is partitioned from the ensemble.
2) ZK servers expire its session.
3) Some other follower F now becomes a leader.
4) L and F form a split brain?

I had wrongly assumed that the session was like a lease in that it 
allowed the client and server to independently know that the session 
had expired by the use of the global clock. Wouldn't it make sense 
for the client lib to expire its local session handle and never reuse 
it?


Here's a good reason for each client to know its session status 
(connected/disconnected/expired). Depending on the application, if L 
does not have a connected session to the ensemble it may need to be 
careful how it acts.


I'm trying to think through some cases...

In the case of a passive leader the followers will look at zk and only 
send requests to the leader, so this seems fine (L no longer gets 
requests, it syncs to the ensemble at some point, finds its 
session expired, and recovers as appropriate).


But depending on timing, couldn't the old leader still get a request 
from some follower who is lagging in terms of event receipt (or is 
disconnected - which brings up the question of dealing with 
disconnection at the follower)? Not sure how likely this is in practice 
... but I can't say I'm comfortable with all the theoretical 
possibilities at this point. In this case, a disconnected leader could 
play it safe and not accept new requests.


Yes, this is definitely a possibility. It takes time for a session to 
expire. If your leader dies the followers will continue to send requests 
until the session expires and they are notified. If a follower is on a 
server that's lagging the notification may be delayed... etc... However 
these types of cases probably have to be handled anyway; say the 
follower can talk to zk ensemble but not to the leader because of some 
network issue.


In the case of an active leader, L continues to send commands 
(whatever) to the followers. However a new leader L' has since been 
elected and is also sending commands to the followers. In this case it 
seems like either a) L should not send commands if it's not sync'd to 
the ensemble (and holds the leader token) or b) followers should not 
accept commands from non-leader (only accept from the current leader). 
a) seems the right way to go; if L is disconnected it should stop 
sending commands to the followers, if it's resync'd in time it can


Seems to make sense in this particular case (I had some other cases in 
mind that I'm not so sure about though)


Feel free to discuss...

start sending commands again; otherwise its session will expire, a new 
leader L' will be elected and will start sending commands to followers, and 
eventually L will resync and notice that it is no longer the leader 
(and do whatever it takes to recover).


> Wouldn't it make sense for the
> client lib to expire its local session handle and never reuse it?

I would think that depends on how expensive it is to change leaders. 
It would be trivial for the client to close its session and start a 
new one each time it's notified of a disconnect from the ensemble.


Perhaps that's good enough. An alternative would be to wait for the 
timeout period.


Patrick


RE: Simpler ZooKeeper event interface....

2009-01-07 Thread Benjamin Reed
when you shut down the full ensemble the session isn't expired. when things come 
back up your session will still be active. (it would be bad if the zk service 
could not survive the bounce of an ensemble.)

you are way overthinking this and i fear you are not helping yourself by 
trying to second-guess with timers. zookeeper is structured such that it can be 
used as ground truth. trying to second-guess will only bring you headaches.

ben

From: burtona...@gmail.com [burtona...@gmail.com] On Behalf Of Kevin Burton 
[bur...@spinn3r.com]
Sent: Wednesday, January 07, 2009 3:36 PM
To: zookeeper-user@hadoop.apache.org
Subject: Re: Simpler ZooKeeper event interface

>
>
> Here's a good reason for each client to know its session status
> (connected/disconnected/expired). Depending on the application, if L does
> not have a connected session to the ensemble it may need to be careful how
> it acts.
>

connected/disconnected events are given out in the current API but when I
shut down the full ensemble I don't receive a session expired.

I'm considering implementing my own session expiration by tracking how long
I've been disconnected.

Kevin

--
Founder/CEO Spinn3r.com
Location: San Francisco, CA
AIM/YIM: sfburtonator
Skype: burtonator
Work: http://spinn3r.com


Re: Simpler ZooKeeper event interface....

2009-01-07 Thread Patrick Hunt

Kevin Burton wrote:


Here's a good reason for each client to know its session status
(connected/disconnected/expired). Depending on the application, if L does
not have a connected session to the ensemble it may need to be careful how
it acts.



connected/disconnected events are given out in the current API but when I
shut down the full ensemble I don't receive a session expired.


At the risk of being overly clear... ;-)

When you shut down the full ensemble all clients of the ensemble will be 
disconnected from the ensemble. Regardless of each client's 
participation in the overall application architecture they should 
probably consider being disconnected a "bad thing" and act appropriately 
(granted this is highly dependent on the particular use cases...). For 
example followers won't know who the leader is... leaders won't know if 
they are still leaders... etc...



I'm considering implementing my own session expiration by tracking how long
I've been disconnected.


In the extreme you could just close the session each time you got 
disconnected.


Patrick



Re: Simpler ZooKeeper event interface....

2009-01-07 Thread Vinod Johnson


 
I guess then I don't follow the leader election recipe. Is the 
following scenario possible in the leader election recipe:

1) Leader L is partitioned from the ensemble.
2) ZK servers expire its session.
3) Some other follower F now becomes a leader.
4) L and F form a split brain?

I had wrongly assumed that the session was like a lease in that it 
allowed the client and server to independently know that the session 
had expired by the use of the global clock. Wouldn't it make sense 
for the client lib to expire its local session handle and never reuse 
it?


Here's a good reason for each client to know its session status 
(connected/disconnected/expired). Depending on the application, if L 
does not have a connected session to the ensemble it may need to be 
careful how it acts.


I'm trying to think through some cases...

In the case of a passive leader the followers will look at zk and only 
send requests to the leader, so this seems fine (L no longer gets 
requests, it syncs to the ensemble at some point, finds its 
session expired, and recovers as appropriate).


But depending on timing, couldn't the old leader still get a request 
from some follower who is lagging in terms of event receipt (or is 
disconnected - which brings up the question of dealing with 
disconnection at the follower)? Not sure how likely this is in practice 
... but I can't say I'm comfortable with all the theoretical 
possibilities at this point. In this case, a disconnected leader could 
play it safe and not accept new requests.
In the case of an active leader, L continues to send commands 
(whatever) to the followers. However a new leader L' has since been 
elected and is also sending commands to the followers. In this case it 
seems like either a) L should not send commands if it's not sync'd to 
the ensemble (and holds the leader token) or b) followers should not 
accept commands from non-leader (only accept from the current leader). 
a) seems the right way to go; if L is disconnected it should stop 
sending commands to the followers, if it's resync'd in time it can
Seems to make sense in this particular case (I had some other cases in 
mind that I'm not so sure about though)
start sending commands again; otherwise its session will expire, a new 
leader L' will be elected and will start sending commands to followers, and 
eventually L will resync and notice that it is no longer the leader 
(and do whatever it takes to recover).


> Wouldn't it make sense for the
> client lib to expire its local session handle and never reuse it?

I would think that depends on how expensive it is to change leaders. 
It would be trivial for the client to close its session and start a 
new one each time it's notified of a disconnect from the ensemble.


Perhaps that's good enough. An alternative would be to wait for the 
timeout period.

Patrick





Re: Simpler ZooKeeper event interface....

2009-01-07 Thread Kevin Burton
>
>
> Here's a good reason for each client to know its session status
> (connected/disconnected/expired). Depending on the application, if L does
> not have a connected session to the ensemble it may need to be careful how
> it acts.
>

connected/disconnected events are given out in the current API but when I
shut down the full ensemble I don't receive a session expired.

I'm considering implementing my own session expiration by tracking how long
I've been disconnected.

Kevin

-- 
Founder/CEO Spinn3r.com
Location: San Francisco, CA
AIM/YIM: sfburtonator
Skype: burtonator
Work: http://spinn3r.com
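
The idea of tracking how long the client has been disconnected could be sketched roughly as below. This is only an illustration: `DisconnectTimer` and its methods are hypothetical names, not part of the ZooKeeper client API, and the result is a local guess. As the replies in this thread point out, only the servers decide when a session actually expires, and the client learns of it for certain only after reconnecting.

```java
import java.util.function.LongSupplier;

// Hypothetical helper: tracks how long the client has been disconnected.
// This is a local guess only; the ZooKeeper servers decide when a session expires.
final class DisconnectTimer {
    private final long timeoutMillis;
    private final LongSupplier clock;   // injectable time source, in milliseconds
    private boolean disconnected = false;
    private long disconnectedAt = 0;

    DisconnectTimer(long timeoutMillis, LongSupplier clock) {
        this.timeoutMillis = timeoutMillis;
        this.clock = clock;
    }

    // Call when the client reports a disconnect; repeated calls keep the first timestamp.
    synchronized void onDisconnected() {
        if (!disconnected) {
            disconnected = true;
            disconnectedAt = clock.getAsLong();
        }
    }

    // Call when the client reports that it is connected again.
    synchronized void onConnected() {
        disconnected = false;
    }

    // True once the client has been continuously disconnected for at least timeoutMillis.
    synchronized boolean presumedExpired() {
        return disconnected && clock.getAsLong() - disconnectedAt >= timeoutMillis;
    }
}
```

An application might construct it as `new DisconnectTimer(sessionTimeoutMs, System::currentTimeMillis)`, where `sessionTimeoutMs` is whatever timeout it requested, and treat `presumedExpired()` as a hint to act conservatively rather than as proof of expiration.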


Re: Simpler ZooKeeper event interface....

2009-01-07 Thread Patrick Hunt

Vinod Johnson wrote:

Mahadev Konar wrote:

Hi Vinod,
 I think what Ben meant was this--

The client will never know of a session expiration until and unless it's
connected to one of the servers. So the leader cannot demote itself since
it's connected to one of the servers. It might have lost its session (which
all the others except itself would have realized) but will have to wait to
demote itself until it connects to one of the servers.

  
I guess then I don't follow the leader election recipe. Is the following 
scenario possible in the leader election recipe:

1) Leader L is partitioned from the ensemble.
2) ZK servers expire its session.
3) Some other follower F now becomes a leader.
4) L and F form a split brain?

I had wrongly assumed that the session was like a lease in that it 
allowed the client and server to independently know that the session had 
expired by the use of the global clock. Wouldn't it make sense for the 
client lib to expire its local session handle and never reuse it?


Here's a good reason for each client to know its session status 
(connected/disconnected/expired). Depending on the application, if L 
does not have a connected session to the ensemble it may need to be 
careful how it acts.


I'm trying to think through some cases...

In the case of a passive leader the followers will look at zk and only 
send requests to the leader, so this seems fine (L no longer gets 
requests, it syncs to the ensemble at some point, finds its session 
expired, and recovers as appropriate).


In the case of an active leader, L continues to send commands (whatever) 
to the followers. However a new leader L' has since been elected and is 
also sending commands to the followers. In this case it seems like 
either a) L should not send commands if it's not sync'd to the ensemble 
(and holds the leader token) or b) followers should not accept commands 
from non-leader (only accept from the current leader). a) seems the 
right way to go; if L is disconnected it should stop sending commands to 
the followers; if it's resync'd in time it can start sending commands 
again, otherwise its session will expire, a new leader L' will be elected and 
will start sending commands to followers, and eventually L will resync and 
notice that it is no longer the leader (and do whatever it takes to recover).
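
A minimal sketch of option (a), assuming the application keeps its own record of the last session state it was told about. `SessionState`, `LeaderGate`, and their methods are hypothetical names for illustration, not part of the ZooKeeper client API; in a real client the states would be derived from the connected/disconnected/expired notifications mentioned above.

```java
// Hypothetical model of option (a): a leader only sends commands while its
// session is known to be connected. Not ZooKeeper API; names are illustrative.
enum SessionState { CONNECTED, DISCONNECTED, EXPIRED }

final class LeaderGate {
    private SessionState state = SessionState.DISCONNECTED;
    private boolean leader = false;

    // Called when this process wins the election (it holds the leader token).
    synchronized void becomeLeader() {
        leader = true;
    }

    // Called for every session state change reported to the application.
    synchronized void onState(SessionState newState) {
        if (state == SessionState.EXPIRED) {
            return; // an expired session is never reused
        }
        state = newState;
        if (newState == SessionState.EXPIRED) {
            leader = false; // the leader token is gone; recover as appropriate
        }
    }

    // While disconnected, L pauses; it may resume if it resyncs before expiry.
    synchronized boolean maySendCommands() {
        return leader && state == SessionState.CONNECTED;
    }

    synchronized boolean isLeader() {
        return leader;
    }
}
```

The leader would check `maySendCommands()` before each command. This does not remove the timing window discussed in the thread: L can only pause after it has noticed the disconnect, so commands sent just before that point may still overlap with a new leader.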


> Wouldn't it make sense for the
> client lib to expire its local session handle and never reuse it?

I would think that depends on how expensive it is to change leaders. It 
would be trivial for the client to close its session and start a new 
one each time it's notified of a disconnect from the ensemble.


Patrick


mahadev


On 1/7/09 10:02 AM, "Vinod Johnson"  wrote:

 

Benjamin Reed wrote:
   
You don't demote yourself on disconnect. (Everyone else may still believe you
are the leader.) Check out the "Things to Remember about Watches" section in
the programmer's guide.

When you are disconnected from ZK you don't know what is happening, so you
have to act conservatively. Your session may or may not have expired. You
will not know for sure until you reconnect to ZK.


Just to make sure I'm not misunderstanding the last bit, even without
reconnecting to ZK, the leader's session could expire at the client
side, correct? In that case the conservative thing for the leader to do
is to demote itself if the intent is to avoid split brain (even though
the session may still be active at ZK for some period of time after this).



  




Re: Simpler ZooKeeper event interface....

2009-01-07 Thread Patrick Hunt
To say that it will "never return" is not correct. The client will be 
notified of "connectionloss" in the callback, however the client will 
not know if the operation was successful (from point of view of the 
server) or not.


Patrick

Kevin Burton wrote:

On Wed, Jan 7, 2009 at 11:12 AM, Mahadev Konar wrote:


You are right Pat. Replaying an async operation would involve a lot of state
management for clients across servers and would involve a lot more work in
determining which operation succeeded and the one which needs to be re run
and the semantics of zookeeper client calls would be much harder to
guarantee.



OK. so maybe a sane middle ground would be to put a warning in the code
that an async operation might never return.
I think a generalization about ZK right now (at least based on my current
perspective) is that it makes it too easy to run with scissors.

ZK probably will work fine with all servers in an ensemble connected but if
one goes away you need to be VERY careful about how you code your app.

Kevin



Re: Simpler ZooKeeper event interface....

2009-01-07 Thread Vinod Johnson

Mahadev Konar wrote:

Hi Vinod,
 I think what Ben meant was this--

The client will never know of a session expiration until and unless it's
connected to one of the servers. So the leader cannot demote itself since
it's connected to one of the servers. It might have lost its session (which
all the others except itself would have realized) but will have to wait to
demote itself until it connects to one of the servers.

  
I guess then I don't follow the leader election recipe. Is the following 
scenario possible in the leader election recipe:

1) Leader L is partitioned from the ensemble.
2) ZK servers expire its session.
3) Some other follower F now becomes a leader.
4) L and F form a split brain?

I had wrongly assumed that the session was like a lease in that it 
allowed the client and server to independently know that the session had 
expired by the use of the global clock. Wouldn't it make sense for the 
client lib to expire its local session handle and never reuse it?

mahadev


On 1/7/09 10:02 AM, "Vinod Johnson"  wrote:

  

Benjamin Reed wrote:


You don't demote yourself on disconnect. (Everyone else may still believe you
are the leader.) Check out the "Things to Remember about Watches" section in
the programmer's guide.

When you are disconnected from ZK you don't know what is happening, so you
have to act conservatively. Your session may or may not have expired. You
will not know for sure until you reconnect to ZK.
  
  

Just to make sure I'm not misunderstanding the last bit, even without
reconnecting to ZK, the leader's session could expire at the client
side, correct? In that case the conservative thing for the leader to do
is to demote itself if the intent is to avoid split brain (even though
the session may still be active at ZK for some period of time after this).



  




Re: Simpler ZooKeeper event interface....

2009-01-07 Thread Kevin Burton
On Wed, Jan 7, 2009 at 11:12 AM, Mahadev Konar wrote:

> You are right Pat. Replaying an async operation would involve a lot of state
> management for clients across servers and would involve a lot more work in
> determining which operation succeeded and the one which needs to be re run
> and the semantics of zookeeper client calls would be much harder to
> guarantee.
>

OK. so maybe a sane middle ground would be to put a warning in the code
that an async operation might never return.
I think a generalization about ZK right now (at least based on my current
perspective) is that it makes it too easy to run with scissors.

ZK probably will work fine with all servers in an ensemble connected but if
one goes away you need to be VERY careful about how you code your app.

Kevin

-- 
Founder/CEO Spinn3r.com
Location: San Francisco, CA
AIM/YIM: sfburtonator
Skype: burtonator
Work: http://spinn3r.com


Re: Simpler ZooKeeper event interface....

2009-01-07 Thread Kevin Burton
>
> My assumption is the following - if I'm connected to multiple ZK servers,
> then assuming the leader election recipe given in the docs, the leader needs
> to only demote itself if its session expires. Of course, doing the same on
> disconnect does not violate safety, it just seems too pessimistic. In the
> case of followers, a disconnected event would mean that they will have to
> wait for reconnection before being able to determine if one of them should
> in fact be the leader.
>

That was my fault; it's too pessimistic.
A 1ms disconnect doesn't mean you should not be the leader, especially if
becoming a leader is very expensive in terms of computational resources.

Kevin

-- 
Founder/CEO Spinn3r.com
Location: San Francisco, CA
AIM/YIM: sfburtonator
Skype: burtonator
Work: http://spinn3r.com


Re: Simpler ZooKeeper event interface....

2009-01-07 Thread Mahadev Konar
Hi Vinod,
 I think what Ben meant was this--

The client will never know of a session expiration until and unless it's
connected to one of the servers. So the leader cannot demote itself since
it's connected to one of the servers. It might have lost its session (which
all the others except itself would have realized) but will have to wait to
demote itself until it connects to one of the servers.

mahadev


On 1/7/09 10:02 AM, "Vinod Johnson"  wrote:

> Benjamin Reed wrote:
>> You don't demote yourself on disconnect. (Everyone else may still believe you
>> are the leader.) Check out the "Things to Remember about Watches" section in
>> the programmer's guide.
>> 
>> When you are disconnected from ZK you don't know what is happening, so you
>> have to act conservatively. Your session may or may not have expired. You
>> will not know for sure until you reconnect to ZK.
>>   
> Just to make sure I'm not misunderstanding the last bit, even without
> reconnecting to ZK, the leader's session could expire at the client
> side, correct? In that case the conservative thing for the leader to do
> is to demote itself if the intent is to avoid split brain (even though
> the session may still be active at ZK for some period of time after this).



Re: Simpler ZooKeeper event interface....

2009-01-07 Thread Mahadev Konar
You are right Pat. Replaying an async operation would involve a lot of state
management for clients across servers and would involve a lot more work in
determining which operation succeeded and the one which needs to be re run
and the semantics of zookeeper client calls would be much harder to
guarantee.

mahadev


On 1/7/09 10:33 AM, "Patrick Hunt"  wrote:

> Kevin Burton wrote:
>>> 3) it's possible for your code to get notified of a change, but never
>>> process the change. This might happen if:
>>>  a) a node changed watch fires
>>>  b) your client code runs an async getData
>>>  c) you are disconnected from the server
>>> 
>> 
>> Also, this seems very confusing...
>> 
>> If I run an async request, the client should replay these if I'm reconnected
>> to another host.
> 
> (Ben/Flavio/Mahadev can correct me if I'm wrong here or missed some detail)
> 
> Async operations are tricky as the server makes the change when it gets
> the request, not when the client processes the response. So you could
> request an async operation, which the server could process and respond
> to the client, immed. after which the client is disconnected from the
> server (before it can process the response). Client replay would not
> work in this case, and given that async is typically used for high
> throughput situations there could be a number of operations affected.
> 
> Patrick



Re: Simpler ZooKeeper event interface....

2009-01-07 Thread Patrick Hunt

Kevin Burton wrote:

3) it's possible for your code to get notified of a change, but never
process the change. This might happen if:
 a) a node changed watch fires
 b) your client code runs an async getData
 c) you are disconnected from the server



Also, this seems very confusing...

If I run an async request, the client should replay these if I'm reconnected
to another host.


(Ben/Flavio/Mahadev can correct me if I'm wrong here or missed some detail)

Async operations are tricky as the server makes the change when it gets 
the request, not when the client processes the response. So you could 
request an async operation, which the server could process and respond 
to the client, immed. after which the client is disconnected from the 
server (before it can process the response). Client replay would not 
work in this case, and given that async is typically used for high 
throughput situations there could be a number of operations affected.


Patrick
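
The failure mode described above can be illustrated with a small in-memory stand-in for a server. Everything here (`FakeServer`, `Result`, `ReplayDemo`) is hypothetical and is not the ZooKeeper API; the sketch only shows why a client that blindly replays a request after a connection loss can end up applying the change twice.

```java
import java.util.function.Consumer;

// Hypothetical in-memory stand-in for a server; not the ZooKeeper API.
enum Result { OK, CONNECTION_LOSS }

final class FakeServer {
    private int counter = 0;
    private boolean dropNextResponse = false;

    // Simulate a disconnect that happens after the server applied the request
    // but before the client could process the response.
    void dropNextResponse() {
        dropNextResponse = true;
    }

    // "Async" increment: the change is applied when the request arrives,
    // whether or not the client ever sees a successful response.
    void incrementAsync(Consumer<Result> callback) {
        counter++;
        if (dropNextResponse) {
            dropNextResponse = false;
            callback.accept(Result.CONNECTION_LOSS); // outcome unknown to the client
        } else {
            callback.accept(Result.OK);
        }
    }

    int read() {
        return counter;
    }
}

final class ReplayDemo {
    // Blindly re-issues the request on connection loss; returns the final counter.
    static int incrementWithBlindReplay(FakeServer server) {
        server.incrementAsync(result -> {
            if (result == Result.CONNECTION_LOSS) {
                server.incrementAsync(r -> { }); // replay: may apply the change twice
            }
        });
        return server.read();
    }
}
```

With a dropped response the caller intended one increment but the counter ends at 2. A safer pattern, under the same assumptions, is to treat a connection loss as "unknown" and re-read the state after reconnecting before deciding whether to retry.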


Re: Simpler ZooKeeper event interface....

2009-01-07 Thread Vinod Johnson

Benjamin Reed wrote:

You don't demote yourself on disconnect. (Everyone else may still believe you are the 
leader.) Check out the "Things to Remember about Watches" section in the 
programmer's guide.

When you are disconnected from ZK you don't know what is happening, so you have 
to act conservatively. Your session may or may not have expired. You will not 
know for sure until you reconnect to ZK.
  
Just to make sure I'm not misunderstanding the last bit, even without 
reconnecting to ZK, the leader's session could expire at the client 
side, correct? In that case the conservative thing for the leader to do 
is to demote itself if the intent is to avoid split brain (even though 
the session may still be active at ZK for some period of time after this).


Re: Simpler ZooKeeper event interface....

2009-01-07 Thread Patrick Hunt
Btw, not sure if you've looked at this but your code is similar to one 
of the examples in the docs:


http://hadoop.apache.org/zookeeper/docs/r3.0.1/javaExample.html

Notice that the listener is ignoring anything other than session expiration.

Patrick

Patrick Hunt wrote:

Take a look at the leader election recipe here:
http://hadoop.apache.org/zookeeper/docs/r3.0.1/recipes.html#sc_leaderElection 



Also a very simple version of leader election is detailed in the preso:
http://wiki.apache.org/hadoop/ZooKeeper/ZooKeeperPresentations

If the leader is disconnected it can't give up leadership - as it's not 
connected to the cluster it can't make changes to the znodes. That's why 
the recipe uses ephemeral nodes, so that if the cluster _expires_ the 
leader's session (doesn't hear from the client w/in the timeout specified 
by that particular client during session establishment, say the network 
btw the leader/cluster fails) the leader znode will be removed, the 
followers notified, and a new leader elected.


If the leader is disconnected for a short period, then reconnected, this 
may be a non issue from an application (your app) perspective, or it may 
be something that is important to the application as a whole, it's up to 
each implementation whether this is important or just ignored.


Patrick

Hiram Chirino wrote:

Knowing about a disconnection may be important to some apps.  For
example if an app uses ZK for leader election, and the leader gets
disconnected from ZK, he should give up being the leader, since a
different leader may get elected while he is disconnected from ZK.

On Tue, Jan 6, 2009 at 11:58 PM, Kevin Burton  wrote:

3.0.1.
my watches get recreated on the new server but I'm still too aware of
connections.

In fact, shouldn't disconnect be removed entirely?  Or is this just advice
telling the client that something bad might have happened?

Kevin

On Tue, Jan 6, 2009 at 7:12 PM, Mahadev Konar  wrote:



http://issues.apache.org/jira/browse/ZOOKEEPER-23

This has been fixed in zookeeper-3.0 release. Are you using a release from
sourceforge?


mahadev


On 1/6/09 4:57 PM, "Kevin Burton"  wrote:

This could be simplified if the semantics for reconnect were simplified.
Is there any reason why I should know about a disconnect if ZK is just going
to reconnect me to another server in 1ms?

Why not hide *all* of this from the user and have the client re-issue
watches on reconnect and hold off on throwing exceptions if the server
returns?

This would allow the user to just handle three conditions... total ensemble
failure, no ACL permission, or no node existing (or vice-versa).

Kevin



If I run an async request, the client should replay these if I'm
reconnected to another host.

--

Founder/CEO Spinn3r.com
Location: San Francisco, CA
AIM/YIM: sfburtonator
Skype: burtonator
Work: http://spinn3r.com




--
Founder/CEO Spinn3r.com
Location: San Francisco, CA
AIM/YIM: sfburtonator
Skype: burtonator
Work: http://spinn3r.com







Re: Simpler ZooKeeper event interface....

2009-01-07 Thread Patrick Hunt

Take a look at the leader election recipe here:
http://hadoop.apache.org/zookeeper/docs/r3.0.1/recipes.html#sc_leaderElection

Also a very simple version of leader election is detailed in the preso:
http://wiki.apache.org/hadoop/ZooKeeper/ZooKeeperPresentations

If the leader is disconnected it can't give up leadership - as it's not 
connected to the cluster it can't make changes to the znodes. That's why 
the recipe uses ephemeral nodes, so that if the cluster _expires_ the 
leader's session (doesn't hear from the client w/in the timeout specified 
by that particular client during session establishment, say the network 
btw the leader/cluster fails) the leader znode will be removed, the 
followers notified, and a new leader elected.
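
A toy model of that behaviour, assuming the variant of the recipe in which each candidate creates an ephemeral, sequential znode and the lowest sequence number leads. `ElectionModel` is a hypothetical in-memory class for illustration only and does not use the ZooKeeper API.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory model of the election znodes; not the ZooKeeper API.
final class ElectionModel {
    // sequence number -> owning session id; TreeMap keeps the lowest sequence first
    private final TreeMap<Long, String> candidates = new TreeMap<>();
    private long nextSequence = 0;

    // A client joins by creating an ephemeral, sequential node tied to its session.
    long join(String sessionId) {
        long seq = nextSequence++;
        candidates.put(seq, sessionId);
        return seq;
    }

    // When the ensemble expires a session, all of its ephemeral nodes disappear.
    void expireSession(String sessionId) {
        candidates.values().removeIf(owner -> owner.equals(sessionId));
    }

    // The owner of the lowest remaining sequence number is the leader (null if none).
    String leader() {
        Map.Entry<Long, String> first = candidates.firstEntry();
        return first == null ? null : first.getValue();
    }
}
```

Note what the model leaves out: the expired leader is not told anything here. As discussed elsewhere in the thread, it only finds out that it has been replaced once it reconnects to the ensemble.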


If the leader is disconnected for a short period, then reconnected, this 
may be a non issue from an application (your app) perspective, or it may 
be something that is important to the application as a whole, it's up to 
each implementation whether this is important or just ignored.


Patrick

Hiram Chirino wrote:

Knowing about a disconnection may be important to some apps.  For
example if an app uses ZK for leader election, and the leader gets
disconnected from ZK, he should give up being the leader, since a
different leader may get elected while he is disconnected from ZK.

On Tue, Jan 6, 2009 at 11:58 PM, Kevin Burton  wrote:

3.0.1.
my watches get recreated on the new server but I'm still too aware of
connections.

In fact, shouldn't disconnect be removed entirely?  Or is this just advice
telling the client that something bad might have happened?

Kevin

On Tue, Jan 6, 2009 at 7:12 PM, Mahadev Konar  wrote:


http://issues.apache.org/jira/browse/ZOOKEEPER-23

This has been fixed in zookeeper-3.0 release. Are you using a release from
sourceforge?


mahadev


On 1/6/09 4:57 PM, "Kevin Burton"  wrote:


This could be simplified if the semantics for reconnect were simplified.
Is there any reason why I should know about a disconnect if ZK is just going
to reconnect me to another server in 1ms?

Why not hide *all* of this from the user and have the client re-issue
watches on reconnect and hold off on throwing exceptions if the server
returns?

This would allow the user to just handle three conditions... total ensemble
failure, no ACL permission, or no node existing (or vice-versa).

Kevin



If I run an async request, the client should replay these if I'm
reconnected to another host.

--

Founder/CEO Spinn3r.com
Location: San Francisco, CA
AIM/YIM: sfburtonator
Skype: burtonator
Work: http://spinn3r.com




--
Founder/CEO Spinn3r.com
Location: San Francisco, CA
AIM/YIM: sfburtonator
Skype: burtonator
Work: http://spinn3r.com







RE: Simpler ZooKeeper event interface....

2009-01-07 Thread Benjamin Reed
You don't demote yourself on disconnect. (Everyone else may still believe you 
are the leader.) Check out the "Things to Remember about Watches" section in 
the programmer's guide.

When you are disconnected from ZK you don't know what is happening, so you have 
to act conservatively. Your session may or may not have expired. You will not 
know for sure until you reconnect to ZK.

ben

From: thomas.john...@sun.com [thomas.john...@sun.com]
Sent: Wednesday, January 07, 2009 8:24 AM
To: zookeeper-user@hadoop.apache.org
Subject: Re: Simpler ZooKeeper event interface

Hiram Chirino wrote:
> Knowing about a disconnection may be important to some apps.  For
> example if an app uses ZK for leader election, and the leader gets
> disconnected from ZK, he should give up being the leader, since a
> different leader may get elected while he is disconnected from ZK.
>
>
Is that necessarily true?

My assumption is the following - if I'm connected to multiple ZK
servers, then assuming the leader election recipe given in the docs, the
leader needs to only demote itself if its session expires. Of course,
doing the same on disconnect does not violate safety, it just seems too
pessimistic. In the case of followers, a disconnected event would mean
that they will have to wait for reconnection before being able to
determine if one of them should in fact be the leader.


Re: Simpler ZooKeeper event interface....

2009-01-07 Thread Vinod Johnson

Hiram Chirino wrote:

Knowing about a disconnection may be important to some apps.  For
example if an app uses ZK for leader election, and the leader gets
disconnected from ZK, he should give up being the leader, since a
different leader may get elected while he is disconnected from ZK.

  

Is that necessarily true?

My assumption is the following - if I'm connected to multiple ZK 
servers, then assuming the leader election recipe given in the docs, the 
leader needs to only demote itself if its session expires. Of course, 
doing the same on disconnect does not violate safety, it just seems too 
pessimistic. In the case of followers, a disconnected event would mean 
that they will have to wait for reconnection before being able to 
determine if one of them should in fact be the leader.


Re: Simpler ZooKeeper event interface....

2009-01-07 Thread Hiram Chirino
Knowing about a disconnection may be important to some apps.  For
example if an app uses ZK for leader election, and the leader gets
disconnected from ZK, he should give up being the leader, since a
different leader may get elected while he is disconnected from ZK.

On Tue, Jan 6, 2009 at 11:58 PM, Kevin Burton  wrote:
> 3.0.1.
> my watches get recreated on the new server but I'm still too aware of
> connections.
>
> In fact, shouldn't disconnect be removed entirely?  Or is this just advice
> telling the client that something bad might have happened?
>
> Kevin
>
> On Tue, Jan 6, 2009 at 7:12 PM, Mahadev Konar  wrote:
>
>> http://issues.apache.org/jira/browse/ZOOKEEPER-23
>>
>> This has been fixed in zookeeper-3.0 release. Are you using a release from
>> sourceforge?
>>
>>
>> mahadev
>>
>>
>> On 1/6/09 4:57 PM, "Kevin Burton"  wrote:
>>
>> > This could be simplified if the semantics for reconnect were simplified.
>> > Is there any reason why I should know about a disconnect if ZK is just
>> going
>> > to reconnect me to another server in 1ms?
>> >
>> > Why not hide *all* of this form the user and have the client re-issue
>> > watches on reconnect and hold off on throwing exceptions if the server
>> > returns.
>> >
>> > This would allow the user to just handle three conditions... total
>> ensemble
>> > failure, no ACL permission, or no node existing (of vice-versa).
>> >
>> > Kevin
>> >
>> >
>> >> If I run an async request, the client should replay these if I'm
>> >> reconnected to another host.
>> >>
>> >> --
>> > Founder/CEO Spinn3r.com
>> > Location: San Francisco, CA
>> > AIM/YIM: sfburtonator
>> > Skype: burtonator
>> > Work: http://spinn3r.com
>>
>>
>
>
> --
> Founder/CEO Spinn3r.com
> Location: San Francisco, CA
> AIM/YIM: sfburtonator
> Skype: burtonator
> Work: http://spinn3r.com
>



-- 
Regards,
Hiram

Blog: http://hiramchirino.com

Open Source SOA
http://open.iona.com


Re: Simpler ZooKeeper event interface....

2009-01-06 Thread Kevin Burton
3.0.1.
my watches get recreated on the new server but I'm still too aware of
connections.

In fact, shouldn't disconnect be removed entirely?  Or is this just advice
telling the client that something bad might have happened?

Kevin

On Tue, Jan 6, 2009 at 7:12 PM, Mahadev Konar  wrote:

> http://issues.apache.org/jira/browse/ZOOKEEPER-23
>
> This has been fixed in zookeeper-3.0 release. Are you using a release from
> sourceforge?
>
>
> mahadev
>
>
> On 1/6/09 4:57 PM, "Kevin Burton"  wrote:
>
> > This could be simplified if the semantics for reconnect were simplified.
> > Is there any reason why I should know about a disconnect if ZK is just
> going
> > to reconnect me to another server in 1ms?
> >
> > Why not hide *all* of this form the user and have the client re-issue
> > watches on reconnect and hold off on throwing exceptions if the server
> > returns.
> >
> > This would allow the user to just handle three conditions... total
> ensemble
> > failure, no ACL permission, or no node existing (of vice-versa).
> >
> > Kevin
> >
> >
> >> If I run an async request, the client should replay these if I'm
> >> reconnected to another host.
> >>
> >> --
> > Founder/CEO Spinn3r.com
> > Location: San Francisco, CA
> > AIM/YIM: sfburtonator
> > Skype: burtonator
> > Work: http://spinn3r.com
>
>


-- 
Founder/CEO Spinn3r.com
Location: San Francisco, CA
AIM/YIM: sfburtonator
Skype: burtonator
Work: http://spinn3r.com


Re: Simpler ZooKeeper event interface....

2009-01-06 Thread Mahadev Konar
Does javadoc help?  :)

Mahadev


On 1/6/09 4:10 PM, "Kevin Burton"  wrote:

>> 
>> 
>>  zk.getData( event.getPath(), true, this, null );
>> 
>>> 
>>> 
> Also, why not rename this getDataAsync  I can't tell the difference just
> by looking at the method and the different number of arguments.
> Should make things a bit more straight forward.
> 
> Kevin



Re: Simpler ZooKeeper event interface....

2009-01-06 Thread Mahadev Konar
http://issues.apache.org/jira/browse/ZOOKEEPER-23

This has been fixed in zookeeper-3.0 release. Are you using a release from
sourceforge?


mahadev


On 1/6/09 4:57 PM, "Kevin Burton"  wrote:

> This could be simplified if the semantics for reconnect were simplified.
> Is there any reason why I should know about a disconnect if ZK is just going
> to reconnect me to another server in 1ms?
> 
> Why not hide *all* of this form the user and have the client re-issue
> watches on reconnect and hold off on throwing exceptions if the server
> returns.
> 
> This would allow the user to just handle three conditions... total ensemble
> failure, no ACL permission, or no node existing (of vice-versa).
> 
> Kevin
> 
> 
>> If I run an async request, the client should replay these if I'm
>> reconnected to another host.
>> 
>> --
> Founder/CEO Spinn3r.com
> Location: San Francisco, CA
> AIM/YIM: sfburtonator
> Skype: burtonator
> Work: http://spinn3r.com



Re: Simpler ZooKeeper event interface....

2009-01-06 Thread Kevin Burton
This could be simplified if the semantics for reconnect were simplified.
Is there any reason why I should know about a disconnect if ZK is just going
to reconnect me to another server in 1ms?

Why not hide *all* of this from the user and have the client re-issue
watches on reconnect and hold off on throwing exceptions if the server
returns?

This would allow the user to just handle three conditions... total ensemble
failure, no ACL permission, or no node existing (or vice-versa).

Kevin


> If I run an async request, the client should replay these if I'm
> reconnected to another host.
>
> --
Founder/CEO Spinn3r.com
Location: San Francisco, CA
AIM/YIM: sfburtonator
Skype: burtonator
Work: http://spinn3r.com


Re: Simpler ZooKeeper event interface....

2009-01-06 Thread Kevin Burton
>
> 3) it's possible for your code to get notified of a change, but never
> process the change. This might happen if:
>  a) a node changed watch fires
>  b) your client code runs an async getData
>  c) you are disconnected from the server
>

Also, this seems very confusing...

If I run an async request, the client should replay these if I'm reconnected
to another host.

Kevin

-- 
Founder/CEO Spinn3r.com
Location: San Francisco, CA
AIM/YIM: sfburtonator
Skype: burtonator
Work: http://spinn3r.com


Re: Simpler ZooKeeper event interface....

2009-01-06 Thread Kevin Burton
>
>
>  zk.getData( event.getPath(), true, this, null );
>
>>
>>
Also, why not rename this getDataAsync?  I can't tell the difference just
by looking at the method and the different number of arguments.
Should make things a bit more straightforward.

Kevin

-- 
Founder/CEO Spinn3r.com
Location: San Francisco, CA
AIM/YIM: sfburtonator
Skype: burtonator
Work: http://spinn3r.com


Re: Simpler ZooKeeper event interface....

2009-01-06 Thread Kevin Burton
> 1) you are ignoring the result codes in the callbacks, this could get you
> into trouble (say you do a getData on a node that has been deleted ie
> someone changes then immed. deletes the node)
>

Actually I think I removed that FIXME.

I'll try to fix this now..

Another issue was that the ACL could change.

What happens when you have a watch and the ACL changes and it's updated?

The question is how do I handle this.  Do I get an ACL change exception or
just start returning onData when the node is available again?

This is another reason I wanted to post this, as it might be a good contrib
that other people could use.



> 2) I'm confused by one of your comments, you mention:
>  //read the current value.  NOTE that this could easily be a blocking
>  //read here but we might as well have one code path.
>  zk.getData( event.getPath(), true, this, null );
>
> however you are using an async API call, what blocking are you referring
> to? If you are ok w/blocking use the synchronous API, your code would be
> simpler (no callbacks!)


Oh that was my point. I'm already using callbacks so I can reduce code
this way.  :)


> 3) it's possible for your code to get notified of a change, but never
> process the change. This might happen if:
>  a) a node changed watch fires
>  b) your client code runs an async getData
>  c) you are disconnected from the server
>

Ah interesting.  So the async getData would never return.  Shouldn't the
client try to reconnect to another server and re-read?
Easy enough to fix this I think.

Kevin

-- 
Founder/CEO Spinn3r.com
Location: San Francisco, CA
AIM/YIM: sfburtonator
Skype: burtonator
Work: http://spinn3r.com


Re: Simpler ZooKeeper event interface....

2009-01-06 Thread Patrick Hunt
I should have been more clear on 3c - in this case you will get notified 
in the callback of CONNECTIONLOSS for any pending async requests, but as 
you are ignoring rc it may cause problems.
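
As a rough illustration of checking rc, here is a sketch of a callback that
branches on the result instead of assuming success. Result and
DataResultHandler are hypothetical stand-ins declared locally. The real
callback receives an integer rc, and the three cases correspond to success,
connection loss, and a missing node.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Stand-in result codes (hypothetical); the real callback is handed an int rc.
enum Result { OK, CONNECTION_LOSS, NO_NODE }

// Hypothetical handler for the outcome of an async getData.
final class DataResultHandler {
    final List<String> delivered = new ArrayList<>(); // values passed on to the application
    final List<String> retry = new ArrayList<>();     // paths to read again after reconnect
    final List<String> gone = new ArrayList<>();      // paths that no longer exist

    void processResult(Result rc, String path, byte[] data) {
        switch (rc) {
            case OK:
                // A node may carry no data at all, so guard against null.
                delivered.add(data == null ? "" : new String(data, StandardCharsets.UTF_8));
                break;
            case CONNECTION_LOSS:
                // The request was pending when the connection dropped; the
                // change notification would be lost unless the read is reissued.
                retry.add(path);
                break;
            case NO_NODE:
                // The node was deleted between the watch firing and the read.
                gone.add(path);
                break;
        }
    }
}
```

With this shape, case 3c ends up in the retry list rather than being
silently dropped, and the changed-then-deleted case from point 1 ends up in
gone.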


Patrick

Patrick Hunt wrote:

Hi Kevin, a couple of issues I noticed while looking at the pastebin:

1) you are ignoring the result codes in the callbacks, this could get 
you into trouble (say you do a getData on a node that has been deleted, 
i.e. someone changes and then immediately deletes the node)


2) I'm confused by one of your comments, you mention:
  //read the current value.  NOTE that this could easily be a blocking
  //read here but we might as well have one code path.
  zk.getData( event.getPath(), true, this, null );

however you are using an async API call, what blocking are you referring 
to? If you are ok w/blocking use the synchronous API, your code would be 
simpler (no callbacks!)


3) it's possible for your code to get notified of a change, but never 
process the change. This might happen if:

 a) a node changed watch fires
 b) your client code runs an async getData
 c) you are disconnected from the server

Patrick

Kevin Burton wrote:

Hey guys.

I think I'm finally in the position to push ZK into production for a while
to test it out.

My biggest feedback (other than the small bugs I found) was that the API
could be a bit simpler.

I codified my thoughts here:

http://pastebin.com/f2ecea8c7
http://pastebin.com/f62a01e9

Basically, I was thinking that one could receive an onData event to receive
the initial value.

Then all future events would call onData.

I was thinking that an onExists() method might also be nice.

The current API could be made cleaner with:

 - one or two standalone Listener interfaces with onFoo methods for each
event type.
 - the processResults() method is the same for each interface right now
which is somewhat confusing.  Using onFoo is more self documenting.
 - using the main thread by using poll() to wait for events from ZooKeeper.

I use a ConcurrentLinkedQueue in my implementation.

Also, is there a race condition between when the client receives an event
for an update and before it can request a new one?  I was thinking session
local based events would solve this problem (you register your watch once
per session and then get all events until it is unregistered).

I think this can be solved in my code by reading the current version of the
value from the getData() method I call when I register the new watch and
comparing it to the last version I saw. If it was incremented then I would
call onData again.

The problem here is that I might miss two updates (but at least I would
receive the last stable value).

Kevin



Re: Simpler ZooKeeper event interface....

2009-01-06 Thread Patrick Hunt

Hi Kevin, a couple of issues I noticed while looking at the pastebin:

1) you are ignoring the result codes in the callbacks, this could get 
you into trouble (say you do a getData on a node that has been deleted, 
i.e. someone changes and then immediately deletes the node)


2) I'm confused by one of your comments, you mention:
  //read the current value.  NOTE that this could easily be a blocking
  //read here but we might as well have one code path.
  zk.getData( event.getPath(), true, this, null );

however you are using an async API call, what blocking are you referring 
to? If you are ok w/blocking use the synchronous API, your code would be 
simpler (no callbacks!)


3) it's possible for your code to get notified of a change, but never 
process the change. This might happen if:

 a) a node changed watch fires
 b) your client code runs an async getData
 c) you are disconnected from the server

Patrick

Kevin Burton wrote:

Hey guys.

I think I'm finally in the position to push ZK into production for a while
to test it out.

My biggest feedback (other than the small bugs I found) was that the API
could be a bit simpler.

I codified my thoughts here:

http://pastebin.com/f2ecea8c7
http://pastebin.com/f62a01e9

Basically, I was thinking that one could receive an onData event to receive
the initial value.

Then all future events would call onData.

I was thinking that an onExists() method might also be nice.

The current API could be made cleaner with:

 - one or two standalone Listener interfaces with onFoo methods for each
event type.
 - the processResults() method is the same for each interface right now
which is somewhat confusing.  Using onFoo is more self documenting.
 - using the main thread by using poll() to wait for events from ZooKeeper.
I use a ConcurrentLinkedQueue in my implementation.

Also, is there a race condition between when the client receives an event
for an update and before it can request a new one?  I was thinking session
local based events would solve this problem (you register your watch once
per session and then get all events until it is unregistered).

I think this can be solved in my code by reading the current version of the
value from the getData() method I call when I register the new watch and
comparing it to the last version I saw. If it was incremented then I would
call onData again.

The problem here is that I might miss two updates (but at least I would
receive the last stable value).

Kevin



Simpler ZooKeeper event interface....

2009-01-06 Thread Kevin Burton
Hey guys.

I think I'm finally in the position to push ZK into production for a while
to test it out.

My biggest feedback (other than the small bugs I found) was that the API
could be a bit simpler.

I codified my thoughts here:

http://pastebin.com/f2ecea8c7
http://pastebin.com/f62a01e9

Basically, I was thinking that one could receive an onData event to receive
the initial value.

Then all future events would call onData.

I was thinking that an onExists() method might also be nice.

The current API could be made cleaner with:

 - one or two standalone Listener interfaces with onFoo methods for each
event type.
 - the processResults() method is the same for each interface right now
which is somewhat confusing.  Using onFoo is more self documenting.
 - using the main thread by using poll() to wait for events from ZooKeeper.
I use a ConcurrentLinkedQueue in my implementation.
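
To make the list above concrete, here is a rough sketch of the shape being
proposed. It is not the pastebin code. NodeListener and EventPump are
hypothetical names, and the enqueue methods stand in for whatever the client's
event thread would call. Note that ConcurrentLinkedQueue.poll() does not
block, so this version drains whatever is queued and returns; actually waiting
for events would need a blocking queue instead.

```java
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical listener with one self-documenting method per event type.
interface NodeListener {
    void onData(String path, byte[] data);
    void onExists(String path, boolean exists);
}

// Hypothetical bridge: events are queued by the client's event thread and
// dispatched to the listener on whichever thread calls drain().
final class EventPump {
    private final ConcurrentLinkedQueue<Runnable> pending = new ConcurrentLinkedQueue<>();
    private final NodeListener listener;

    EventPump(NodeListener listener) {
        this.listener = listener;
    }

    // Called from the event thread when new data has been read.
    void enqueueData(String path, byte[] data) {
        pending.add(() -> listener.onData(path, data));
    }

    // Called from the event thread when an existence check completes.
    void enqueueExists(String path, boolean exists) {
        pending.add(() -> listener.onExists(path, exists));
    }

    // Called from the main thread; runs queued events in arrival order and
    // returns how many were dispatched.
    int drain() {
        int dispatched = 0;
        Runnable next;
        while ((next = pending.poll()) != null) {
            next.run();
            dispatched++;
        }
        return dispatched;
    }
}
```

The main loop would call drain() between other work, so listener methods
never run on the client's own event thread.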

Also, is there a race condition between when the client receives an event
for an update and before it can request a new one?  I was thinking session
local based events would solve this problem (you register your watch once
per session and then get all events until it is unregistered).

I think this can be solved in my code by reading the current version of the
value from the getData() method I call when I register the new watch and
comparing it to the last version I saw. If it was incremented then I would
call onData again.

The problem here is that I might miss two updates (but at least I would
receive the last stable value).
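
A small sketch of that version check, assuming the version number comes from
the node's stat returned alongside the data. VersionTracker is a hypothetical
name, and the sketch ignores the case where a node is deleted and re-created,
which would reset its version.

```java
// Hypothetical helper: remembers the newest data version delivered so far.
final class VersionTracker {
    private int lastSeen = -1; // -1 means nothing has been delivered yet

    // Returns true if onData should fire for this version, and records it.
    // Several updates between two reads collapse into a single delivery of
    // the latest value.
    synchronized boolean shouldDeliver(int version) {
        if (version > lastSeen) {
            lastSeen = version;
            return true;
        }
        return false;
    }
}
```

Called with 0, 0, 3, 3 in turn, this returns true, false, true, false: the
jump from 0 to 3 is delivered once, which matches the "might miss two updates
but still see the last value" behaviour described above.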

Kevin

-- 
Founder/CEO Spinn3r.com
Location: San Francisco, CA
AIM/YIM: sfburtonator
Skype: burtonator
Work: http://spinn3r.com