Session expiration caused by time change

2010-08-18 Thread Qing Yan
Hi, the test case is fairly simple. We have a client which connects to ZK, registers an ephemeral node and watches it. Now change the client machine's time and the session is killed. Here is the log: *2010-08-18 04:24:57,782 INFO com.taobao.timetunnel2.cluster.service.AgentService: Host name

Re: Weird ephemeral node issue

2010-08-17 Thread Qing Yan
shouldn't assume that the ephemerals will disappear at the same time as the session expiration event is delivered. On Mon, Aug 16, 2010 at 8:31 PM, Qing Yan qing...@gmail.com wrote: Ouch, is this the current ZK behavior? This is unexpected, if the client gets partitioned from the ZK cluster, he

Re: Weird ephemeral node issue

2010-08-17 Thread Qing Yan
Forgot to mention: the process looks fine, with a normal memory footprint and CPU usage, and it generates the expected results; the only thing missing is the ephemeral node in ZK.

Re: Weird ephemeral node issue

2010-08-17 Thread Qing Yan
is restarting (run zkCli.sh on both machines). I am curious to see if the node that did not receive the SESSION_EXPIRED event still has the znode in its database. Also, can you describe your setup? Can you send the logs and zoo.cfg file? Thanks. -Vishal On Tue, Aug 17, 2010 at 3:31 AM, Qing Yan

Re: Weird ephemeral node issue

2010-08-17 Thread Qing Yan
Thanks for the explanation! I suggest this goes on the wiki. Quote: the client only finds out about session expiration events when the client reconnects to the cluster. if zk tells a client that its session is expired, the ephemerals that correspond to that session will already be cleaned up. -

Re: Weird ephemeral node issue

2010-08-16 Thread Qing Yan
. Very likely you are having GC problems. Turn on verbose GC logging and see what is happening.  You may also want to change the session timeout values. It is very common for the use of ZK to highlight problems that you didn't know that you had. On Sun, Aug 15, 2010 at 8:51 PM, Qing Yan

Re: Weird ephemeral node issue

2010-08-16 Thread Qing Yan
reconnect you won't get the notification. Personally I think the client API should track the session expiration time locally and inform you once it's expired. On Aug 16, 2010 2:09 AM, Qing Yan qing...@gmail.com wrote: Hi Ted, Do you mean GC problem can prevent delivery of SESSION EXPIRE
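The suggestion above, tracking the session expiration time locally, can be sketched roughly as follows. This is only an illustration under stated assumptions: `LocalSessionTracker`, `on_server_contact` and `probably_expired` are hypothetical names, not part of the ZooKeeper client API, and the server remains the authority on whether a session has actually expired. The sketch reads a monotonic clock so that changing the machine's wall-clock time does not affect the estimate.

```python
import time


class LocalSessionTracker:
    """Hypothetical helper, not part of any ZooKeeper client library.

    Keeps a client-side estimate of whether the session timeout has
    elapsed since the last successful contact with a server.
    """

    def __init__(self, session_timeout_s, clock=time.monotonic):
        # time.monotonic is unaffected by changes to the wall-clock time.
        self._timeout = session_timeout_s
        self._clock = clock
        self._last_contact = clock()

    def on_server_contact(self):
        """Call whenever a reply (for example a ping response) arrives."""
        self._last_contact = self._clock()

    def probably_expired(self):
        """True once a full session timeout has passed without contact.

        This is an estimate only: the server decides actual expiration.
        """
        return self._clock() - self._last_contact >= self._timeout
```

An application could poll `probably_expired()` while disconnected and shut itself down without waiting to reconnect, which is the behaviour the message asks for.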

Weird ephemeral node issue

2010-08-15 Thread Qing Yan
We started using ZK in production recently and ran into some problems. The use case is simple: a central monitor checks the ephemeral nodes created by distributed apps, and if a node disappears, the corresponding app will get restarted. Each app will also handle SESSION_EXPIRE by shutting

Re: persistent storage and node recovery

2010-03-16 Thread Qing Yan
I think these two models serve different purposes: ZK emphasizes synchronization (on a small dataset), while DHT is about scaling. They can complement each other nicely, e.g. you can have DHT scattered around to achieve scalability while ZK sits in the core to handle the minimal/necessary synchronization.

Re: how to handle re-add watch fails

2010-02-01 Thread Qing Yan
Take a look at the Lock/ProtocolSupport stuff under the sample code directory. Just build a layer on top of the ZK API that encapsulates the calling details and centralizes the error handling logic. Some of the common logic could be moved to the ZK client library in the future, application won't need to worry about
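The idea of a thin layer that centralizes error handling can be sketched like this. It is a minimal illustration, not the Lock/ProtocolSupport sample code itself: `ConnectionLost` and `retry_operation` are hypothetical names, and the retry count and delay are arbitrary assumptions.

```python
import time


class ConnectionLost(Exception):
    """Hypothetical stand-in for a recoverable connection-loss error."""


def retry_operation(operation, max_attempts=3, delay_s=0.0, sleep=time.sleep):
    """Run `operation`, retrying when it raises ConnectionLost.

    Callers pass a zero-argument callable, so the retry policy lives in
    one place instead of being repeated at every call site.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except ConnectionLost as err:
            last_error = err
            if attempt < max_attempts:
                # Back off a little longer after each failed attempt.
                sleep(delay_s * attempt)
    # All attempts failed: surface the last error to the application.
    raise last_error
```

An application would wrap each call that may fail on a dropped connection (for example, re-adding a watch) in `retry_operation`, and handle the final exception once at the top level.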

Re: Q about ZK internal: how commit is being remembered

2010-01-28 Thread Qing Yan
that message reordering is possible (see all the stuff in that paper about non-deterministically drawing messages from a potentially deliverable set). TCP FIFO channels don't reorder, so provide the extra signalling that ZAB requires. cheers, Henry 2010/1/26 Qing Yan qing...@gmail.com

Re: Using zookeeper to assign a bunch of long-running tasks to nodes (without unhandled tasks and double-handled tasks)

2010-01-25 Thread Qing Yan
I agree, masterless is ideal but it is against KISS somehow. About error handling, does ZK-22 mean disconnection will be eliminated from the API and will be solely handled by the ZK implementation? I am not sure it is such a good idea though. The application layer needs to be notified that communication

ZAB kick Paxos butt?

2010-01-20 Thread Qing Yan
Hello, Anyone familiar with the Paxos protocol here? I was doing some comparison of ZAB vs Paxos... first of all, ZAB's FIFO based protocol is really cool! http://wiki.apache.org/hadoop/ZooKeeper/PaxosRun mentioned the inconsistency case for Paxos (the state change B depends upon A, but A was

Re: ZAB kick Paxos butt?

2010-01-20 Thread Qing Yan
Yeah, actually I have no doubts about the Paxos protocol itself but rather the state machine implementation part (as described in Paxos Made Simple, section 3) where there could be multiple Paxos instances. Shouldn't the Paxos instance execution be serialized in order to make the state machine