[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15307449#comment-15307449
 ] 

Flavio Junqueira commented on ZOOKEEPER-2435:
---------------------------------------------

bq. I wonder why all the zookeeper Clients will receive a sync connected event 
after the ensemble recovers from the leader crash and just some zookeeper 
Clients will receive a sync connected event after the follower crash?

When the leader crashes, the remaining followers will go into leader election 
and will drop the existing connections to clients. The idea there is that 
clients have a chance to go look for another server that is either leading or 
following rather than being stuck with a server that is looking for a leader. 
Consequently, a leader crash affects all clients. It is different when a 
follower crashes, since it only affects the clients that were connected to it. 
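
The rule above can be sketched as a toy model in C. This is not ZooKeeper code: the functions `is_disconnected` and `count_disconnected` and the integer server ids are hypothetical, chosen only to illustrate which clients lose their connection in each case.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model, not ZooKeeper code: returns true if a client connected to
 * `client_server` loses its connection when `crashed_server` goes down.
 * `leader` is the id of the current leader. All names are hypothetical. */
bool is_disconnected(int client_server, int crashed_server, int leader)
{
    if (crashed_server == leader) {
        /* Leader crash: the surviving followers enter leader election and
         * drop their client connections, so every client is affected. */
        return true;
    }
    /* Follower crash: only clients attached to that follower are affected. */
    return client_server == crashed_server;
}

/* Counts how many of the n clients are disconnected by the crash. */
size_t count_disconnected(const int *client_servers, size_t n,
                          int crashed_server, int leader)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (is_disconnected(client_servers[i], crashed_server, leader)) {
            count++;
        }
    }
    return count;
}
```

With server 3 as the leader, a crash of server 3 marks every client as disconnected, while a crash of server 1 marks only the clients connected to server 1.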

> miss event when the leader stop
> -------------------------------
>
>                 Key: ZOOKEEPER-2435
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2435
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: build
>    Affects Versions: 3.4.6
>            Reporter: BourneHan
>            Priority: Minor
>         Attachments: zookeeper---221.out, zookeeper---222.out
>
>
> Hi All, 
> In my project, I use three ZooKeeper servers as an ensemble:
> zk1 as a follower on 192.168.25.221,
> zk2 as a follower on 192.168.25.222,
> zk3 as the leader on 192.168.25.223.
> My two programs, which use ZooKeeper's C client, run on 192.168.25.221 and 
> 192.168.25.222.
> On receiving ZOO_CONNECTED_STATE, each program uses ZooKeeper to obtain a 
> lock as follows:
> 1. Create a ZOO_EPHEMERAL | ZOO_SEQUENCE node under '/Lock/'.
> 2. Call getChildren( ) on the '/Lock/' node.
> 3. If the pathname created in step 1 has the lowest sequence number suffix, 
> the program holds the lock, does its work, and then releases the lock by 
> deleting the node created in step 1.
> 4. Otherwise, the program calls exists() with the watch flag set on the 
> lowest sequence number node.
> 5. If exists() returns false, go to step 2. Otherwise, wait for a 
> notification (ZOO_DELETED_EVENT) for the pathname from the previous step 
> before going to step 2.
> When I stop a follower such as zk1 or zk2, everything is fine: my programs 
> on 192.168.25.221 and 192.168.25.222 do their work in order under the 
> lock's control.
> When I stop the leader, zk3 (after restarting zk1/zk2), my program on 
> 192.168.25.221 acquires the lock and releases it normally. My program on 
> 192.168.25.222 detects the node created by the program on 192.168.25.221, 
> but then keeps waiting and never receives the ZOO_DELETED_EVENT 
> notification.
> Does anyone else see the same problem?
> PS:
> 1. The attachments are the ZooKeeper logs from 192.168.25.221 and 
> 192.168.25.222 when I stopped the leader on 192.168.25.223.
> 2. I actually have more programs using ZooKeeper's C client running on 
> 192.168.25.221, 192.168.25.222 and 192.168.25.223.
> 3. The system clock on 192.168.25.221 is 1 minute and 33 seconds behind 
> 192.168.25.222 and 192.168.25.223, so when I stopped the leader it was 
> 2016-05-28 22:33:34 on 192.168.25.221 and 2016-05-28 22:35:07 on 
> 192.168.25.222.
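
For reference, the decision made in steps 2 to 4 of the quoted lock procedure can be sketched in plain C, independent of the ZooKeeper client library. The helpers `sequence_of`, `lowest_child` and `holds_lock` are hypothetical names, not part of the C client API, and the node names in any usage are made up. The sketch assumes each child name ends in the zero-padded 10-digit counter that ZooKeeper appends to sequential nodes.

```c
#include <stdlib.h>
#include <string.h>

/* Parses the zero-padded 10-digit counter at the end of a sequential node
 * name. Returns -1 if the name is shorter than 10 characters. */
static long sequence_of(const char *name)
{
    size_t len = strlen(name);
    if (len < 10) {
        return -1;
    }
    return strtol(name + len - 10, NULL, 10);
}

/* Hypothetical helper: returns the index of the child with the lowest
 * sequence number, or -1 if no child has a usable suffix. Per the quoted
 * steps, this is either our own node (step 3) or the node to watch with
 * exists() (step 4). */
int lowest_child(const char *const *children, int count)
{
    int best = -1;
    long best_seq = -1;
    for (int i = 0; i < count; i++) {
        long seq = sequence_of(children[i]);
        if (seq < 0) {
            continue; /* skip names without a sequence suffix */
        }
        if (best < 0 || seq < best_seq) {
            best = i;
            best_seq = seq;
        }
    }
    return best;
}

/* Returns 1 if `mine` is the lowest-sequence child (lock held), else 0. */
int holds_lock(const char *const *children, int count, const char *mine)
{
    int lowest = lowest_child(children, count);
    return lowest >= 0 && strcmp(children[lowest], mine) == 0;
}
```

In a real program the `children` array would come from the result of the getChildren call in step 2; here it is simply an array of strings so that the selection logic can be checked on its own.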



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
