[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-3099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16591225#comment-16591225
 ] 

Fangmin Lv commented on ZOOKEEPER-3099:
---------------------------------------

[~jiangjiafu] please change the title to say explicitly that this is about a 
network ACL on the leader, not a leader shutdown. In the shutdown case, the 
follower should detect the connection reset immediately, so it won't wait for 
the sync timeout before starting a new leader election.

In this case, the quorum is unavailable for syncLimit * tickTime (+ leader 
election/activation time), not the session timeout.

In ZK, improper syncLimit, tickTime, and session timeout settings can cause 
undesired behavior. For example, if syncLimit * tickTime is larger than the 
session timeout, a client session can expire just because one server is slow 
due to a full GC or a network issue. You need to make the session timeout 
comfortably larger than syncLimit * tickTime, so that global sessions won't 
expire because of a network partition/ACL like the scenario you mentioned here.
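
As a rough illustration of the check described above, here is a minimal 
sketch. The function names and the sample values are hypothetical, not part 
of ZooKeeper, and the window ignores leader election/activation time.

```python
def sync_window_ms(tick_time_ms: int, sync_limit: int) -> int:
    """Approximate time a follower tolerates a silent leader (syncLimit * tickTime)."""
    return tick_time_ms * sync_limit


def session_outlives_sync_window(tick_time_ms: int, sync_limit: int,
                                 session_timeout_ms: int) -> bool:
    """True if the session timeout is larger than syncLimit * tickTime."""
    return session_timeout_ms > sync_window_ms(tick_time_ms, sync_limit)


# Illustrative values only; substitute your own zoo.cfg and client settings.
tick_time_ms = 2000
sync_limit = 5
print(sync_window_ms(tick_time_ms, sync_limit))                       # 10000
print(session_outlives_sync_window(tick_time_ms, sync_limit, 30000))  # True
print(session_outlives_sync_window(tick_time_ms, sync_limit, 6000))   # False
```

A session timeout only slightly above the window would still be tight once 
election time is added, so some extra margin is probably wise.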

> ZooKeeper cluster is unavailable for session_timeout time when the leader 
> shutdown in a three-node environment.   
> ------------------------------------------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-3099
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3099
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: c client, java client
>    Affects Versions: 3.4.11, 3.5.4, 3.4.12, 3.4.13
>            Reporter: Jiafu Jiang
>            Priority: Major
>
>  
> The default readTimeout of the ZooKeeper client is 2/3 * session_time, and 
> the default connectTimeout is session_time/hostProvider.size(). If the 
> ZooKeeper cluster has 3 nodes, then connectTimeout is 1/3 * session_time.
>  
> Suppose we have three ZooKeeper servers deployed: zk1, zk2, and zk3, and zk3 
> is now the leader. Client c1 is connected to zk2 (a follower). Then we shut 
> down the network of zk3 (the leader); at the same time, client c1 begins to 
> write some data to ZooKeeper. After a (syncLimit * tick) timeout, zk2 will 
> disconnect from the leader and begin a new election, and zk2 becomes the leader.
>  
> The write operation will not succeed because the leader is down. It will 
> take at most readTimeout for c1 to discover the failure, and client c1 
> will then try another ZooKeeper server. Unfortunately, c1 may choose 
> zk3, which is now unreachable, and c1 will then spend connectTimeout finding 
> out that zk3 is unusable. Notice that readTimeout + connectTimeout = 
> session_timeout in my case (three-node cluster).
>  
> Therefore, in this case, the ZooKeeper cluster is unavailable for the full 
> session timeout when only one ZooKeeper server is shut down.
>  
> I have some suggestions:
>  # The HostProvider used by ZooKeeper could be made specifiable by an argument.
>  # readTimeout could also be made configurable in some way.
>  
>  
>  
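
The client timeout arithmetic in the quoted description above can be checked 
with a small sketch. The function name and the 30-second session timeout are 
hypothetical; the formulas are the ones stated in the description.

```python
def client_timeouts_ms(session_timeout_ms: int, num_hosts: int) -> tuple[int, int]:
    """Return (readTimeout, connectTimeout) as described in the issue."""
    read_timeout = session_timeout_ms * 2 // 3        # 2/3 * session timeout
    connect_timeout = session_timeout_ms // num_hosts  # session timeout / host count
    return read_timeout, connect_timeout


# Illustrative three-node cluster with a 30 s session timeout.
read_ms, connect_ms = client_timeouts_ms(30000, 3)
print(read_ms, connect_ms, read_ms + connect_ms)  # 20000 10000 30000
```

With three hosts the two timeouts add up to the whole session timeout, which 
is the worst case the reporter describes.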



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
