Jie Huang created ZOOKEEPER-3774:
------------------------------------

             Summary: Close quorum socket asynchronously on the leader to avoid 
ping being blocked by long socket closing time
                 Key: ZOOKEEPER-3774
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3774
             Project: ZooKeeper
          Issue Type: Sub-task
          Components: server
            Reporter: Jie Huang
             Fix For: 3.7.0


In ZOOKEEPER-3574 we close the quorum sockets on followers asynchronously when 
a leader is partitioned away so the shutdown process will not be stalled by 
long socket closing time and the followers can quickly establish a new quorum 
to serve client requests.

We've found that a long socket closing time can also cause trouble on the 
leader when a follower is partitioned away and the partition is detected by 
PingLaggingDetector. When the ping thread detects a partition, it tries to 
disconnect the follower. If closing the socket takes a long time, the ping 
thread is blocked and no pings are sent to any follower (even the ones still 
connected to the leader), since the ping thread is responsible for sending 
pings to all followers. When followers don't receive pings, they don't send 
ping responses. When the leader doesn't receive ping responses, the sessions 
expire.

To prevent good sessions from expiring, we need to close the socket 
asynchronously on the leader too.
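
As a rough illustration of the idea (not the actual patch), the close can be 
handed to a background thread so the caller, here standing in for the ping 
thread, returns immediately. The class name AsyncCloser, the method closeAsync, 
and the thread name are hypothetical and are not taken from the ZooKeeper code 
base.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical helper for illustration only; not ZooKeeper's implementation.
final class AsyncCloser {
    // A single daemon thread performs the (possibly slow) close() calls,
    // so the calling thread never waits on them.
    private final ExecutorService closer = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "async-socket-closer");
        t.setDaemon(true);
        return t;
    });

    // Schedules close() on the background thread and returns right away.
    void closeAsync(Closeable socket) {
        closer.execute(() -> {
            try {
                socket.close();
            } catch (IOException e) {
                // A failed close is reported but must not affect the caller.
                System.err.println("Error closing socket: " + e);
            }
        });
    }

    // Stops accepting new close requests; already queued closes still run.
    void shutdown() {
        closer.shutdown();
    }
}
```

With a single background thread, one slow close can still delay later closes, 
but it no longer delays the thread that sends pings. java.net.Socket implements 
Closeable, so a quorum socket could be passed to closeAsync directly.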

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
