Pierre Yin created ZOOKEEPER-3706:
-------------------------------------

             Summary: ZooKeeper.close() would leak SendThread when the network 
is broken
                 Key: ZOOKEEPER-3706
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3706
             Project: ZooKeeper
          Issue Type: Bug
          Components: java client
    Affects Versions: 3.5.6, 3.4.14, 3.6.0
            Reporter: Pierre Yin


The close method of ZooKeeper may cause the leak of SendThread when the network 
is broken.

When the network is broken, the SendThread of ZooKeeper client falls into the 
continuous reconnecting scenario. But there is an unsafe point which is just at 
the moment before startConnect() during the continuous reconnecting. If 
SendThread.close() in another thread hit the unsafe point, startConnect() would 
sleep some time and force to change state to States.CONNECTING although 
SendThread.close() already set state to States.CLOSED. In this case, the 
SendThread would be never be dead and nobody would change the state again.

In normal case, ZooKeeper.close() would be blocked forever to wait closeSession 
packet is finished until the network broken is recovered. But if user set the 
request timeout, ZooKeeper.close() would break the block waiting within timeout 
and invoke SendThread.close() to change state to CLOSED. That's why 
SendThread.close() can hit the unsafe point.

Set request timeout is a very common practice. 

I propose a patch and send it out later.

Maybe someone can help to review it.

 

Thanks

 

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to