[jira] [Commented] (ZOOKEEPER-3706) ZooKeeper.close() would leak SendThread when the network is broken

Aishwarya Soni (Jira) Mon, 08 Jun 2020 16:32:00 -0700


    [ https://issues.apache.org/jira/browse/ZOOKEEPER-3706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17128712#comment-17128712 ]


Aishwarya Soni commented on ZOOKEEPER-3706:
-------------------------------------------

[~yfx416] [~hanm] I think I am facing a similar issue in my environment 
(possibly caused by the fix for 3706, or by the same root cause). I have 
created a ticket, https://issues.apache.org/jira/browse/ZOOKEEPER-3828, which 
has the logs and describes the similar behavior. The SendThread of the 
ZooKeeper client stays in the connecting (or rather reconnecting) state until 
I restart the leader node. 
I think we need to take another look at this.

[~symat] Can this be a reason for my issue (3828)?

> ZooKeeper.close() would leak SendThread when the network is broken
> ------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-3706
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3706
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: java client
>    Affects Versions: 3.6.0, 3.4.14, 3.5.6
>            Reporter: Pierre Yin
>            Assignee: Pierre Yin
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.6.1
>
>          Time Spent: 10h
>  Remaining Estimate: 0h
>
> The close() method of ZooKeeper may leak the SendThread when the network is 
> broken.
> When the network is broken, the SendThread of the ZooKeeper client falls into 
> a continuous reconnect loop. There is an unsafe point in that loop, just 
> before startConnect() is called. If SendThread.close(), invoked from another 
> thread, hits that unsafe point, startConnect() sleeps for some time and then 
> forces the state to States.CONNECTING even though SendThread.close() has 
> already set it to States.CLOSED. In this case the SendThread never dies, and 
> nothing changes the state again.
> Normally, ZooKeeper.close() blocks until the closeSession packet is 
> finished, which means it blocks until the network recovers. But if the user 
> has set a request timeout, ZooKeeper.close() stops waiting once the timeout 
> expires and invokes SendThread.close() to change the state to CLOSED. 
> That is why SendThread.close() can hit the unsafe point.
> Setting a request timeout is a very common practice. 
> I will propose a patch and send it out later.
> Maybe someone can help review it.
>  
> Thanks
>  
>  
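
The race described in the quoted issue can be sketched in isolation. The
following is a minimal, single-threaded replay of the problematic
interleaving, not the ZooKeeper client code: the class name, the State enum,
and both helper methods are hypothetical and exist only for illustration.
The guarded variant shows one possible way to keep CLOSED sticky; it is not
necessarily the approach taken in the actual patch.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for the client state machine; not ZooKeeper's classes.
class SendThreadRaceSketch {
    public enum State { CONNECTING, CONNECTED, CLOSED }

    // Unguarded transition: overwrites whatever state is present, so a
    // CLOSED set by a concurrent close() is lost (the behaviour the issue describes).
    public static State unguardedReconnect(AtomicReference<State> state) {
        state.set(State.CONNECTING);
        return state.get();
    }

    // Guarded transition: moves to CONNECTING only if the state is not CLOSED.
    // The check and the write happen as one atomic update.
    public static State guardedReconnect(AtomicReference<State> state) {
        return state.updateAndGet(s -> s == State.CLOSED ? State.CLOSED : State.CONNECTING);
    }

    public static void main(String[] args) {
        // Replay the interleaving in order: close() lands just before the
        // reconnect step, then the reconnect step runs.
        AtomicReference<State> a = new AtomicReference<>(State.CONNECTED);
        a.set(State.CLOSED);
        System.out.println("unguarded: " + unguardedReconnect(a)); // prints "unguarded: CONNECTING"

        AtomicReference<State> b = new AtomicReference<>(State.CONNECTED);
        b.set(State.CLOSED);
        System.out.println("guarded: " + guardedReconnect(b)); // prints "guarded: CLOSED"
    }
}
```

In the unguarded case the loop would keep running because the state is alive
again; in the guarded case the reconnect step observes CLOSED and the thread
can exit.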



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
