[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15580083#comment-15580083
 ] 

Flavio Junqueira commented on ZOOKEEPER-2615:
---------------------------------------------

Thanks for having a look, [~fournc]. I think you're right about the race 
between closing the connection and setting new watches. When we expire the 
session, that race goes away because the {{closeSession} and {{setWatches}} are 
serialized in the request processor pipeline, but not for closing connections.

There is one related point I'm unsure about. When I was looking at the code, I 
noticed that only NIO close invokes {{ZooKeeperServer.removeCnxn}} -> 
{{ZKDatabase.removeCnxn}} -> {{DataTree.removeCnxn}}, which remove the watches. 
I'm wondering what that's the case.

> Zookeeper server holds onto dead/expired session ids in the watch data 
> structures
> ---------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-2615
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2615
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.4.6
>            Reporter: guoping.gp
>
> The same issue (https://issues.apache.org/jira/browse/ZOOKEEPER-1382) still 
> can be found even with zookeeper 3.4.6.
> this issue cause our production zookeeper cluster leak about 1 million 
> watchs, after restart the server one by one, the watch count decrease to only 
> about 40000.
> I can reproduce the issue on my mac,here it is:
> ------------------------------------------------------------------------
> pguodeMacBook-Air:bin pguo$ echo srvr | nc localhost 6181
> Zookeeper version: 3.4.6-1569965, built on 02/20/2014 09:09 GMT
> Latency min/avg/max: 0/1156/128513
> Received: 539
> Sent: 531
> Connections: 1
> Outstanding: 0
> Zxid: 0x100000037
> Mode: follower
> Node count: 5
> ------------------------
> pguodeMacBook-Air:bin pguo$ echo cons | nc localhost 6181
>  
> /127.0.0.1:55759[1](queued=0,recved=5,sent=5,sid=0x157be2732d0000e,lop=PING,est=1476372631116,to=15000,lcxid=0x1,lzxid=0xffffffffffffffff,lresp=1476372646260,llat=8,minlat=0,avglat=6,maxlat=17)
>  /0:0:0:0:0:0:0:1:55767[0](queued=0,recved=1,sent=0)
> ------------------------
> pguodeMacBook-Air:bin pguo$ echo wchp | nc localhost 6181
> /curator_exists_watch
>               0x357be48e4d90007
>               0x357be48e4d90009
>               0x157be2732d0000e
> as above 4-letter's report shows,     0x357be48e4d90007 and 0x357be48e4d90009 
> are leaked after the two sessions expired 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to