[ https://issues.apache.org/jira/browse/CONNECTORS-1123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14250210#comment-14250210 ]

Karl Wright commented on CONNECTORS-1123:
-----------------------------------------

Hi Aeham,

Locks, once used, persist in ZooKeeper until the ZooKeeper system is reset (and 
all persistent nodes purged).  But as I said before, the number of locks *is* 
bounded -- just at a fairly high level.
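To illustrate why the nodes accumulate, here's a minimal sketch of the common
persistent-parent / ephemeral-child locking pattern.  This is an illustration,
not ManifoldCF's actual lock code; the connection string, lock name, and hash
are made up.  Releasing a lock deletes only the ephemeral child, so one
persistent znode is left behind for every distinct lock name ever used:

import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class LockNodeLeakSketch {
  public static void main(String[] args) throws Exception {
    // Connection details are placeholders.
    ZooKeeper zk = new ZooKeeper("localhost:2181", 30000, event -> {});

    // One persistent parent per distinct lock name, flat under the root,
    // matching the observed pattern org.apache.manifoldcf.locks-<conn>:<hash>.
    String lockPath = "/org.apache.manifoldcf.locks-MyConnection:abc123";
    try {
      zk.create(lockPath, new byte[0], ZooDefs.Ids.OPEN_ACL_UNSAFE,
          CreateMode.PERSISTENT);
    } catch (KeeperException.NodeExistsException e) {
      // The parent survives from any earlier use of this lock name.
    }

    // The actual lock holder is an ephemeral sequential child.
    String holder = zk.create(lockPath + "/lock-", new byte[0],
        ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);

    // Releasing deletes only the child; the persistent parent is never
    // removed, so every distinct lock name leaves one znode behind.
    zk.delete(holder, -1);

    zk.close();
  }
}

With a bounded set of lock names this is harmless; the problem arises when
lock names are derived from per-document values like the hashes above.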

But given that ZooKeeper and its tools are not designed to deal with large 
numbers of objects, I'll have to look into making the change I proposed.  There 
are probably other situations in ManifoldCF where the number of locks scales 
with the number of items present -- job caching locks, for instance, of which 
there is one per job.  But I would hope that anything unlikely to generate 
more than (say) 100,000 locks can still be readily handled by ZooKeeper.
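For anyone who wants to measure the current count, here is a rough sketch of
the kind of client-side check Aeham describes below.  The root path "/" and
the 64 MB buffer size are assumptions; getChildren on a parent with millions
of children overflows the default (~4 MB) client buffer, hence the
jute.maxbuffer override:

import org.apache.zookeeper.ZooKeeper;

public class CountLockNodes {
  public static void main(String[] args) throws Exception {
    // Raise the client-side response buffer.  In practice this is usually
    // passed as -Djute.maxbuffer on the JVM command line, since the value
    // is read when the ZooKeeper client classes load.
    System.setProperty("jute.maxbuffer", String.valueOf(64 * 1024 * 1024));

    ZooKeeper zk = new ZooKeeper("localhost:2181", 30000, event -> {});
    long lockNodes = zk.getChildren("/", false).stream()
        .filter(name -> name.startsWith("org.apache.manifoldcf.locks-"))
        .count();
    System.out.println("Lock nodes: " + lockNodes);
    zk.close();
  }
}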




> ZK node leak
> ------------
>
>                 Key: CONNECTORS-1123
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1123
>             Project: ManifoldCF
>          Issue Type: Bug
>          Components: Framework core
>    Affects Versions: ManifoldCF 1.8
>         Environment: 4-node manifold cluster, 3-node ZK ensemble for 
> coordination and configuration management
>            Reporter: Aeham Abushwashi
>
> Looking at the stats of the ZooKeeper cluster, I was struck by the very high 
> node count reported by the ZK stat command: just over 3.84 MILLION nodes. The 
> number keeps rising as long as the manifold nodes are running. Stopping 
> manifold does NOT reduce the number significantly, nor does restarting the ZK 
> ensemble.
> The ZK ensemble was initialised around 20 days ago. Manifold has been running 
> on and off on this cluster since that time.
> The flat nature of the manifold node structure in ZK (at least in the dev_1x 
> branch) makes it difficult to identify node names, but after tweaking the 
> jute.maxbuffer parameter on the client, I was able to get a list of all 
> nodes. There's a huge number of nodes with the name pattern 
> org.apache.manifoldcf.locks-<Output Connection>:<Hash>. 
> I could see this node name pattern used in 
> IncrementalIngester#documentDeleteMultiple and 
> IncrementalIngester#documentRemoveMultiple. However, I'm not expecting any 
> deletions in the tests I've been running recently - perhaps this is part of 
> the duplicate deletion logic which came up in an email thread earlier today? 
> Or maybe there's another code path I missed entirely that creates nodes 
> with names like the above.



