[
https://issues.apache.org/jira/browse/CONNECTORS-1123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14250133#comment-14250133
]
Karl Wright commented on CONNECTORS-1123:
-----------------------------------------
The large number of unique locks created is as expected, because there will be
one per document hash. Lock nodes also must be persistent, because otherwise
there's a race condition where one process is tearing down the parent
(persistent) node while another is starting the lock acquisition procedure
against it. The lock code in use follows pretty much standard ZooKeeper best
practices.
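For reference, here's a minimal sketch of that recipe (the class name and the
/locks chroot are illustrative only, not the actual ManifoldCF lock manager
code): a persistent parent node per lock name, an ephemeral sequential child
per contender, and a watch on the immediate predecessor.

    import java.util.Collections;
    import java.util.List;
    import java.util.concurrent.CountDownLatch;
    import org.apache.zookeeper.CreateMode;
    import org.apache.zookeeper.KeeperException;
    import org.apache.zookeeper.ZooDefs;
    import org.apache.zookeeper.ZooKeeper;

    // Hypothetical sketch of the standard ZooKeeper lock recipe.
    public class DocLockSketch {
      private final ZooKeeper zk;
      private String lockNode;

      public DocLockSketch(ZooKeeper zk) { this.zk = zk; }

      public void acquire(String lockName) throws Exception {
        String parent = "/locks/" + lockName;
        // The parent must be PERSISTENT: if one process tore it down
        // eagerly while another was creating its ephemeral child
        // underneath, we'd hit exactly the race described above.
        try {
          zk.create(parent, new byte[0], ZooDefs.Ids.OPEN_ACL_UNSAFE,
              CreateMode.PERSISTENT);
        } catch (KeeperException.NodeExistsException ignored) {}

        // Ephemeral sequential child: released automatically if we die.
        lockNode = zk.create(parent + "/lock-", new byte[0],
            ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);

        while (true) {
          List<String> children = zk.getChildren(parent, false);
          Collections.sort(children);
          String mine = lockNode.substring(parent.length() + 1);
          int index = children.indexOf(mine);
          if (index == 0)
            return;  // lowest sequence number holds the lock

          // Otherwise wait for the immediate predecessor to go away.
          CountDownLatch latch = new CountDownLatch(1);
          String predecessor = parent + "/" + children.get(index - 1);
          if (zk.exists(predecessor, event -> latch.countDown()) == null)
            continue;  // predecessor already gone; re-check
          latch.await();
        }
      }

      public void release() throws Exception {
        zk.delete(lockNode, -1);  // the persistent parent stays behind
      }
    }

Note that release() deletes only the ephemeral child; the persistent parent
nodes are what accumulate, one per unique lock name.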
The only alternative to the current approach would be to have locks that
applied (unnecessarily) to multiple documents at a time. This could be
achieved if, for example, we computed a hash value for the lock string,
limited that hash to (say) 65536 values, and then created the lock based on
the hash value. Parallelism might be slightly reduced, but the overall number
of locks would be bounded.
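Something along these lines (the method and the node-name prefix are
illustrative only):

    // Hypothetical sketch of the bucketed-lock alternative: map each
    // per-document lock string onto one of 65536 shared lock names, so
    // the total number of lock znodes is bounded no matter how many
    // documents are processed.
    public class LockBucketSketch {
      public static String bucketedLockName(String lockString) {
        // Mask to a non-negative value before taking the modulus.
        int bucket = (lockString.hashCode() & 0x7fffffff) % 65536;
        return "org.apache.manifoldcf.locks-bucket-" + bucket;
      }

      public static void main(String[] args) {
        // Two unrelated document hashes may land in the same bucket.
        System.out.println(bucketedLockName("MyOutput:DEADBEEF"));
        System.out.println(bucketedLockName("MyOutput:CAFEBABE"));
      }
    }

Unrelated documents that hash to the same bucket would contend for the same
lock; that false sharing is the parallelism cost mentioned above.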
Before I go there, though, can you tell me what the *downside* of having one
lock per document is? How is this problematic? What resources are you running
out of?
> ZK node leak
> ------------
>
> Key: CONNECTORS-1123
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1123
> Project: ManifoldCF
> Issue Type: Bug
> Components: Framework core
> Affects Versions: ManifoldCF 1.8
> Environment: 4-node manifold cluster, 3-node ZK ensemble for
> coordination and configuration management
> Reporter: Aeham Abushwashi
>
> Looking at the stats of the zookeeper cluster, I was struck by the very high
> node count reported by the ZK stat command, which shows just over 3.84
> MILLION nodes. The number keeps rising as long as the manifold nodes are
> running. Stopping manifold does NOT reduce the number significantly, nor does
> restarting the ZK ensemble.
> The ZK ensemble was initialised around 20 days ago. Manifold has been running
> on and off on this cluster since that time.
> The flat nature of the manifold node structure in ZK (at least in the dev_1x
> branch) makes it difficult to identify node names, but after tweaking the
> jute.maxbuffer parameter on the client, I was able to get a list of all
> nodes. There's a huge number of nodes with the name pattern
> org.apache.manifoldcf.locks-<Output Connection>:<Hash>.
> I could see this node name pattern used in
> IncrementalIngester#documentDeleteMultiple and
> IncrementalIngester#documentRemoveMultiple. However, I'm not expecting any
> deletions in the tests I've been running recently - perhaps this is part of
> the duplicate deletion logic which came up in an email thread earlier today?
> Or maybe there's another code path I missed entirely that creates nodes with
> names like the above.