[
https://issues.apache.org/jira/browse/CONNECTORS-1031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14135978#comment-14135978
]
Karl Wright commented on CONNECTORS-1031:
-----------------------------------------
The obtainWriteLock() method in ZooKeeperConnection does the following:
{code}
// Assert that we want a write lock
lockNode = createSequentialChild(lockPath,WRITE_PREFIX);
try
{
  String lockSequenceNumber = lockNode.substring(lockPath.length() + 1 + WRITE_PREFIX.length());
  // See if we got it
  List<String> children = zookeeper.getChildren(lockPath,false);
{code}
... where createSequentialChild() can fail with a
KeeperException.ConnectionLossException. According to this documentation:
http://wiki.apache.org/hadoop/ZooKeeper/FAQ , a connection loss means the
create may or may not have actually succeeded on the server. Unfortunately,
the proper cleanup is:
{code}
zookeeper.delete(lockNode,-1);
{code}
... which requires the lockNode string in order to work! I'm not sure how to
address this. The only possible saving grace is that the lockNode, if it was
actually created, is ephemeral. It's not clear whether a reconnection flushes
all of the client's ephemeral nodes or not. If it does, then our attempt at
cleanup is incorrect, and we should just fall through and retry.
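For what it's worth, the recipe the ZooKeeper FAQ suggests for exactly this situation is to embed a client-generated GUID in the sequential node's name; after a ConnectionLossException you can then list the children and see whether your create actually took effect, recovering the node name without having kept it. A minimal sketch of the recovery step (the {{findOwnedChild}} helper and the {{write-<guid>-}} naming are hypothetical, not ManifoldCF's actual code; only {{WRITE_PREFIX}}-style naming is borrowed from the excerpt above):
{code}
import java.util.List;
import java.util.UUID;

public class RecoverableLock
{
  // Pure helper: scan the children of the lock path for the child that
  // carries our GUID. Returns the child's name if our create succeeded
  // before the connection was lost, or null if it never took effect
  // (in which case a plain retry is safe).
  public static String findOwnedChild(List<String> children, String prefix, String guid)
  {
    String marker = prefix + guid + "-";
    for (String child : children)
    {
      if (child.startsWith(marker))
        return child;
    }
    return null;
  }

  public static void main(String[] args)
  {
    // The GUID would be generated once, before calling createSequentialChild(),
    // and baked into the node name, e.g. "write-<guid>-".
    String guid = UUID.randomUUID().toString();
    // Simulated getChildren() result seen after a connection loss:
    List<String> children = List.of(
      "write-" + guid + "-0000000007",
      "read-other-0000000008");
    System.out.println(findOwnedChild(children, "write-", guid));
  }
}
{code}
With that in place, the catch block for ConnectionLossException could call getChildren() and either adopt the discovered node (and delete it during cleanup) or retry the create when nothing is found.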
> Zookeeper hangs eventually with specified parameters
> ----------------------------------------------------
>
> Key: CONNECTORS-1031
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1031
> Project: ManifoldCF
> Issue Type: Bug
> Components: Framework core
> Affects Versions: ManifoldCF 1.7
> Reporter: Karl Wright
> Assignee: Karl Wright
> Fix For: ManifoldCF 2.0
>
>
> The zookeeper parameters we deliver are missing apparently important limits
> on growth:
> autopurge.snapRetainCount=3: default value
> autopurge.purgeInterval=1: default value
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)