[jira] [Commented] (ZOOKEEPER-2464) NullPointerException on ContainerManager

2017-02-01 Thread Jens Rantil (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15848218#comment-15848218
 ] 

Jens Rantil commented on ZOOKEEPER-2464:


Hi, we are also seeing this. We have a lot of zNodes building up in production 
as we speak (currently at 3127223). We have a temporary script that can remove 
older znodes, but this is a real big operational risk waiting to explode since 
failover will be very heavy.

What's the next step here? Can we help any way? Is there any date for when a 
release with a fix for this is in place?

> NullPointerException on ContainerManager
> 
>
> Key: ZOOKEEPER-2464
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2464
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.1
>Reporter: Stefano Salmaso
>Assignee: Jordan Zimmerman
> Fix For: 3.5.3, 3.6.0
>
> Attachments: ContainerManagerTest.java, ZOOKEEPER-2464.patch
>
>
> I would like to expose you to a problem that we are experiencing.
> We are using a cluster of 7 zookeeper and we use them to implement a 
> distributed lock using Curator 
> (http://curator.apache.org/curator-recipes/shared-reentrant-lock.html)
> So .. we tried to play with the servers to see if everything worked properly 
> and we stopped and start servers to see that the system worked well
> (like stop 03, stop 05, stop 06, start 05, start 06, start 03)
> We saw a strange behavior.
> The number of znodes grew up without stopping (normally we had 4000 or 5000, 
> we got to 60,000 and then we stopped our application)
> In zookeeeper logs I saw this (on leader only, one every minute)
> 2016-07-04 14:53:50,302 [myid:7] - ERROR 
> [ContainerManagerTask:ContainerManager$1@84] - Error checking containers
> java.lang.NullPointerException
>at 
> org.apache.zookeeper.server.ContainerManager.getCandidates(ContainerManager.java:151)
>at 
> org.apache.zookeeper.server.ContainerManager.checkContainers(ContainerManager.java:111)
>at 
> org.apache.zookeeper.server.ContainerManager$1.run(ContainerManager.java:78)
>at java.util.TimerThread.mainLoop(Timer.java:555)
>at java.util.TimerThread.run(Timer.java:505)
> We have not yet deleted the data ... so the problem can be reproduced on our 
> servers



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (ZOOKEEPER-2464) NullPointerException on ContainerManager

2017-01-26 Thread Mohammad Arshad (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15842312#comment-15842312
 ] 

Mohammad Arshad commented on ZOOKEEPER-2464:


[~randgalt], I created ZOOKEEPER-2680. 
After ZOOKEEPER-2680 fix this issue will get automatically fixed.

> NullPointerException on ContainerManager
> 
>
> Key: ZOOKEEPER-2464
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2464
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.1
>Reporter: Stefano Salmaso
>Assignee: Jordan Zimmerman
> Fix For: 3.5.3, 3.6.0
>
> Attachments: ContainerManagerTest.java, ZOOKEEPER-2464.patch
>
>
> I would like to expose you to a problem that we are experiencing.
> We are using a cluster of 7 zookeeper and we use them to implement a 
> distributed lock using Curator 
> (http://curator.apache.org/curator-recipes/shared-reentrant-lock.html)
> So .. we tried to play with the servers to see if everything worked properly 
> and we stopped and start servers to see that the system worked well
> (like stop 03, stop 05, stop 06, start 05, start 06, start 03)
> We saw a strange behavior.
> The number of znodes grew up without stopping (normally we had 4000 or 5000, 
> we got to 60,000 and then we stopped our application)
> In zookeeeper logs I saw this (on leader only, one every minute)
> 2016-07-04 14:53:50,302 [myid:7] - ERROR 
> [ContainerManagerTask:ContainerManager$1@84] - Error checking containers
> java.lang.NullPointerException
>at 
> org.apache.zookeeper.server.ContainerManager.getCandidates(ContainerManager.java:151)
>at 
> org.apache.zookeeper.server.ContainerManager.checkContainers(ContainerManager.java:111)
>at 
> org.apache.zookeeper.server.ContainerManager$1.run(ContainerManager.java:78)
>at java.util.TimerThread.mainLoop(Timer.java:555)
>at java.util.TimerThread.run(Timer.java:505)
> We have not yet deleted the data ... so the problem can be reproduced on our 
> servers



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-2464) NullPointerException on ContainerManager

2017-01-26 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15841002#comment-15841002
 ] 

Jordan Zimmerman commented on ZOOKEEPER-2464:
-

[~arshad.mohammad] IMO it should be a separate issue. 

> NullPointerException on ContainerManager
> 
>
> Key: ZOOKEEPER-2464
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2464
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.1
>Reporter: Stefano Salmaso
>Assignee: Jordan Zimmerman
> Fix For: 3.5.3, 3.6.0
>
> Attachments: ContainerManagerTest.java, ZOOKEEPER-2464.patch
>
>
> I would like to expose you to a problem that we are experiencing.
> We are using a cluster of 7 zookeeper and we use them to implement a 
> distributed lock using Curator 
> (http://curator.apache.org/curator-recipes/shared-reentrant-lock.html)
> So .. we tried to play with the servers to see if everything worked properly 
> and we stopped and start servers to see that the system worked well
> (like stop 03, stop 05, stop 06, start 05, start 06, start 03)
> We saw a strange behavior.
> The number of znodes grew up without stopping (normally we had 4000 or 5000, 
> we got to 60,000 and then we stopped our application)
> In zookeeeper logs I saw this (on leader only, one every minute)
> 2016-07-04 14:53:50,302 [myid:7] - ERROR 
> [ContainerManagerTask:ContainerManager$1@84] - Error checking containers
> java.lang.NullPointerException
>at 
> org.apache.zookeeper.server.ContainerManager.getCandidates(ContainerManager.java:151)
>at 
> org.apache.zookeeper.server.ContainerManager.checkContainers(ContainerManager.java:111)
>at 
> org.apache.zookeeper.server.ContainerManager$1.run(ContainerManager.java:78)
>at java.util.TimerThread.mainLoop(Timer.java:555)
>at java.util.TimerThread.run(Timer.java:505)
> We have not yet deleted the data ... so the problem can be reproduced on our 
> servers



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-2464) NullPointerException on ContainerManager

2017-01-26 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15841000#comment-15841000
 ] 

Jordan Zimmerman commented on ZOOKEEPER-2464:
-

[~eribeiro] - I think a 1 line change is too much for a test

> NullPointerException on ContainerManager
> 
>
> Key: ZOOKEEPER-2464
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2464
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.1
>Reporter: Stefano Salmaso
>Assignee: Jordan Zimmerman
> Fix For: 3.5.3, 3.6.0
>
> Attachments: ContainerManagerTest.java, ZOOKEEPER-2464.patch
>
>
> I would like to expose you to a problem that we are experiencing.
> We are using a cluster of 7 zookeeper and we use them to implement a 
> distributed lock using Curator 
> (http://curator.apache.org/curator-recipes/shared-reentrant-lock.html)
> So .. we tried to play with the servers to see if everything worked properly 
> and we stopped and start servers to see that the system worked well
> (like stop 03, stop 05, stop 06, start 05, start 06, start 03)
> We saw a strange behavior.
> The number of znodes grew up without stopping (normally we had 4000 or 5000, 
> we got to 60,000 and then we stopped our application)
> In zookeeeper logs I saw this (on leader only, one every minute)
> 2016-07-04 14:53:50,302 [myid:7] - ERROR 
> [ContainerManagerTask:ContainerManager$1@84] - Error checking containers
> java.lang.NullPointerException
>at 
> org.apache.zookeeper.server.ContainerManager.getCandidates(ContainerManager.java:151)
>at 
> org.apache.zookeeper.server.ContainerManager.checkContainers(ContainerManager.java:111)
>at 
> org.apache.zookeeper.server.ContainerManager$1.run(ContainerManager.java:78)
>at java.util.TimerThread.mainLoop(Timer.java:555)
>at java.util.TimerThread.run(Timer.java:505)
> We have not yet deleted the data ... so the problem can be reproduced on our 
> servers



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-2464) NullPointerException on ContainerManager

2017-01-26 Thread Edward Ribeiro (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15840464#comment-15840464
 ] 

Edward Ribeiro commented on ZOOKEEPER-2464:
---

Yeah... makes sense, it's pretty inconsistent behavior. It would require some 
defensive code as {{DataNode.setChildren(null)}} could introduce the null 
again. In fact, it is a total  refactoring of {{DataNode}}, albeit a small 
class. Wdyt [~randgalt]? 

> NullPointerException on ContainerManager
> 
>
> Key: ZOOKEEPER-2464
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2464
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.1
>Reporter: Stefano Salmaso
>Assignee: Jordan Zimmerman
> Fix For: 3.5.3, 3.6.0
>
> Attachments: ContainerManagerTest.java, ZOOKEEPER-2464.patch
>
>
> I would like to expose you to a problem that we are experiencing.
> We are using a cluster of 7 zookeeper and we use them to implement a 
> distributed lock using Curator 
> (http://curator.apache.org/curator-recipes/shared-reentrant-lock.html)
> So .. we tried to play with the servers to see if everything worked properly 
> and we stopped and start servers to see that the system worked well
> (like stop 03, stop 05, stop 06, start 05, start 06, start 03)
> We saw a strange behavior.
> The number of znodes grew up without stopping (normally we had 4000 or 5000, 
> we got to 60,000 and then we stopped our application)
> In zookeeeper logs I saw this (on leader only, one every minute)
> 2016-07-04 14:53:50,302 [myid:7] - ERROR 
> [ContainerManagerTask:ContainerManager$1@84] - Error checking containers
> java.lang.NullPointerException
>at 
> org.apache.zookeeper.server.ContainerManager.getCandidates(ContainerManager.java:151)
>at 
> org.apache.zookeeper.server.ContainerManager.checkContainers(ContainerManager.java:111)
>at 
> org.apache.zookeeper.server.ContainerManager$1.run(ContainerManager.java:78)
>at java.util.TimerThread.mainLoop(Timer.java:555)
>at java.util.TimerThread.run(Timer.java:505)
> We have not yet deleted the data ... so the problem can be reproduced on our 
> servers



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-2464) NullPointerException on ContainerManager

2017-01-26 Thread Mohammad Arshad (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15840421#comment-15840421
 ] 

Mohammad Arshad commented on ZOOKEEPER-2464:


Root cause of the problem is the inconsistent behavior of 
DataNode.getChildren() API.
DataNode.getChildren() API Current Behaviour:
# returns null initially
When DataNode is created and no children are added yet, DataNode.getChildren() 
returns null
# returns empty set after all the children are deleted:
created a Node
add a child
delete the child
DataNode.getChildren() returns empty set.

I think we should fix this issue by modifying the DataNode.getChildren() API. 
We should always return empty set if there is no child.

> NullPointerException on ContainerManager
> 
>
> Key: ZOOKEEPER-2464
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2464
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.1
>Reporter: Stefano Salmaso
>Assignee: Jordan Zimmerman
> Fix For: 3.5.3, 3.6.0
>
> Attachments: ContainerManagerTest.java, ZOOKEEPER-2464.patch
>
>
> I would like to expose you to a problem that we are experiencing.
> We are using a cluster of 7 zookeeper and we use them to implement a 
> distributed lock using Curator 
> (http://curator.apache.org/curator-recipes/shared-reentrant-lock.html)
> So .. we tried to play with the servers to see if everything worked properly 
> and we stopped and start servers to see that the system worked well
> (like stop 03, stop 05, stop 06, start 05, start 06, start 03)
> We saw a strange behavior.
> The number of znodes grew up without stopping (normally we had 4000 or 5000, 
> we got to 60,000 and then we stopped our application)
> In zookeeeper logs I saw this (on leader only, one every minute)
> 2016-07-04 14:53:50,302 [myid:7] - ERROR 
> [ContainerManagerTask:ContainerManager$1@84] - Error checking containers
> java.lang.NullPointerException
>at 
> org.apache.zookeeper.server.ContainerManager.getCandidates(ContainerManager.java:151)
>at 
> org.apache.zookeeper.server.ContainerManager.checkContainers(ContainerManager.java:111)
>at 
> org.apache.zookeeper.server.ContainerManager$1.run(ContainerManager.java:78)
>at java.util.TimerThread.mainLoop(Timer.java:555)
>at java.util.TimerThread.run(Timer.java:505)
> We have not yet deleted the data ... so the problem can be reproduced on our 
> servers



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-2464) NullPointerException on ContainerManager

2017-01-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15836774#comment-15836774
 ] 

Hadoop QA commented on ZOOKEEPER-2464:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12849173/ContainerManagerTest.java
  against trunk revision 762f4af65bb1056a582a6f36183a9e28fe0ccab8.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3568//console

This message is automatically generated.

> NullPointerException on ContainerManager
> 
>
> Key: ZOOKEEPER-2464
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2464
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.1
>Reporter: Stefano Salmaso
>Assignee: Jordan Zimmerman
> Fix For: 3.5.3, 3.6.0
>
> Attachments: ContainerManagerTest.java, ZOOKEEPER-2464.patch
>
>
> I would like to expose you to a problem that we are experiencing.
> We are using a cluster of 7 zookeeper and we use them to implement a 
> distributed lock using Curator 
> (http://curator.apache.org/curator-recipes/shared-reentrant-lock.html)
> So .. we tried to play with the servers to see if everything worked properly 
> and we stopped and start servers to see that the system worked well
> (like stop 03, stop 05, stop 06, start 05, start 06, start 03)
> We saw a strange behavior.
> The number of znodes grew up without stopping (normally we had 4000 or 5000, 
> we got to 60,000 and then we stopped our application)
> In zookeeeper logs I saw this (on leader only, one every minute)
> 2016-07-04 14:53:50,302 [myid:7] - ERROR 
> [ContainerManagerTask:ContainerManager$1@84] - Error checking containers
> java.lang.NullPointerException
>at 
> org.apache.zookeeper.server.ContainerManager.getCandidates(ContainerManager.java:151)
>at 
> org.apache.zookeeper.server.ContainerManager.checkContainers(ContainerManager.java:111)
>at 
> org.apache.zookeeper.server.ContainerManager$1.run(ContainerManager.java:78)
>at java.util.TimerThread.mainLoop(Timer.java:555)
>at java.util.TimerThread.run(Timer.java:505)
> We have not yet deleted the data ... so the problem can be reproduced on our 
> servers



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-2464) NullPointerException on ContainerManager

2016-07-05 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15362136#comment-15362136
 ] 

Flavio Junqueira commented on ZOOKEEPER-2464:
-

+1, lgtm. Thanks for the patch [~randgalt].

> NullPointerException on ContainerManager
> 
>
> Key: ZOOKEEPER-2464
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2464
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.1
>Reporter: Stefano Salmaso
>Assignee: Jordan Zimmerman
> Fix For: 3.5.3, 3.6.0
>
> Attachments: ZOOKEEPER-2464.patch
>
>
> I would like to expose you to a problem that we are experiencing.
> We are using a cluster of 7 zookeeper and we use them to implement a 
> distributed lock using Curator 
> (http://curator.apache.org/curator-recipes/shared-reentrant-lock.html)
> So .. we tried to play with the servers to see if everything worked properly 
> and we stopped and start servers to see that the system worked well
> (like stop 03, stop 05, stop 06, start 05, start 06, start 03)
> We saw a strange behavior.
> The number of znodes grew up without stopping (normally we had 4000 or 5000, 
> we got to 60,000 and then we stopped our application)
> In zookeeeper logs I saw this (on leader only, one every minute)
> 2016-07-04 14:53:50,302 [myid:7] - ERROR 
> [ContainerManagerTask:ContainerManager$1@84] - Error checking containers
> java.lang.NullPointerException
>at 
> org.apache.zookeeper.server.ContainerManager.getCandidates(ContainerManager.java:151)
>at 
> org.apache.zookeeper.server.ContainerManager.checkContainers(ContainerManager.java:111)
>at 
> org.apache.zookeeper.server.ContainerManager$1.run(ContainerManager.java:78)
>at java.util.TimerThread.mainLoop(Timer.java:555)
>at java.util.TimerThread.run(Timer.java:505)
> We have not yet deleted the data ... so the problem can be reproduced on our 
> servers



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-2464) NullPointerException on ContainerManager

2016-07-04 Thread Jordan Zimmerman (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15361836#comment-15361836
 ] 

Jordan Zimmerman commented on ZOOKEEPER-2464:
-

1 line change and no test case was provided - so I didn't add a test

> NullPointerException on ContainerManager
> 
>
> Key: ZOOKEEPER-2464
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2464
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.1
>Reporter: Stefano Salmaso
>Assignee: Jordan Zimmerman
> Attachments: ZOOKEEPER-2464.patch
>
>
> I would like to expose you to a problem that we are experiencing.
> We are using a cluster of 7 zookeeper and we use them to implement a 
> distributed lock using Curator 
> (http://curator.apache.org/curator-recipes/shared-reentrant-lock.html)
> So .. we tried to play with the servers to see if everything worked properly 
> and we stopped and start servers to see that the system worked well
> (like stop 03, stop 05, stop 06, start 05, start 06, start 03)
> We saw a strange behavior.
> The number of znodes grew up without stopping (normally we had 4000 or 5000, 
> we got to 60,000 and then we stopped our application)
> In zookeeeper logs I saw this (on leader only, one every minute)
> 2016-07-04 14:53:50,302 [myid:7] - ERROR 
> [ContainerManagerTask:ContainerManager$1@84] - Error checking containers
> java.lang.NullPointerException
>at 
> org.apache.zookeeper.server.ContainerManager.getCandidates(ContainerManager.java:151)
>at 
> org.apache.zookeeper.server.ContainerManager.checkContainers(ContainerManager.java:111)
>at 
> org.apache.zookeeper.server.ContainerManager$1.run(ContainerManager.java:78)
>at java.util.TimerThread.mainLoop(Timer.java:555)
>at java.util.TimerThread.run(Timer.java:505)
> We have not yet deleted the data ... so the problem can be reproduced on our 
> servers



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-2464) NullPointerException on ContainerManager

2016-07-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15361833#comment-15361833
 ] 

Hadoop QA commented on ZOOKEEPER-2464:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12816094/ZOOKEEPER-2464.patch
  against trunk revision 1750739.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 2.0.3) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3261//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3261//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3261//console

This message is automatically generated.

> NullPointerException on ContainerManager
> 
>
> Key: ZOOKEEPER-2464
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2464
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.1
>Reporter: Stefano Salmaso
>Assignee: Jordan Zimmerman
> Attachments: ZOOKEEPER-2464.patch
>
>
> I would like to expose you to a problem that we are experiencing.
> We are using a cluster of 7 zookeeper and we use them to implement a 
> distributed lock using Curator 
> (http://curator.apache.org/curator-recipes/shared-reentrant-lock.html)
> So .. we tried to play with the servers to see if everything worked properly 
> and we stopped and start servers to see that the system worked well
> (like stop 03, stop 05, stop 06, start 05, start 06, start 03)
> We saw a strange behavior.
> The number of znodes grew up without stopping (normally we had 4000 or 5000, 
> we got to 60,000 and then we stopped our application)
> In zookeeeper logs I saw this (on leader only, one every minute)
> 2016-07-04 14:53:50,302 [myid:7] - ERROR 
> [ContainerManagerTask:ContainerManager$1@84] - Error checking containers
> java.lang.NullPointerException
>at 
> org.apache.zookeeper.server.ContainerManager.getCandidates(ContainerManager.java:151)
>at 
> org.apache.zookeeper.server.ContainerManager.checkContainers(ContainerManager.java:111)
>at 
> org.apache.zookeeper.server.ContainerManager$1.run(ContainerManager.java:78)
>at java.util.TimerThread.mainLoop(Timer.java:555)
>at java.util.TimerThread.run(Timer.java:505)
> We have not yet deleted the data ... so the problem can be reproduced on our 
> servers



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)