[jira] Updated: (ZOOKEEPER-543) Tests for ZooKeeper examples

2010-03-05 Thread Steven Cheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steven Cheng updated ZOOKEEPER-543:
---

Attachment: ZOOKEEPER-543.patch

> Tests for ZooKeeper examples
> 
>
> Key: ZOOKEEPER-543
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-543
> Project: Zookeeper
>  Issue Type: New Feature
>  Components: tests
>Affects Versions: 3.3.0
>Reporter: Steven Cheng
>Assignee: Steven Cheng
>Priority: Minor
> Fix For: 3.3.0
>
> Attachments: ZOOKEEPER-543.patch, ZOOKEEPER-543.patch, 
> ZOOKEEPER-543.patch, ZOOKEEPER-543.patch
>
>
> Initial attempt to create ZooKeeper tests based on the example code on the 
> website.  
> Current plan is to test features used in examples using ZooKeeper calls 
> directly.  Another approach would be to make more usable abstractions such as 
> those in src/recipes and test those.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-543) Tests for ZooKeeper examples

2010-03-05 Thread Steven Cheng (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12842212#action_12842212
 ] 

Steven Cheng commented on ZOOKEEPER-543:


Extra import was left in there for some reason.

> Tests for ZooKeeper examples
> 
>
> Key: ZOOKEEPER-543
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-543
> Project: Zookeeper
>  Issue Type: New Feature
>  Components: tests
>Affects Versions: 3.3.0
>Reporter: Steven Cheng
>Assignee: Steven Cheng
>Priority: Minor
> Fix For: 3.3.0
>
> Attachments: ZOOKEEPER-543.patch, ZOOKEEPER-543.patch, 
> ZOOKEEPER-543.patch, ZOOKEEPER-543.patch
>
>
> Initial attempt to create ZooKeeper tests based on the example code on the 
> website.  
> Current plan is to test features used in examples using ZooKeeper calls 
> directly.  Another approach would be to make more usable abstractions such as 
> those in src/recipes and test those.




[jira] Commented: (ZOOKEEPER-677) c client doesn't allow ipv6 numeric connect string

2010-03-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12842194#action_12842194
 ] 

Hadoop QA commented on ZOOKEEPER-677:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12438065/ZOOKEEPER-677.patch
  against trunk revision 919640.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 6 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/134/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/134/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/134/console

This message is automatically generated.

> c client doesn't allow ipv6 numeric connect string
> --
>
> Key: ZOOKEEPER-677
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-677
> Project: Zookeeper
>  Issue Type: Bug
>  Components: c client
>Affects Versions: 3.2.2
>Reporter: Patrick Hunt
>Assignee: Mahadev konar
>Priority: Critical
> Fix For: 3.3.0
>
> Attachments: ZOOKEEPER-677.patch, ZOOKEEPER-677.patch
>
>
> The C client doesn't handle IPv6 numeric addresses because they are colon (:) 
> delimited. After splitting the host/port on ':' we look for the port as the 
> second entry in the array rather than the last entry in the array.
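The fix described above (take the port from the last colon-separated entry, not the second) can be sketched as follows. This is a minimal illustration, not the ZooKeeper client's actual code: `split_host_port` is a hypothetical helper, and it assumes every entry carries an explicit port, since a bare IPv6 literal without a port would be ambiguous under this rule.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not taken from the ZooKeeper sources): split
 * "host:port" at the LAST colon, so the colons inside an IPv6 numeric
 * address stay in the host part.
 * Returns 0 on success, -1 on malformed input.
 */
int split_host_port(const char *spec, char *host, size_t host_len, int *port)
{
    assert(spec != NULL && host != NULL && port != NULL);

    const char *colon = strrchr(spec, ':');   /* last colon, not the first */
    if (colon == NULL || colon == spec || colon[1] == '\0')
        return -1;                            /* missing host or port */

    size_t n = (size_t)(colon - spec);
    if (n >= host_len)
        return -1;                            /* host buffer too small */

    char *end = NULL;
    long p = strtol(colon + 1, &end, 10);
    if (*end != '\0' || p < 1 || p > 65535)
        return -1;                            /* port is not a valid number */

    memcpy(host, spec, n);
    host[n] = '\0';
    *port = (int)p;
    return 0;
}
```

With this rule, "::1:2181" yields host "::1" and port 2181, while an IPv4 entry such as "127.0.0.1:2181" is parsed as before.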




[jira] Updated: (ZOOKEEPER-622) Test for pending watches in send_set_watches should be moved

2010-03-05 Thread Mahadev konar (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahadev konar updated ZOOKEEPER-622:


  Resolution: Fixed
Hadoop Flags: [Reviewed]
  Status: Resolved  (was: Patch Available)

I just committed this. Thanks, Ben and Steven.

> Test for pending watches in send_set_watches should be moved
> 
>
> Key: ZOOKEEPER-622
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-622
> Project: Zookeeper
>  Issue Type: Bug
>  Components: c client
>Reporter: Steven Cheng
>Assignee: Benjamin Reed
> Fix For: 3.3.0
>
> Attachments: ZOOKEEPER-622.patch, ZOOKEEPER-622.patch, 
> ZOOKEEPER-622.patch, ZOOKEEPER-622.patch
>
>
> Valgrind found:
> {quote}
> ==2357== Conditional jump or move depends on uninitialised value(s)
> ==2357==at 0x807FDCA: check_events (zookeeper.c:1180)
> ==2357==by 0x808043A: zookeeper_process (zookeeper.c:1775)
> ==2357==by 0x806A21B: Zookeeper_close::testCloseConnected1() 
> (TestZookeeperClose.cc:161)
> ==2357==by 0x806C6BF: CppUnit::TestCaller::runTest() 
> (TestCaller.h:166)
> {quote}
> zookeeper.c:1180 was the first if in send_set_watches.




[jira] Commented: (ZOOKEEPER-543) Tests for ZooKeeper examples

2010-03-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12842190#action_12842190
 ] 

Hadoop QA commented on ZOOKEEPER-543:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12431894/ZOOKEEPER-543.patch
  against trunk revision 919640.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 8 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/133/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/133/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/133/console

This message is automatically generated.

> Tests for ZooKeeper examples
> 
>
> Key: ZOOKEEPER-543
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-543
> Project: Zookeeper
>  Issue Type: New Feature
>  Components: tests
>Affects Versions: 3.3.0
>Reporter: Steven Cheng
>Assignee: Steven Cheng
>Priority: Minor
> Fix For: 3.3.0
>
> Attachments: ZOOKEEPER-543.patch, ZOOKEEPER-543.patch, 
> ZOOKEEPER-543.patch
>
>
> Initial attempt to create ZooKeeper tests based on the example code on the 
> website.  
> Current plan is to test features used in examples using ZooKeeper calls 
> directly.  Another approach would be to make more usable abstractions such as 
> those in src/recipes and test those.




[jira] Commented: (ZOOKEEPER-622) Test for pending watches in send_set_watches should be moved

2010-03-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12842187#action_12842187
 ] 

Hadoop QA commented on ZOOKEEPER-622:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12437783/ZOOKEEPER-622.patch
  against trunk revision 919640.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no tests are needed for this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/132/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/132/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/132/console

This message is automatically generated.

> Test for pending watches in send_set_watches should be moved
> 
>
> Key: ZOOKEEPER-622
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-622
> Project: Zookeeper
>  Issue Type: Bug
>  Components: c client
>Reporter: Steven Cheng
>Assignee: Benjamin Reed
> Fix For: 3.3.0
>
> Attachments: ZOOKEEPER-622.patch, ZOOKEEPER-622.patch, 
> ZOOKEEPER-622.patch, ZOOKEEPER-622.patch
>
>
> Valgrind found:
> {quote}
> ==2357== Conditional jump or move depends on uninitialised value(s)
> ==2357==at 0x807FDCA: check_events (zookeeper.c:1180)
> ==2357==by 0x808043A: zookeeper_process (zookeeper.c:1775)
> ==2357==by 0x806A21B: Zookeeper_close::testCloseConnected1() 
> (TestZookeeperClose.cc:161)
> ==2357==by 0x806C6BF: CppUnit::TestCaller::runTest() 
> (TestCaller.h:166)
> {quote}
> zookeeper.c:1180 was the first if in send_set_watches.




[jira] Commented: (ZOOKEEPER-59) Synchronized block in NIOServerCnxn

2010-03-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-59?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12842186#action_12842186
 ] 

Hadoop QA commented on ZOOKEEPER-59:


-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12437656/ZOOKEEPER-59.patch
  against trunk revision 919640.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no tests are needed for this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/131/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/131/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/131/console

This message is automatically generated.

> Synchronized block in NIOServerCnxn
> ---
>
> Key: ZOOKEEPER-59
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-59
> Project: Zookeeper
>  Issue Type: Bug
>  Components: server
>Reporter: Flavio Paiva Junqueira
>Assignee: Flavio Paiva Junqueira
> Fix For: 3.3.0
>
> Attachments: ZOOKEEPER-59.patch, ZOOKEEPER-59.patch
>
>
> There are two synchronized blocks locking on different objects, and to me 
> they should be guarded by the same object. Here are the parts of the code I'm 
> talking about:
> {noformat}
> nioservercnxn.readrequ...@444
> ...
>     synchronized (this) {
>         outstandingRequests++;
>         // check throttling
>         if (zk.getInProcess() > factory.outstandingLimit) {
>             disableRecv();
>             // following lines should not be needed since we are already
>             // reading
>             // } else {
>             // enableRecv();
>         }
>     }
> {noformat}
> {noformat}
> nioservercnxn.sendrespo...@740
> ...
>     synchronized (this.factory) {
>         outstandingRequests--;
>         // check throttling
>         if (zk.getInProcess() < factory.outstandingLimit
>                 || outstandingRequests < 1) {
>             sk.selector().wakeup();
>             enableRecv();
>         }
>     }
> {noformat}
> I think the second one is correct, and the first synchronized block should be 
> guarded by "this.factory". 
> This could be related to issue ZOOKEEPER-57, but I have no concrete 
> indication that this is the case so far.




[jira] Commented: (ZOOKEEPER-511) bad error handling in FollowerHandler.sendPackets

2010-03-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12842181#action_12842181
 ] 

Hadoop QA commented on ZOOKEEPER-511:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12438061/ZOOKEEPER-511.patch
  against trunk revision 919640.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no tests are needed for this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/130/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/130/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/130/console

This message is automatically generated.

> bad error handling in FollowerHandler.sendPackets
> -
>
> Key: ZOOKEEPER-511
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-511
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum, server
>Affects Versions: 3.2.0
>Reporter: Patrick Hunt
>Assignee: Mahadev konar
> Fix For: 3.3.0
>
> Attachments: ZOOKEEPER-511.patch
>
>
> In FollowerHandler, if sendPackets gets an IOException on writeRecord, the send 
> thread will exit; however, the socket isn't necessarily closed.
> 2009-08-19 15:28:46,869 - WARN  [Sender-/127.0.0.1:58179:followerhand...@131] 
> - Unexpected exception
>   at 
> org.apache.zookeeper.server.quorum.FollowerHandler.sendPackets(FollowerHandler.java:128)
>   at 
> org.apache.zookeeper.server.quorum.FollowerHandler.access$0(FollowerHandler.java:107)
>   at 
> org.apache.zookeeper.server.quorum.FollowerHandler$1.run(FollowerHandler.java:325)
> This results in the follower taking a very long time to recover and rejoin 
> the quorum.




[jira] Commented: (ZOOKEEPER-663) hudson failure in ZKDatabaseCorruptionTest

2010-03-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12842160#action_12842160
 ] 

Hadoop QA commented on ZOOKEEPER-663:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12438038/ZOOKEEPER-663.patch
  against trunk revision 919640.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no tests are needed for this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/129/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/129/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/129/console

This message is automatically generated.

> hudson failure in ZKDatabaseCorruptionTest
> --
>
> Key: ZOOKEEPER-663
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-663
> Project: Zookeeper
>  Issue Type: Bug
>  Components: server
>Reporter: Patrick Hunt
>Assignee: Mahadev konar
>Priority: Critical
> Fix For: 3.3.0
>
> Attachments: ZOOKEEPER-663.patch
>
>
> http://hudson.zones.apache.org/hudson/job/ZooKeeper-trunk/686/
> java.lang.RuntimeException: Unable to run quorum server 
>   at 
> org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:380)
>   at 
> org.apache.zookeeper.test.ZkDatabaseCorruptionTest.testCorruption(ZkDatabaseCorruptionTest.java:99)
> Caused by: java.io.IOException: Invalid magic number 0 != 1514884167
>   at 
> org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.inStreamCreated(FileTxnLog.java:455)
>   at 
> org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.createInputArchive(FileTxnLog.java:471)
>   at 
> org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.goToNextLog(FileTxnLog.java:438)
>   at 
> org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:519)
>   at 
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:145)
>   at 
> org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:193)
>   at 
> org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:377)
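For readers decoding the exception message: the expected value 1514884167 is 0x5A4B4C47, which is the four ASCII bytes "ZKLG" read as a big-endian 32-bit integer. A value of 0 is what a header consisting of zero bytes would produce, though the trace alone does not show why the file was in that state. The sketch below only illustrates that arithmetic; `read_be32`, `has_txnlog_magic` and `TXNLOG_MAGIC_SKETCH` are hypothetical names, not ZooKeeper APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Expected magic from the exception message: 1514884167 == 0x5A4B4C47. */
#define TXNLOG_MAGIC_SKETCH UINT32_C(1514884167)

/* Read a big-endian 32-bit integer from the first four bytes of buf. */
uint32_t read_be32(const unsigned char *buf)
{
    assert(buf != NULL);
    return ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
           ((uint32_t)buf[2] << 8)  |  (uint32_t)buf[3];
}

/* Return 1 if a header of hdr_len bytes starts with the expected magic. */
int has_txnlog_magic(const unsigned char *hdr, size_t hdr_len)
{
    if (hdr == NULL || hdr_len < 4)
        return 0;                     /* too short to contain the magic */
    return read_be32(hdr) == TXNLOG_MAGIC_SKETCH;
}
```

A header beginning with the bytes 'Z', 'K', 'L', 'G' passes this check; four zero bytes fail it, matching the "0 != 1514884167" comparison in the message.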




[jira] Updated: (ZOOKEEPER-677) c client doesn't allow ipv6 numeric connect string

2010-03-05 Thread Mahadev konar (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahadev konar updated ZOOKEEPER-677:


Attachment: ZOOKEEPER-677.patch

This patch should fix the issue.


> c client doesn't allow ipv6 numeric connect string
> --
>
> Key: ZOOKEEPER-677
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-677
> Project: Zookeeper
>  Issue Type: Bug
>  Components: c client
>Affects Versions: 3.2.2
>Reporter: Patrick Hunt
>Assignee: Mahadev konar
>Priority: Critical
> Fix For: 3.3.0
>
> Attachments: ZOOKEEPER-677.patch, ZOOKEEPER-677.patch
>
>
> The C client doesn't handle IPv6 numeric addresses because they are colon (:) 
> delimited. After splitting the host/port on ':' we look for the port as the 
> second entry in the array rather than the last entry in the array.




[jira] Updated: (ZOOKEEPER-677) c client doesn't allow ipv6 numeric connect string

2010-03-05 Thread Mahadev konar (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahadev konar updated ZOOKEEPER-677:


Status: Patch Available  (was: Open)

> c client doesn't allow ipv6 numeric connect string
> --
>
> Key: ZOOKEEPER-677
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-677
> Project: Zookeeper
>  Issue Type: Bug
>  Components: c client
>Affects Versions: 3.2.2
>Reporter: Patrick Hunt
>Assignee: Mahadev konar
>Priority: Critical
> Fix For: 3.3.0
>
> Attachments: ZOOKEEPER-677.patch, ZOOKEEPER-677.patch
>
>
> The C client doesn't handle IPv6 numeric addresses because they are colon (:) 
> delimited. After splitting the host/port on ':' we look for the port as the 
> second entry in the array rather than the last entry in the array.




[jira] Updated: (ZOOKEEPER-543) Tests for ZooKeeper examples

2010-03-05 Thread Patrick Hunt (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Hunt updated ZOOKEEPER-543:
---

Status: Patch Available  (was: Open)

> Tests for ZooKeeper examples
> 
>
> Key: ZOOKEEPER-543
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-543
> Project: Zookeeper
>  Issue Type: New Feature
>  Components: tests
>Affects Versions: 3.3.0
>Reporter: Steven Cheng
>Assignee: Steven Cheng
>Priority: Minor
> Fix For: 3.3.0
>
> Attachments: ZOOKEEPER-543.patch, ZOOKEEPER-543.patch, 
> ZOOKEEPER-543.patch
>
>
> Initial attempt to create ZooKeeper tests based on the example code on the 
> website.  
> Current plan is to test features used in examples using ZooKeeper calls 
> directly.  Another approach would be to make more usable abstractions such as 
> those in src/recipes and test those.




[jira] Updated: (ZOOKEEPER-543) Tests for ZooKeeper examples

2010-03-05 Thread Patrick Hunt (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Hunt updated ZOOKEEPER-543:
---

Status: Open  (was: Patch Available)

> Tests for ZooKeeper examples
> 
>
> Key: ZOOKEEPER-543
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-543
> Project: Zookeeper
>  Issue Type: New Feature
>  Components: tests
>Affects Versions: 3.3.0
>Reporter: Steven Cheng
>Assignee: Steven Cheng
>Priority: Minor
> Fix For: 3.3.0
>
> Attachments: ZOOKEEPER-543.patch, ZOOKEEPER-543.patch, 
> ZOOKEEPER-543.patch
>
>
> Initial attempt to create ZooKeeper tests based on the example code on the 
> website.  
> Current plan is to test features used in examples using ZooKeeper calls 
> directly.  Another approach would be to make more usable abstractions such as 
> those in src/recipes and test those.




[jira] Updated: (ZOOKEEPER-458) connect_index in zookeeper handle might get out of bound.

2010-03-05 Thread Mahadev konar (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahadev konar updated ZOOKEEPER-458:


Fix Version/s: (was: 3.3.0)
   3.4.0

This is quite critical, but we won't be able to fix it before the 3.3 deadline. 
Moving it to 3.4.

> connect_index in zookeeper handle might get out of bound.
> -
>
> Key: ZOOKEEPER-458
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-458
> Project: Zookeeper
>  Issue Type: Bug
>  Components: c client
>Reporter: Mahadev konar
>Assignee: Steven Cheng
> Fix For: 3.4.0
>
> Attachments: ZOOKEEPER-458.patch, ZOOKEEPER-458.patch, 
> ZOOKEEPER-458.patch, ZOOKEEPER-458.patch, ZOOKEEPER-458.patch, 
> ZOOKEEPER-458.patch, ZOOKEEPER-458.patch
>
>
> connect_index in the zookeeper handle might go out of bounds. The zookeeper_init 
> method checks for index == count and sets it to zero. If the index becomes 
> greater than count, then it will go out of bounds.
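The weakness described above is that an equality test only catches the index at exactly `count`. A minimal sketch of the more defensive comparison follows; `clamp_connect_index` is a hypothetical helper for illustration and is not the patch attached to this issue.

```c
#include <assert.h>

/*
 * Hypothetical helper: return a safe index into an address list of
 * `count` entries, resetting to 0 whenever `index` is out of range.
 */
int clamp_connect_index(int index, int count)
{
    assert(count > 0);
    /* ">=" (rather than "==") also catches an index that overshot count */
    if (index < 0 || index >= count)
        return 0;
    return index;
}
```

With three addresses, an index of 3 and an index of 4 both reset to 0, whereas an `index == count` test would let 4 through.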




[jira] Assigned: (ZOOKEEPER-624) The C Client cause core dump when receive error data from Zookeeper Server

2010-03-05 Thread Mahadev konar (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahadev konar reassigned ZOOKEEPER-624:
---

Assignee: Mahadev konar  (was: Benjamin Reed)

> The C Client cause core dump when receive error data from Zookeeper Server
> --
>
> Key: ZOOKEEPER-624
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-624
> Project: Zookeeper
>  Issue Type: Bug
>  Components: c client
>Affects Versions: 3.2.0
> Environment: Linux 2.6.9 x86_64
>Reporter: Qian Ye
>Assignee: Mahadev konar
> Fix For: 3.3.0
>
>
> I encountered a problem today: the ZooKeeper C client (version 3.2.0) dumps 
> core when it reconnects and performs operations on a ZooKeeper server that 
> has just restarted. The gdb information is:
> (gdb) bt
> #0  0x00302af71900 in memcpy () from /lib64/tls/libc.so.6
> #1  0x0047bfe4 in ia_deserialize_string (ia=Variable "ia" is not 
> available.) at src/recordio.c:270
> #2  0x0047ed20 in deserialize_CreateResponse (in=0x9cd870, 
> tag=0x50a74e "reply", v=0x409ffe70) at generated/zookeeper.jute.c:679
> #3  0x0047a1d0 in zookeeper_process (zh=0x9c8c70, events=Variable 
> "events" is not available.) at src/zookeeper.c:1895
> #4  0x004815e6 in do_io (v=Variable "v" is not available.) at 
> src/mt_adaptor.c:310
> #5  0x00302b80610a in start_thread () from /lib64/tls/libpthread.so.0
> #6  0x00302afc6003 in clone () from /lib64/tls/libc.so.6
> #7  0x in ?? ()
> (gdb) f 1
> #1  0x0047bfe4 in ia_deserialize_string (ia=Variable "ia" is not 
> available.) at src/recordio.c:270
> 270 in src/recordio.c
> (gdb) info locals
> priv = (struct buff_struct *) 0x9cd8d0
> len = -1
> rc = Variable "rc" is not available.
> According to the source code,
> int ia_deserialize_string(struct iarchive *ia, const char *name, char **s)
> {
>     struct buff_struct *priv = ia->priv;
>     int32_t len;
>     int rc = ia_deserialize_int(ia, "len", &len);
>     if (rc < 0)
>         return rc;
>     if ((priv->len - priv->off) < len) {
>         return -E2BIG;
>     }
>     *s = malloc(len+1);
>     if (!*s) {
>         return -ENOMEM;
>     }
>     memcpy(*s, priv->buffer+priv->off, len);
>     (*s)[len] = '\0';
>     priv->off += len;
>     return 0;
> }
> the variable len is set by ia_deserialize_int, but the returned len is not 
> checked, so the client segfaults when it tries to memcpy -1 bytes of data.
> In the source file recordio.c, many functions don't check the returned len. 
> They could all cause a segmentation fault in some situations.
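The missing check described above can be sketched in isolation. The code below is a simplified stand-in, not the recordio.c implementation: `struct buf_view` and `read_string_checked` are hypothetical names, the length is passed in directly instead of being read from the archive, and returning -EINVAL for a negative length is an assumption about how such input should be reported.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for the archive's buffer state (hypothetical). */
struct buf_view {
    const char *buffer;
    int32_t len;   /* total bytes in buffer */
    int32_t off;   /* current read offset */
};

/*
 * Copy a string of `len` bytes out of `b`, rejecting negative lengths and
 * lengths that run past the end of the buffer. On success *s is a
 * malloc'd, NUL-terminated copy owned by the caller.
 */
int read_string_checked(struct buf_view *b, int32_t len, char **s)
{
    assert(b != NULL && s != NULL);

    if (len < 0)
        return -EINVAL;               /* e.g. len == -1 from a bad reply */
    if (b->off < 0 || b->off > b->len || (b->len - b->off) < len)
        return -E2BIG;                /* not enough bytes left */

    *s = malloc((size_t)len + 1);
    if (*s == NULL)
        return -ENOMEM;

    memcpy(*s, b->buffer + b->off, (size_t)len);
    (*s)[len] = '\0';
    b->off += len;
    return 0;
}
```

The negative-length test has to come before the remaining-bytes test: with len == -1 the comparison `(priv->len - priv->off) < len` is false, which is how the original code reaches malloc and memcpy.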




[jira] Assigned: (ZOOKEEPER-645) Bug in WriteLock recipe implementation?

2010-03-05 Thread Mahadev konar (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahadev konar reassigned ZOOKEEPER-645:
---

Assignee: Mahadev konar  (was: Jaakko Laine)

> Bug in WriteLock recipe implementation?
> ---
>
> Key: ZOOKEEPER-645
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-645
> Project: Zookeeper
>  Issue Type: Bug
>  Components: recipes
>Affects Versions: 3.2.2
> Environment: 3.2.2 java 1.6.0_12
>Reporter: Jaakko Laine
>Assignee: Mahadev konar
>Priority: Minor
> Fix For: 3.3.0
>
> Attachments: 645-fix-findPrefixInChildren.patch
>
>
> Not sure, but there seem to be two issues in the example WriteLock:
> (1) ZNodeName is sorted according to session ID first, and then according to 
> znode sequence number. This might cause starvation as lower session IDs 
> always get priority. WriteLock is not thread-safe in the first place, so 
> having session ID involved in compare operation does not seem to make sense.
> (2) if findPrefixInChildren finds previous ID, it should add dir in front of 
> the ID




[jira] Assigned: (ZOOKEEPER-591) The C Client cannot exit properly in some situation

2010-03-05 Thread Mahadev konar (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahadev konar reassigned ZOOKEEPER-591:
---

Assignee: Mahadev konar  (was: Benjamin Reed)

> The C Client cannot exit properly in some situation
> ---
>
> Key: ZOOKEEPER-591
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-591
> Project: Zookeeper
>  Issue Type: Bug
>  Components: c client
>Affects Versions: 3.2.1
> Environment: Linux db-passport-test05.vm 2.6.9_5-4-0-5 #1 SMP Tue Apr 
> 14 15:56:24 CST 2009 x86_64 x86_64 x86_64 GNU/Linux 
>Reporter: Qian Ye
>Assignee: Mahadev konar
>Priority: Critical
> Fix For: 3.3.0
>
>
> The following code produces a situation where the C client cannot exit 
> properly:
> #include "include/zookeeper.h"
> void default_zoo_watcher(zhandle_t *zzh, int type, int state, const char 
> *path, void* context){
> int zrc = 0;
> struct String_vector str_vec = {0, NULL};
> printf("in the default_zoo_watcher\n");
> zrc = zoo_wget_children(zzh, "/mytest", default_zoo_watcher, NULL, 
> &str_vec);
> printf("zoo_wget_children, error: %d\n", zrc);
> return;
> }
> int main()
> {
> int zrc = 0;
> int buff_len = 10; 
> char buff[10] = "hello";
> char path[512];
> struct Stat stat;
> struct String_vector str_vec = {0, NULL};
> zhandle_t *zh = zookeeper_init("10.81.20.62:2181", NULL, 3, 0, 0, 0); 
> zrc = zoo_create(zh, "/mytest", buff, 10, &ZOO_OPEN_ACL_UNSAFE, 0, path, 
> 512);
> printf("zoo_create, error: %d\n", zrc);
> zrc = zoo_wget_children(zh, "/mytest", default_zoo_watcher, NULL, 
> &str_vec);
> printf("zoo_wget_children, error: %d\n", zrc);
> zrc = zoo_create(zh, "/mytest/test1", buff, 10, &ZOO_OPEN_ACL_UNSAFE, 0, 
> path, 512);
> printf("zoo_create, error: %d\n", zrc);
> zrc = zoo_wget_children(zh, "/mytest", default_zoo_watcher, NULL, 
> &str_vec);
> printf("zoo_wget_children, error: %d\n", zrc);
> zrc = zoo_delete(zh, "/mytest/test1", -1);
> printf("zoo_delete, error: %d\n", zrc);
> zookeeper_close(zh);
> return 0;
> }
> Running this code can cause the program to hang at zookeeper_close(zh) (line 
> 38). Attaching gdb to the process, I found that the main thread is waiting 
> for the do_completion thread to finish:
> (gdb) bt
> #0  0x00302b806ffb in pthread_join () from /lib64/tls/libpthread.so.0
> #1  0x0040de3b in adaptor_finish (zh=0x515b60) at src/mt_adaptor.c:219
> #2  0x004060ba in zookeeper_close (zh=0x515b60) at 
> src/zookeeper.c:2100
> #3  0x0040220b in main ()
> and the thread that handles the zoo_wget_children call (in default_zoo_watcher) 
> is waiting for sc->cond:
> (gdb) thread 2
> [Switching to thread 2 (Thread 1094719840 (LWP 25093))]#0  0x00302b8089aa 
> in pthread_cond_wait@@GLIBC_2.3.2 ()
>from /lib64/tls/libpthread.so.0
> (gdb) bt
> #0  0x00302b8089aa in pthread_cond_wait@@GLIBC_2.3.2 () from 
> /lib64/tls/libpthread.so.0
> #1  0x0040d88b in wait_sync_completion (sc=0x5167f0) at 
> src/mt_adaptor.c:82
> #2  0x004082c9 in zoo_wget_children (zh=0x515b60, path=0x40ebc0 
> "/mytest", watcher=0x401fd8 , watcherCtx=Variable 
> "watcherCtx" is not available.)
> at src/zookeeper.c:2884
> #3  0x00402037 in default_zoo_watcher ()
> #4  0x0040d664 in deliverWatchers (zh=0x515b60, type=4, state=3, 
> path=0x515100 "/mytest", list=0x5177d8) at src/zk_hashtable.c:274
> #5  0x00403861 in process_completions (zh=0x515b60) at 
> src/zookeeper.c:1631
> #6  0x0040e1b5 in do_completion (v=Variable "v" is not available.) at 
> src/mt_adaptor.c:333
> #7  0x00302b80610a in start_thread () from /lib64/tls/libpthread.so.0
> #8  0x00302afc6003 in clone () from /lib64/tls/libc.so.6
> #9  0x in ?? ()
> This is a deadlock.
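
The two waits in the backtraces above can be modeled in a few lines. This is only a sketch of the pattern (one thread joining another that is blocked on a condition nobody signals), written in Java rather than C; the class and method names are hypothetical and it does not reproduce the C client's internals. The join is bounded here so the demo terminates, whereas the pthread_join in the report is unbounded.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical model of the reported hang; not ZooKeeper code.
class JoinWaitDemo {
    // Returns true if the worker is still blocked after joinMillis.
    static boolean workerStillBlocked(long joinMillis) {
        // Stands in for the condition the completion thread waits on.
        CountDownLatch neverSignalled = new CountDownLatch(1);
        Thread worker = new Thread(() -> {
            try {
                neverSignalled.await(); // blocks: nobody counts the latch down
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.setDaemon(true);
        worker.start();
        try {
            worker.join(joinMillis); // bounded join so the demo can finish
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        boolean blocked = worker.isAlive();
        worker.interrupt(); // release the worker
        return blocked;
    }

    public static void main(String[] args) {
        System.out.println("worker still blocked: " + workerStillBlocked(200));
    }
}
```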




[jira] Updated: (ZOOKEEPER-685) make C system tests use multiple servers

2010-03-05 Thread Mahadev konar (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahadev konar updated ZOOKEEPER-685:


Fix Version/s: 3.3.0

> make C system tests use multiple servers
> 
>
> Key: ZOOKEEPER-685
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-685
> Project: Zookeeper
>  Issue Type: Improvement
>  Components: c client
>Reporter: Benjamin Reed
> Fix For: 3.3.0
>
> Attachments: ZOOKEEPER-685.patch
>
>
> The C system tests run against a single server. We really need to use a 
> multiple-server configuration.




[jira] Updated: (ZOOKEEPER-622) Test for pending watches in send_set_watches should be moved

2010-03-05 Thread Patrick Hunt (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Hunt updated ZOOKEEPER-622:
---

Status: Patch Available  (was: Open)

> Test for pending watches in send_set_watches should be moved
> 
>
> Key: ZOOKEEPER-622
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-622
> Project: Zookeeper
>  Issue Type: Bug
>  Components: c client
>Reporter: Steven Cheng
>Assignee: Benjamin Reed
> Fix For: 3.3.0
>
> Attachments: ZOOKEEPER-622.patch, ZOOKEEPER-622.patch, 
> ZOOKEEPER-622.patch, ZOOKEEPER-622.patch
>
>
> Valgrind found:
> {quote}
> ==2357== Conditional jump or move depends on uninitialised value(s)
> ==2357==at 0x807FDCA: check_events (zookeeper.c:1180)
> ==2357==by 0x808043A: zookeeper_process (zookeeper.c:1775)
> ==2357==by 0x806A21B: Zookeeper_close::testCloseConnected1() 
> (TestZookeeperClose.cc:161)
> ==2357==by 0x806C6BF: CppUnit::TestCaller::runTest() 
> (TestCaller.h:166)
> {quote}
> zookeeper.c:1180 was the first if in send_set_watches.




[jira] Updated: (ZOOKEEPER-622) Test for pending watches in send_set_watches should be moved

2010-03-05 Thread Patrick Hunt (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Hunt updated ZOOKEEPER-622:
---

Status: Open  (was: Patch Available)

> Test for pending watches in send_set_watches should be moved
> 
>
> Key: ZOOKEEPER-622
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-622
> Project: Zookeeper
>  Issue Type: Bug
>  Components: c client
>Reporter: Steven Cheng
>Assignee: Benjamin Reed
> Fix For: 3.3.0
>
> Attachments: ZOOKEEPER-622.patch, ZOOKEEPER-622.patch, 
> ZOOKEEPER-622.patch, ZOOKEEPER-622.patch
>
>
> Valgrind found:
> {quote}
> ==2357== Conditional jump or move depends on uninitialised value(s)
> ==2357==at 0x807FDCA: check_events (zookeeper.c:1180)
> ==2357==by 0x808043A: zookeeper_process (zookeeper.c:1775)
> ==2357==by 0x806A21B: Zookeeper_close::testCloseConnected1() 
> (TestZookeeperClose.cc:161)
> ==2357==by 0x806C6BF: CppUnit::TestCaller::runTest() 
> (TestCaller.h:166)
> {quote}
> zookeeper.c:1180 was the first if in send_set_watches.




[jira] Updated: (ZOOKEEPER-59) Synchronized block in NIOServerCnxn

2010-03-05 Thread Patrick Hunt (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-59?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Hunt updated ZOOKEEPER-59:
--

Status: Open  (was: Patch Available)

> Synchronized block in NIOServerCnxn
> ---
>
> Key: ZOOKEEPER-59
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-59
> Project: Zookeeper
>  Issue Type: Bug
>  Components: server
>Reporter: Flavio Paiva Junqueira
>Assignee: Flavio Paiva Junqueira
> Fix For: 3.3.0
>
> Attachments: ZOOKEEPER-59.patch, ZOOKEEPER-59.patch
>
>
> There are two synchronized blocks locking on different objects, and to me 
> they should be guarded by the same object. Here are the parts of the code I'm 
> talking about:
> {noformat}
> nioservercnxn.readrequ...@444
> ...
> synchronized (this) {
>     outstandingRequests++;
>     // check throttling
>     if (zk.getInProcess() > factory.outstandingLimit) {
>         disableRecv();
>         // following lines should not be needed since we are already
>         // reading
>         // } else {
>         // enableRecv();
>     }
> }
> {noformat}
> {noformat}
> nioservercnxn.sendrespo...@740
> ...
> synchronized (this.factory) {
>     outstandingRequests--;
>     // check throttling
>     if (zk.getInProcess() < factory.outstandingLimit
>             || outstandingRequests < 1) {
>         sk.selector().wakeup();
>         enableRecv();
>     }
> }
> {noformat}
> I think the second one is correct, and the first synchronized block should be 
> guarded by "this.factory". 
> This could be related to issue ZOOKEEPER-57, but I have no concrete 
> indication that this is the case so far.
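
A minimal sketch of the point made in the description, assuming a hypothetical counter class rather than the real NIOServerCnxn: both the increment path and the decrement path take the same lock object, so concurrent updates cannot be lost. All names here are illustrative.

```java
// Hypothetical stand-in for the outstandingRequests bookkeeping; this is
// not the NIOServerCnxn code.
class SharedLockCounter {
    private final Object lock = new Object(); // single guard for both paths
    private int outstandingRequests = 0;

    // Request path: increment under the shared lock.
    void requestReceived() {
        synchronized (lock) {
            outstandingRequests++;
        }
    }

    // Response path: decrement under the same lock.
    void responseSent() {
        synchronized (lock) {
            outstandingRequests--;
        }
    }

    int outstanding() {
        synchronized (lock) {
            return outstandingRequests;
        }
    }

    // Runs both paths concurrently and returns the final count.
    static int runDemo(int perThread) {
        SharedLockCounter c = new SharedLockCounter();
        Thread inc = new Thread(() -> {
            for (int i = 0; i < perThread; i++) c.requestReceived();
        });
        Thread dec = new Thread(() -> {
            for (int i = 0; i < perThread; i++) c.responseSent();
        });
        inc.start();
        dec.start();
        try {
            inc.join();
            dec.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted", e);
        }
        return c.outstanding();
    }

    public static void main(String[] args) {
        // Balanced increments and decrements under one lock leave zero.
        System.out.println(runDemo(100_000));
    }
}
```

If the two methods synchronized on different objects, the increments and decrements would no longer exclude each other and the final count could drift.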




[jira] Updated: (ZOOKEEPER-59) Synchronized block in NIOServerCnxn

2010-03-05 Thread Patrick Hunt (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-59?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Hunt updated ZOOKEEPER-59:
--

Status: Patch Available  (was: Open)

> Synchronized block in NIOServerCnxn
> ---
>
> Key: ZOOKEEPER-59
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-59
> Project: Zookeeper
>  Issue Type: Bug
>  Components: server
>Reporter: Flavio Paiva Junqueira
>Assignee: Flavio Paiva Junqueira
> Fix For: 3.3.0
>
> Attachments: ZOOKEEPER-59.patch, ZOOKEEPER-59.patch
>
>
> There are two synchronized blocks locking on different objects, and to me 
> they should be guarded by the same object. Here are the parts of the code I'm 
> talking about:
> {noformat}
> nioservercnxn.readrequ...@444
> ...
> synchronized (this) {
>     outstandingRequests++;
>     // check throttling
>     if (zk.getInProcess() > factory.outstandingLimit) {
>         disableRecv();
>         // following lines should not be needed since we are already
>         // reading
>         // } else {
>         // enableRecv();
>     }
> }
> {noformat}
> {noformat}
> nioservercnxn.sendrespo...@740
> ...
> synchronized (this.factory) {
>     outstandingRequests--;
>     // check throttling
>     if (zk.getInProcess() < factory.outstandingLimit
>             || outstandingRequests < 1) {
>         sk.selector().wakeup();
>         enableRecv();
>     }
> }
> {noformat}
> I think the second one is correct, and the first synchronized block should be 
> guarded by "this.factory". 
> This could be related to issue ZOOKEEPER-57, but I have no concrete 
> indication that this is the case so far.




[jira] Updated: (ZOOKEEPER-511) bad error handling in FollowerHandler.sendPackets

2010-03-05 Thread Mahadev konar (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahadev konar updated ZOOKEEPER-511:


Status: Patch Available  (was: Open)

> bad error handling in FollowerHandler.sendPackets
> -
>
> Key: ZOOKEEPER-511
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-511
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum, server
>Affects Versions: 3.2.0
>Reporter: Patrick Hunt
>Assignee: Mahadev konar
> Fix For: 3.3.0
>
> Attachments: ZOOKEEPER-511.patch
>
>
> In FollowerHandler, if sendPackets gets an IOException on writeRecord, the 
> send thread will exit; however, the socket isn't necessarily closed.
> 2009-08-19 15:28:46,869 - WARN  [Sender-/127.0.0.1:58179:followerhand...@131] 
> - Unexpected exception
>   at 
> org.apache.zookeeper.server.quorum.FollowerHandler.sendPackets(FollowerHandler.java:128)
>   at 
> org.apache.zookeeper.server.quorum.FollowerHandler.access$0(FollowerHandler.java:107)
>   at 
> org.apache.zookeeper.server.quorum.FollowerHandler$1.run(FollowerHandler.java:325)
> This results in the follower taking a very long time to recover and rejoin 
> the quorum.




[jira] Updated: (ZOOKEEPER-543) Tests for ZooKeeper examples

2010-03-05 Thread Patrick Hunt (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Hunt updated ZOOKEEPER-543:
---

Status: Patch Available  (was: Open)

> Tests for ZooKeeper examples
> 
>
> Key: ZOOKEEPER-543
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-543
> Project: Zookeeper
>  Issue Type: New Feature
>  Components: tests
>Affects Versions: 3.3.0
>Reporter: Steven Cheng
>Assignee: Steven Cheng
>Priority: Minor
> Fix For: 3.3.0
>
> Attachments: ZOOKEEPER-543.patch, ZOOKEEPER-543.patch, 
> ZOOKEEPER-543.patch
>
>
> Initial attempt to create ZooKeeper tests based on the example code on the 
> website.  
> Current plan is to test features used in examples using ZooKeeper calls 
> directly.  Another approach would be to make more usable abstractions such as 
> those in src/recipes and test those.




[jira] Assigned: (ZOOKEEPER-511) bad error handling in FollowerHandler.sendPackets

2010-03-05 Thread Mahadev konar (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahadev konar reassigned ZOOKEEPER-511:
---

Assignee: Mahadev konar

> bad error handling in FollowerHandler.sendPackets
> -
>
> Key: ZOOKEEPER-511
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-511
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum, server
>Affects Versions: 3.2.0
>Reporter: Patrick Hunt
>Assignee: Mahadev konar
> Fix For: 3.3.0
>
> Attachments: ZOOKEEPER-511.patch
>
>
> In FollowerHandler, if sendPackets gets an IOException on writeRecord, the 
> send thread will exit; however, the socket isn't necessarily closed.
> 2009-08-19 15:28:46,869 - WARN  [Sender-/127.0.0.1:58179:followerhand...@131] 
> - Unexpected exception
>   at 
> org.apache.zookeeper.server.quorum.FollowerHandler.sendPackets(FollowerHandler.java:128)
>   at 
> org.apache.zookeeper.server.quorum.FollowerHandler.access$0(FollowerHandler.java:107)
>   at 
> org.apache.zookeeper.server.quorum.FollowerHandler$1.run(FollowerHandler.java:325)
> This results in the follower taking a very long time to recover and rejoin 
> the quorum.




[jira] Updated: (ZOOKEEPER-511) bad error handling in FollowerHandler.sendPackets

2010-03-05 Thread Mahadev konar (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahadev konar updated ZOOKEEPER-511:


Fix Version/s: (was: 3.4.0)
   3.3.0

moving it to 3.3.

> bad error handling in FollowerHandler.sendPackets
> -
>
> Key: ZOOKEEPER-511
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-511
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum, server
>Affects Versions: 3.2.0
>Reporter: Patrick Hunt
>Assignee: Mahadev konar
> Fix For: 3.3.0
>
> Attachments: ZOOKEEPER-511.patch
>
>
> In FollowerHandler, if sendPackets gets an IOException on writeRecord, the 
> send thread will exit; however, the socket isn't necessarily closed.
> 2009-08-19 15:28:46,869 - WARN  [Sender-/127.0.0.1:58179:followerhand...@131] 
> - Unexpected exception
>   at 
> org.apache.zookeeper.server.quorum.FollowerHandler.sendPackets(FollowerHandler.java:128)
>   at 
> org.apache.zookeeper.server.quorum.FollowerHandler.access$0(FollowerHandler.java:107)
>   at 
> org.apache.zookeeper.server.quorum.FollowerHandler$1.run(FollowerHandler.java:325)
> This results in the follower taking a very long time to recover and rejoin 
> the quorum.




[jira] Updated: (ZOOKEEPER-543) Tests for ZooKeeper examples

2010-03-05 Thread Patrick Hunt (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Hunt updated ZOOKEEPER-543:
---

Status: Open  (was: Patch Available)

> Tests for ZooKeeper examples
> 
>
> Key: ZOOKEEPER-543
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-543
> Project: Zookeeper
>  Issue Type: New Feature
>  Components: tests
>Affects Versions: 3.3.0
>Reporter: Steven Cheng
>Assignee: Steven Cheng
>Priority: Minor
> Fix For: 3.3.0
>
> Attachments: ZOOKEEPER-543.patch, ZOOKEEPER-543.patch, 
> ZOOKEEPER-543.patch
>
>
> Initial attempt to create ZooKeeper tests based on the example code on the 
> website.  
> Current plan is to test features used in examples using ZooKeeper calls 
> directly.  Another approach would be to make more usable abstractions such as 
> those in src/recipes and test those.




[jira] Commented: (ZOOKEEPER-458) connect_index in zookeeper handle might get out of bound.

2010-03-05 Thread Patrick Hunt (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12842094#action_12842094
 ] 

Patrick Hunt commented on ZOOKEEPER-458:


What do we want to do here guys, push to 3.4.0? I'd really like to see this one 
get closed out, esp after all the work you've done.

> connect_index in zookeeper handle might get out of bound.
> -
>
> Key: ZOOKEEPER-458
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-458
> Project: Zookeeper
>  Issue Type: Bug
>  Components: c client
>Reporter: Mahadev konar
>Assignee: Steven Cheng
> Fix For: 3.3.0
>
> Attachments: ZOOKEEPER-458.patch, ZOOKEEPER-458.patch, 
> ZOOKEEPER-458.patch, ZOOKEEPER-458.patch, ZOOKEEPER-458.patch, 
> ZOOKEEPER-458.patch, ZOOKEEPER-458.patch
>
>
> connect_index in the zookeeper handle might go out of bounds. The 
> zookeeper_init method checks for index == count and sets it to zero. If the 
> index becomes greater than count, then it will go out of bounds.
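
To illustrate the description: an equality-only check resets the index only when it lands exactly on count. The sketch below is in Java rather than C, uses hypothetical helper names, and the defensive variant is an assumption for illustration, not the attached patch.

```java
// Hypothetical helpers illustrating the wrap-around check; not the C client code.
class ConnectIndexDemo {
    // Equality-only wrap, as described in the issue: an index that is
    // already past count is returned unchanged.
    static int wrapOnEquality(int index, int count) {
        return index == count ? 0 : index;
    }

    // Defensive variant (assumed): any index at or beyond count wraps to zero.
    static int wrapDefensive(int index, int count) {
        return index >= count ? 0 : index;
    }

    public static void main(String[] args) {
        int count = 3;
        System.out.println(wrapOnEquality(3, count)); // wraps
        System.out.println(wrapOnEquality(4, count)); // stays out of range
        System.out.println(wrapDefensive(4, count));  // wraps
    }
}
```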




[jira] Updated: (ZOOKEEPER-511) bad error handling in FollowerHandler.sendPackets

2010-03-05 Thread Mahadev konar (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahadev konar updated ZOOKEEPER-511:


Attachment: ZOOKEEPER-511.patch

This patch fixes the issue by closing the socket when the send thread hits a 
write error, which causes immediate notification of a TCP socket error on the 
learner/observer side.
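
The idea can be sketched roughly as follows. This is not the attached patch: the class, the method names and the use of a plain OutputStream and Closeable are assumptions made to keep the example self-contained.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical sketch of "close the connection on a write error".
class SendLoopSketch {
    // Writes each packet in order. On the first IOException the connection
    // is closed so the peer notices promptly, and sending stops.
    // Returns the number of packets written successfully.
    static int sendPackets(OutputStream out, Closeable connection, byte[][] packets) {
        int sent = 0;
        try {
            for (byte[] p : packets) {
                out.write(p);
                sent++;
            }
        } catch (IOException e) {
            try {
                connection.close();
            } catch (IOException ignored) {
                // nothing more to do: the connection is already unusable
            }
        }
        return sent;
    }

    // Simulates a failing stream and reports whether the connection was closed.
    static boolean closesOnWriteError() {
        boolean[] closed = { false };
        OutputStream failing = new OutputStream() {
            @Override
            public void write(int b) throws IOException {
                throw new IOException("simulated write failure");
            }
        };
        int sent = sendPackets(failing, () -> closed[0] = true, new byte[][] { { 1 }, { 2 } });
        return sent == 0 && closed[0];
    }

    public static void main(String[] args) {
        System.out.println("closed on write error: " + closesOnWriteError());
    }
}
```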

> bad error handling in FollowerHandler.sendPackets
> -
>
> Key: ZOOKEEPER-511
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-511
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum, server
>Affects Versions: 3.2.0
>Reporter: Patrick Hunt
> Fix For: 3.3.0
>
> Attachments: ZOOKEEPER-511.patch
>
>
> In FollowerHandler, if sendPackets gets an IOException on writeRecord, the 
> send thread will exit; however, the socket isn't necessarily closed.
> 2009-08-19 15:28:46,869 - WARN  [Sender-/127.0.0.1:58179:followerhand...@131] 
> - Unexpected exception
>   at 
> org.apache.zookeeper.server.quorum.FollowerHandler.sendPackets(FollowerHandler.java:128)
>   at 
> org.apache.zookeeper.server.quorum.FollowerHandler.access$0(FollowerHandler.java:107)
>   at 
> org.apache.zookeeper.server.quorum.FollowerHandler$1.run(FollowerHandler.java:325)
> This results in the follower taking a very long time to recover and rejoin 
> the quorum.




[jira] Updated: (ZOOKEEPER-290) Bad Address on Bookie

2010-03-05 Thread Patrick Hunt (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Hunt updated ZOOKEEPER-290:
---

Fix Version/s: (was: 3.3.0)
   3.4.0

> Bad Address on Bookie
> -
>
> Key: ZOOKEEPER-290
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-290
> Project: Zookeeper
>  Issue Type: Bug
>  Components: contrib-bookkeeper
>Reporter: Flavio Paiva Junqueira
> Fix For: 3.4.0
>
>
> I'm getting this exception sometimes when running a bookie under a high 
> volume of requests:
> {noformat}
> java.io.IOException: Bad address
> at sun.nio.ch.FileDispatcher.writev0(Native Method)
> at sun.nio.ch.FileDispatcher.writev(FileDispatcher.java:51)
> at sun.nio.ch.IOUtil.write(IOUtil.java:164)
> at sun.nio.ch.FileChannelImpl.write0(FileChannelImpl.java:232)
> at sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:249)
> at java.nio.channels.FileChannel.write(FileChannel.java:222)
> at org.apache.bookkeeper.bookie.Bookie.run(Bookie.java:246)
> {noformat}




[jira] Updated: (ZOOKEEPER-445) Potential bug in leader code

2010-03-05 Thread Patrick Hunt (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Hunt updated ZOOKEEPER-445:
---

Fix Version/s: (was: 3.3.0)
   3.4.0

> Potential bug in leader code
> 
>
> Key: ZOOKEEPER-445
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-445
> Project: Zookeeper
>  Issue Type: Bug
>  Components: server
> Environment: Linux fortiz-desktop 2.6.27-7-generic #1 SMP Fri Oct 24 
> 06:42:44 UTC 2008 i686 GNU/Linux
> java version "1.6.0_10"
> Java(TM) SE Runtime Environment (build 1.6.0_10-b33)
> Java HotSpot(TM) Client VM (build 11.0-b15, mixed mode, sharing)
>Reporter: Manos Kapritsos
>Priority: Minor
> Fix For: 3.4.0
>
>   Original Estimate: 0.33h
>  Remaining Estimate: 0.33h
>
> There is a suspicious line in server/quorum/Leader.java:226. It reads
> if (stop) {
>     LOG.info("exception while shutting down acceptor: " + e);
>     stop = true;
> }
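
One reading of why the line looks suspicious, sketched with a hypothetical method: inside the `if (stop)` branch the flag is already true, so the assignment cannot change anything. What the original code intended is not stated in the issue, and this example does not guess at it.

```java
// Hypothetical reduction of the quoted snippet; not the Leader.java code.
class StopFlagDemo {
    // Same shape as the quoted code: the assignment sits inside the branch
    // that already requires stop == true.
    static boolean afterHandler(boolean stop) {
        if (stop) {
            stop = true; // no effect: stop is already true here
        }
        return stop;
    }

    public static void main(String[] args) {
        System.out.println(afterHandler(false)); // flag is left unchanged
        System.out.println(afterHandler(true));  // flag is left unchanged
    }
}
```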




[jira] Updated: (ZOOKEEPER-511) bad error handling in FollowerHandler.sendPackets

2010-03-05 Thread Patrick Hunt (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Hunt updated ZOOKEEPER-511:
---

Fix Version/s: (was: 3.3.0)
   3.4.0

> bad error handling in FollowerHandler.sendPackets
> -
>
> Key: ZOOKEEPER-511
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-511
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum, server
>Affects Versions: 3.2.0
>Reporter: Patrick Hunt
> Fix For: 3.4.0
>
>
> In FollowerHandler, if sendPackets gets an IOException on writeRecord, the 
> send thread will exit; however, the socket isn't necessarily closed.
> 2009-08-19 15:28:46,869 - WARN  [Sender-/127.0.0.1:58179:followerhand...@131] 
> - Unexpected exception
>   at 
> org.apache.zookeeper.server.quorum.FollowerHandler.sendPackets(FollowerHandler.java:128)
>   at 
> org.apache.zookeeper.server.quorum.FollowerHandler.access$0(FollowerHandler.java:107)
>   at 
> org.apache.zookeeper.server.quorum.FollowerHandler$1.run(FollowerHandler.java:325)
> This results in the follower taking a very long time to recover and rejoin 
> the quorum.




[jira] Updated: (ZOOKEEPER-312) AUTH_FAILED state is unused

2010-03-05 Thread Patrick Hunt (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Hunt updated ZOOKEEPER-312:
---

Fix Version/s: (was: 3.3.0)
   3.4.0

> AUTH_FAILED state is unused
> ---
>
> Key: ZOOKEEPER-312
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-312
> Project: Zookeeper
>  Issue Type: Improvement
>Reporter: Tom White
> Fix For: 3.4.0
>
>
> Either the AUTH_FAILED state should be removed, or an AuthFailedException 
> should cause the ZooKeeper client to transition to the AUTH_FAILED state.




[jira] Updated: (ZOOKEEPER-641) Improve details about group membership recipe

2010-03-05 Thread Patrick Hunt (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Hunt updated ZOOKEEPER-641:
---

Fix Version/s: (was: 3.3.0)
   3.4.0

> Improve details about group membership recipe
> -
>
> Key: ZOOKEEPER-641
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-641
> Project: Zookeeper
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 3.2.1
>Reporter: Adam Rosien
> Fix For: 3.4.0
>
>
> Regarding 
> http://eng.kaching.com/2010/01/actually-implementing-group-management.html 
> Patrick Hunt asked for a more complete group membership recipe from the one 
> listed at 
> http://hadoop.apache.org/zookeeper/docs/r3.0.0/recipes.html#sc_outOfTheBox.
> The relevant text from the blog post:
> One type of group management system using ZooKeeper:
> * A group contains some logical service. The *meaning* of belonging to a 
> group is typically "the instance is available for use by clients over the 
> network".
> * Services can join and leave the group. The special case of a service 
> crashing or a network outage needs to be handled as leaving the group.
> * Joined services share metadata about how to communicate with them, i.e., 
> their IP address, base URL, etc.
> * Clients can ask what instances are in the group, i.e., available.
> * Clients are notified when group membership changes so they can mutate 
> their local state.
> These map onto ZooKeeper as:
> * A group is a (permanent) node in the ZooKeeper hierarchy. Clients and 
> services must be told the path to this node.
> * A service joins the group by creating an ephemeral node whose parent 
> is the group node. By using an ephemeral node, if the service dies then the 
> service is automatically removed from the group.
> * The ephemeral node's data contains the service metadata in some format 
> like JSON, XML, Avro, Protobufs, Thrift, etc. ZooKeeper has no equivalent of 
> HTTP's "Content-Type" header to identify the metadata representation, so 
> services and clients must agree upon the format in some manner.
> * Clients can query for the children of the group node to identify the 
> members of the group.
> * Clients can place a watch on the group node to be notified if nodes 
> have joined or left the group.




[jira] Updated: (ZOOKEEPER-653) hudson failure in LETest

2010-03-05 Thread Patrick Hunt (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Hunt updated ZOOKEEPER-653:
---

Fix Version/s: (was: 3.3.0)
   3.4.0

Pushing to 3.4.0 per Flavio's comment.

> hudson failure in LETest
> 
>
> Key: ZOOKEEPER-653
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-653
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum, server
>Reporter: Patrick Hunt
> Fix For: 3.4.0
>
>
> http://hudson.zones.apache.org/hudson/job/ZooKeeper-trunk/675/testReport/org.apache.zookeeper.test/LETest/testLE/
> junit.framework.AssertionFailedError: Threads didn't join
>   at org.apache.zookeeper.test.LETest.testLE(LETest.java:116)




[jira] Assigned: (ZOOKEEPER-680) Including quorum config when standalone leads to crash

2010-03-05 Thread Patrick Hunt (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Hunt reassigned ZOOKEEPER-680:
--

Assignee: Patrick Hunt

> Including quorum config when standalone leads to crash
> --
>
> Key: ZOOKEEPER-680
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-680
> Project: Zookeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.2.2
> Environment: RHEL
>Reporter: Vegard B. Havdal
>Assignee: Patrick Hunt
>Priority: Minor
> Fix For: 3.3.0
>
>
> If server.#=... and/or a myid file is included when running standalone, the 
> zk server will crash with
> java.lang.NullPointerException
>at 
> org.apache.zookeeper.server.quorum.FastLeaderElection.totalOrderPredicate(FastLeaderElection.java:466)
>at 
> org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:635)
>at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:515)
> Seen when running zk embedded in other server, using
> String[] args = new String[]{zookeeperCfgFile};
> org.apache.zookeeper.server.quorum.QuorumPeerMain.main(args);
> The workaround is of course to fix the config, but 3.1.1 managed to not crash 
> on this.




[jira] Updated: (ZOOKEEPER-684) Race in LENonTerminateTest

2010-03-05 Thread Patrick Hunt (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Hunt updated ZOOKEEPER-684:
---

Fix Version/s: (was: 3.3.0)
   3.4.0

Pushing to 3.4.0 per Flavio's comment.

> Race in LENonTerminateTest
> --
>
> Key: ZOOKEEPER-684
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-684
> Project: Zookeeper
>  Issue Type: Bug
>Reporter: Flavio Paiva Junqueira
> Fix For: 3.4.0
>
> Attachments: zookeeper-684-test-failure.rtf
>
>
> testNonTermination failed during a Hudson run for ZOOKEEPER-59. After 
> inspecting the output, it looks like server is electing 2 as a leader and 
> leaving. Given that 2 is just a mock server, server 0 remains alone in leader 
> election.




[jira] Updated: (ZOOKEEPER-418) Need nifty zookeeper browser

2010-03-05 Thread Patrick Hunt (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Hunt updated ZOOKEEPER-418:
---

Fix Version/s: (was: 3.3.0)
   3.4.0

Pushing to 3.4.0, also see ZOOKEEPER-678 which is making good progress.

> Need nifty zookeeper browser
> 
>
> Key: ZOOKEEPER-418
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-418
> Project: Zookeeper
>  Issue Type: Bug
>Reporter: Ted Dunning
>Assignee: Ted Dunning
> Fix For: 3.4.0
>
> Attachments: pom.xml, screenshot-1.jpg, zk-view-0.1.tgz, ZooKeeper 
> Eclipse Plug-in.pdf
>
>
> It would be very nice to have a browser that would allow the state of a Zoo 
> to be examined.  Even nicer would be a utility that showed changes in 
> real time.




[jira] Assigned: (ZOOKEEPER-677) c client doesn't allow ipv6 numeric connect string

2010-03-05 Thread Mahadev konar (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahadev konar reassigned ZOOKEEPER-677:
---

Assignee: Mahadev konar  (was: Benjamin Reed)

> c client doesn't allow ipv6 numeric connect string
> --
>
> Key: ZOOKEEPER-677
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-677
> Project: Zookeeper
>  Issue Type: Bug
>  Components: c client
>Affects Versions: 3.2.2
>Reporter: Patrick Hunt
>Assignee: Mahadev konar
>Priority: Critical
> Fix For: 3.3.0
>
> Attachments: ZOOKEEPER-677.patch
>
>
> The c client doesn't handle ipv6 numeric addresses as they are colon (:) 
> delimited. After splitting the host/port on : we look for the port as the 
> second entry in the array rather than the last entry in the array.
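
The "last entry" idea can be sketched by splitting at the last colon instead of the first. The example is in Java for illustration only; the class and method are hypothetical and this is not the attached C patch.

```java
// Hypothetical helper illustrating a last-colon split; not the C client code.
class HostPortSplit {
    // Splits "host:port" at the LAST colon so that colons inside a numeric
    // IPv6 host stay part of the host. Returns {host, port}.
    static String[] splitHostPort(String address) {
        int idx = address.lastIndexOf(':');
        if (idx < 0) {
            throw new IllegalArgumentException("no port in: " + address);
        }
        return new String[] { address.substring(0, idx), address.substring(idx + 1) };
    }

    public static void main(String[] args) {
        String[] v6 = splitHostPort("2001:db8::1:2181");
        System.out.println(v6[0] + " port " + v6[1]);
        String[] v4 = splitHostPort("10.81.20.62:2181");
        System.out.println(v4[0] + " port " + v4[1]);
    }
}
```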




[jira] Commented: (ZOOKEEPER-684) Race in LENonTerminateTest

2010-03-05 Thread Flavio Paiva Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12842068#action_12842068
 ] 

Flavio Paiva Junqueira commented on ZOOKEEPER-684:
--

The race we observe in the log attached should only happen if the socket times 
out after 200ms in LeaderElection.java. In regular runs, I wouldn't expect it 
to time out, but on a very slow machine it could happen. If we start observing 
it often, we might consider increasing the socket time out value to avoid false 
positives. 

For now, my recommendation is that we don't fix it.

> Race in LENonTerminateTest
> --
>
> Key: ZOOKEEPER-684
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-684
> Project: Zookeeper
>  Issue Type: Bug
>Reporter: Flavio Paiva Junqueira
> Fix For: 3.3.0
>
> Attachments: zookeeper-684-test-failure.rtf
>
>
> testNonTermination failed during a Hudson run for ZOOKEEPER-59. After 
> inspecting the output, it looks like server is electing 2 as a leader and 
> leaving. Given that 2 is just a mock server, server 0 remains alone in leader 
> election.




[jira] Updated: (ZOOKEEPER-74) Cleaning/restructuring up Zookeeper server code

2010-03-05 Thread Mahadev konar (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-74?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahadev konar updated ZOOKEEPER-74:
---

Fix Version/s: (was: 3.3.0)
   3.4.0

moving this to 3.4. It's definitely worth doing along with ZOOKEEPER-22 in 
3.4.

> Cleaning/restructuring up Zookeeper server code
> ---
>
> Key: ZOOKEEPER-74
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-74
> Project: Zookeeper
>  Issue Type: Improvement
>  Components: server
>Reporter: Mahadev konar
>Assignee: Mahadev konar
> Fix For: 3.4.0
>
>
> I have been thinking about this for a while and find that the zookeeper 
> server code needs some cleaning up. The server code is sometimes a little 
> tricky/confusing to read, given that there is no clarity on ownership of 
> objects. I will put down a proposal for restructuring/cleaning the code up 
> with javadocs so that the code is easier to understand and develop on. 
> Comments on what you find confusing are welcome on this jira. 




[jira] Updated: (ZOOKEEPER-523) zookeeper c client should shutdown if it sees its zxid is too high from all the server its connecting to.

2010-03-05 Thread Mahadev konar (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahadev konar updated ZOOKEEPER-523:


Fix Version/s: 3.4.0  (was: 3.3.0)

moving this to 3.4.

> zookeeper c client should shutdown if it sees its zxid is too high from all 
> the server its connecting to.
> -
>
> Key: ZOOKEEPER-523
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-523
> Project: Zookeeper
>  Issue Type: Bug
>Reporter: Mahadev konar
> Fix For: 3.4.0
>
>
> In one scenario, one of our users cleaned up the server database, 
> upgraded the ZooKeeper servers from 2.* to 3.*, and did not shut down their 
> clients. The clients kept spinning since they couldn't find a server that was 
> up to date. Though this was a mistake on the user's side, the spinning 
> clients caused more problems (like the ZooKeeper server running out of file 
> handles, since the clients kept spinning through servers). In such a case we 
> should shut down the clients, since this should never happen.




[jira] Updated: (ZOOKEEPER-522) zookeeper client should throttle if its not able to connect to any of the servers.

2010-03-05 Thread Mahadev konar (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahadev konar updated ZOOKEEPER-522:


Fix Version/s: 3.4.0  (was: 3.3.0)

moving this to 3.4

> zookeeper client should throttle if its not able to connect to any of the 
> servers.
> --
>
> Key: ZOOKEEPER-522
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-522
> Project: Zookeeper
>  Issue Type: Improvement
>Affects Versions: 3.2.0
>Reporter: Mahadev konar
> Fix For: 3.4.0
>
>
> Currently the ZooKeeper client library keeps trying to connect to servers even 
> if all of them are unreachable. It goes through the list time and again and 
> tries to connect. Sometimes this causes problems: too many clients retry 
> connecting to servers (which may themselves be broken or slow), give up, and 
> try reconnecting to other servers. 
> This causes a huge churn in client connections, sometimes leading to the 
> ZooKeeper server running out of file handles.
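
The issue does not say how the client should throttle. One common approach, assumed here purely for illustration and not proposed in the JIRA, is capped exponential backoff with jitter, applied after each full pass through the server list fails. All names below are hypothetical.

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical illustration, not part of the ZooKeeper client.
class ReconnectBackoffSketch {
    // Delay before the next pass through the server list, after failedRounds
    // complete passes in which no server could be reached.
    static long backoffMs(int failedRounds, long baseMs, long maxMs) {
        int shift = Math.min(failedRounds, 20);   // bound the shift so modest baseMs values cannot overflow
        return Math.min(maxMs, baseMs << shift);  // double per failed round, capped at maxMs
    }

    // Spread clients out uniformly in [delayMs/2, delayMs] so they do not retry in lockstep.
    static long jittered(long delayMs) {
        long half = delayMs / 2;
        return half + ThreadLocalRandom.current().nextLong(delayMs - half + 1);
    }
}
```

With baseMs = 100 and maxMs = 10000, the delays grow 100, 200, 400, ... and settle at 10 seconds, which bounds the connection churn each client can cause while the servers are unreachable.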




[jira] Updated: (ZOOKEEPER-688) explain session expiration better in the docs & faq

2010-03-05 Thread Mahadev konar (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahadev konar updated ZOOKEEPER-688:


  Resolution: Fixed
Hadoop Flags: [Reviewed]
  Status: Resolved  (was: Patch Available)

I just committed this. Thanks, Pat!

> explain session expiration better in the docs & faq
> ---
>
> Key: ZOOKEEPER-688
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-688
> Project: Zookeeper
>  Issue Type: Bug
>  Components: documentation
>Reporter: Patrick Hunt
>Assignee: Patrick Hunt
>Priority: Critical
> Fix For: 3.3.0
>
> Attachments: ZOOKEEPER-688.patch, ZOOKEEPER-688.patch
>
>
> We are not clear enough (and the diagram we do have seems misleading) on 
> _when_ session expirations are generated. In particular the fact that you 
> only get expirations when the client is connected to the cluster, not when 
> disconnected.
> we need to detail:
> 1) when do you get expiration
> 2) what is the sequence of events that the watcher sees, from disco state, to 
> getting the expiration (say the expiration happens when the client is disco, 
> what do you see in the watcher while you are getting reconnected)
> 3) we need to give some examples of how to test this. We should be explicit 
> that "pulling the network cable" on the client will not show expiration since 
> the client will not be reconnected.




[jira] Commented: (ZOOKEEPER-688) explain session expiration better in the docs & faq

2010-03-05 Thread Mahadev konar (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12842046#action_12842046
 ] 

Mahadev konar commented on ZOOKEEPER-688:
-

+1 for the patch. I'll update the FAQ to match what's in the Forrest docs.

> explain session expiration better in the docs & faq
> ---
>
> Key: ZOOKEEPER-688
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-688
> Project: Zookeeper
>  Issue Type: Bug
>  Components: documentation
>Reporter: Patrick Hunt
>Assignee: Patrick Hunt
>Priority: Critical
> Fix For: 3.3.0
>
> Attachments: ZOOKEEPER-688.patch, ZOOKEEPER-688.patch
>
>
> We are not clear enough (and the diagram we do have seems misleading) on 
> _when_ session expirations are generated. In particular the fact that you 
> only get expirations when the client is connected to the cluster, not when 
> disconnected.
> we need to detail:
> 1) when do you get expiration
> 2) what is the sequence of events that the watcher sees, from disco state, to 
> getting the expiration (say the expiration happens when the client is disco, 
> what do you see in the watcher while you are getting reconnected)
> 3) we need to give some examples of how to test this. We should be explicit 
> that "pulling the network cable" on the client will not show expiration since 
> the client will not be reconnected.




[jira] Updated: (ZOOKEEPER-688) explain session expiration better in the docs & faq

2010-03-05 Thread Mahadev konar (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahadev konar updated ZOOKEEPER-688:


Attachment: ZOOKEEPER-688.patch

Just updated the patch with minor changes on automatic reconnects by the client 
library.

> explain session expiration better in the docs & faq
> ---
>
> Key: ZOOKEEPER-688
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-688
> Project: Zookeeper
>  Issue Type: Bug
>  Components: documentation
>Reporter: Patrick Hunt
>Assignee: Patrick Hunt
>Priority: Critical
> Fix For: 3.3.0
>
> Attachments: ZOOKEEPER-688.patch, ZOOKEEPER-688.patch
>
>
> We are not clear enough (and the diagram we do have seems misleading) on 
> _when_ session expirations are generated. In particular the fact that you 
> only get expirations when the client is connected to the cluster, not when 
> disconnected.
> we need to detail:
> 1) when do you get expiration
> 2) what is the sequence of events that the watcher sees, from disco state, to 
> getting the expiration (say the expiration happens when the client is disco, 
> what do you see in the watcher while you are getting reconnected)
> 3) we need to give some examples of how to test this. We should be explicit 
> that "pulling the network cable" on the client will not show expiration since 
> the client will not be reconnected.




[jira] Updated: (ZOOKEEPER-663) hudson failure in ZKDatabaseCorruptionTest

2010-03-05 Thread Mahadev konar (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahadev konar updated ZOOKEEPER-663:


Attachment: ZOOKEEPER-663.patch

This patch fixes the logging to mention which file is corrupted and adds 
Forrest docs on handling this kind of failure.

> hudson failure in ZKDatabaseCorruptionTest
> --
>
> Key: ZOOKEEPER-663
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-663
> Project: Zookeeper
>  Issue Type: Bug
>  Components: server
>Reporter: Patrick Hunt
>Assignee: Mahadev konar
>Priority: Critical
> Fix For: 3.3.0
>
> Attachments: ZOOKEEPER-663.patch
>
>
> http://hudson.zones.apache.org/hudson/job/ZooKeeper-trunk/686/
> java.lang.RuntimeException: Unable to run quorum server 
>   at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:380)
>   at org.apache.zookeeper.test.ZkDatabaseCorruptionTest.testCorruption(ZkDatabaseCorruptionTest.java:99)
> Caused by: java.io.IOException: Invalid magic number 0 != 1514884167
>   at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.inStreamCreated(FileTxnLog.java:455)
>   at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.createInputArchive(FileTxnLog.java:471)
>   at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.goToNextLog(FileTxnLog.java:438)
>   at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:519)
>   at org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:145)
>   at org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:193)
>   at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:377)




[jira] Updated: (ZOOKEEPER-663) hudson failure in ZKDatabaseCorruptionTest

2010-03-05 Thread Mahadev konar (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahadev konar updated ZOOKEEPER-663:


Status: Patch Available  (was: Open)

> hudson failure in ZKDatabaseCorruptionTest
> --
>
> Key: ZOOKEEPER-663
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-663
> Project: Zookeeper
>  Issue Type: Bug
>  Components: server
>Reporter: Patrick Hunt
>Assignee: Mahadev konar
>Priority: Critical
> Fix For: 3.3.0
>
> Attachments: ZOOKEEPER-663.patch
>
>
> http://hudson.zones.apache.org/hudson/job/ZooKeeper-trunk/686/
> java.lang.RuntimeException: Unable to run quorum server 
>   at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:380)
>   at org.apache.zookeeper.test.ZkDatabaseCorruptionTest.testCorruption(ZkDatabaseCorruptionTest.java:99)
> Caused by: java.io.IOException: Invalid magic number 0 != 1514884167
>   at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.inStreamCreated(FileTxnLog.java:455)
>   at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.createInputArchive(FileTxnLog.java:471)
>   at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.goToNextLog(FileTxnLog.java:438)
>   at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:519)
>   at org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:145)
>   at org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:193)
>   at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:377)




[jira] Commented: (ZOOKEEPER-663) hudson failure in ZKDatabaseCorruptionTest

2010-03-05 Thread Mahadev konar (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12841977#action_12841977
 ] 

Mahadev konar commented on ZOOKEEPER-663:
-

This looks like a quorum peer was creating a new txn log file and was shut down 
in the middle of that. This probably led to corruption of txnlogs in the data 
directory of one of the quorum peers. We actually do not have a good story for 
corruption of the transaction logs. Currently we depend on admins 
manually going to the node and making decisions on how to resolve this.

As a part of this jira we can add documentation in the Forrest docs for now, on 
how to deal with such situations. Also, the logging needs to change to point 
out which file was corrupted.
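
For context, the expected value 1514884167 in the stack trace below is the four ASCII bytes "ZKLG" read as a big-endian int, so a header that reads as 0 is consistent with a log file that was created but never had its header written. The following Java sketch shows a header check whose error message names the offending file. It is an illustration under those assumptions, not the attached patch; the class, method names, and message wording are hypothetical.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration, not the ZOOKEEPER-663 patch.
class TxnLogMagicSketch {
    // "ZKLG" as a big-endian int equals 1514884167, the value in the stack trace.
    static final int TXNLOG_MAGIC =
            ByteBuffer.wrap("ZKLG".getBytes(StandardCharsets.US_ASCII)).getInt();

    // Returns null if the header is valid, otherwise a message naming the file.
    static String validateHeader(byte[] firstBytes, String fileName) {
        if (firstBytes.length < 4) {
            return "Truncated header in " + fileName;
        }
        int magic = ByteBuffer.wrap(firstBytes, 0, 4).getInt();
        if (magic != TXNLOG_MAGIC) {
            return "Invalid magic number " + magic + " != " + TXNLOG_MAGIC + " in " + fileName;
        }
        return null;
    }

    // Reads the first four bytes of a log file and validates them.
    static String validateFile(File logFile) throws IOException {
        byte[] header = new byte[4];
        try (DataInputStream in = new DataInputStream(new FileInputStream(logFile))) {
            in.readFully(header);
        } catch (EOFException e) {
            return "Truncated header in " + logFile;
        }
        return validateHeader(header, logFile.getPath());
    }
}
```

With the file name in the message, an admin looking at the log knows which txnlog to inspect or move aside before restarting the peer.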

> hudson failure in ZKDatabaseCorruptionTest
> --
>
> Key: ZOOKEEPER-663
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-663
> Project: Zookeeper
>  Issue Type: Bug
>  Components: server
>Reporter: Patrick Hunt
>Assignee: Mahadev konar
>Priority: Critical
> Fix For: 3.3.0
>
>
> http://hudson.zones.apache.org/hudson/job/ZooKeeper-trunk/686/
> java.lang.RuntimeException: Unable to run quorum server 
>   at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:380)
>   at org.apache.zookeeper.test.ZkDatabaseCorruptionTest.testCorruption(ZkDatabaseCorruptionTest.java:99)
> Caused by: java.io.IOException: Invalid magic number 0 != 1514884167
>   at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.inStreamCreated(FileTxnLog.java:455)
>   at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.createInputArchive(FileTxnLog.java:471)
>   at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.goToNextLog(FileTxnLog.java:438)
>   at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:519)
>   at org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:145)
>   at org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:193)
>   at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:377)




[jira] Commented: (ZOOKEEPER-688) explain session expiration better in the docs & faq

2010-03-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12841970#action_12841970
 ] 

Hadoop QA commented on ZOOKEEPER-688:
-

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12438030/ZOOKEEPER-688.patch
  against trunk revision 919280.

+1 @author.  The patch does not contain any @author tags.

+0 tests included.  The patch appears to be a documentation patch that 
doesn't require tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/128/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/128/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/128/console

This message is automatically generated.

> explain session expiration better in the docs & faq
> ---
>
> Key: ZOOKEEPER-688
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-688
> Project: Zookeeper
>  Issue Type: Bug
>  Components: documentation
>Reporter: Patrick Hunt
>Assignee: Patrick Hunt
>Priority: Critical
> Fix For: 3.3.0
>
> Attachments: ZOOKEEPER-688.patch
>
>
> We are not clear enough (and the diagram we do have seems misleading) on 
> _when_ session expirations are generated. In particular the fact that you 
> only get expirations when the client is connected to the cluster, not when 
> disconnected.
> we need to detail:
> 1) when do you get expiration
> 2) what is the sequence of events that the watcher sees, from disco state, to 
> getting the expiration (say the expiration happens when the client is disco, 
> what do you see in the watcher while you are getting reconnected)
> 3) we need to give some examples of how to test this. We should be explicit 
> that "pulling the network cable" on the client will not show expiration since 
> the client will not be reconnected.




[jira] Updated: (ZOOKEEPER-688) explain session expiration better in the docs & faq

2010-03-05 Thread Patrick Hunt (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Hunt updated ZOOKEEPER-688:
---

Status: Patch Available  (was: Open)

I added more detail on how session expiration works.

> explain session expiration better in the docs & faq
> ---
>
> Key: ZOOKEEPER-688
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-688
> Project: Zookeeper
>  Issue Type: Bug
>  Components: documentation
>Reporter: Patrick Hunt
>Assignee: Patrick Hunt
>Priority: Critical
> Fix For: 3.3.0
>
> Attachments: ZOOKEEPER-688.patch
>
>
> We are not clear enough (and the diagram we do have seems misleading) on 
> _when_ session expirations are generated. In particular the fact that you 
> only get expirations when the client is connected to the cluster, not when 
> disconnected.
> we need to detail:
> 1) when do you get expiration
> 2) what is the sequence of events that the watcher sees, from disco state, to 
> getting the expiration (say the expiration happens when the client is disco, 
> what do you see in the watcher while you are getting reconnected)
> 3) we need to give some examples of how to test this. We should be explicit 
> that "pulling the network cable" on the client will not show expiration since 
> the client will not be reconnected.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Assigned: (ZOOKEEPER-688) explain session expiration better in the docs & faq

2010-03-05 Thread Patrick Hunt (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Hunt reassigned ZOOKEEPER-688:
--

Assignee: Patrick Hunt  (was: Benjamin Reed)

> explain session expiration better in the docs & faq
> ---
>
> Key: ZOOKEEPER-688
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-688
> Project: Zookeeper
>  Issue Type: Bug
>  Components: documentation
>Reporter: Patrick Hunt
>Assignee: Patrick Hunt
>Priority: Critical
> Fix For: 3.3.0
>
> Attachments: ZOOKEEPER-688.patch
>
>
> We are not clear enough (and the diagram we do have seems misleading) on 
> _when_ session expirations are generated. In particular the fact that you 
> only get expirations when the client is connected to the cluster, not when 
> disconnected.
> we need to detail:
> 1) when do you get expiration
> 2) what is the sequence of events that the watcher sees, from disco state, to 
> getting the expiration (say the expiration happens when the client is disco, 
> what do you see in the watcher while you are getting reconnected)
> 3) we need to give some examples of how to test this. We should be explicit 
> that "pulling the network cable" on the client will not show expiration since 
> the client will not be reconnected.




[jira] Updated: (ZOOKEEPER-688) explain session expiration better in the docs & faq

2010-03-05 Thread Patrick Hunt (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Hunt updated ZOOKEEPER-688:
---

Attachment: ZOOKEEPER-688.patch

> explain session expiration better in the docs & faq
> ---
>
> Key: ZOOKEEPER-688
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-688
> Project: Zookeeper
>  Issue Type: Bug
>  Components: documentation
>Reporter: Patrick Hunt
>Assignee: Benjamin Reed
>Priority: Critical
> Fix For: 3.3.0
>
> Attachments: ZOOKEEPER-688.patch
>
>
> We are not clear enough (and the diagram we do have seems misleading) on 
> _when_ session expirations are generated. In particular the fact that you 
> only get expirations when the client is connected to the cluster, not when 
> disconnected.
> we need to detail:
> 1) when do you get expiration
> 2) what is the sequence of events that the watcher sees, from disco state, to 
> getting the expiration (say the expiration happens when the client is disco, 
> what do you see in the watcher while you are getting reconnected)
> 3) we need to give some examples of how to test this. We should be explicit 
> that "pulling the network cable" on the client will not show expiration since 
> the client will not be reconnected.




[jira] Commented: (ZOOKEEPER-591) The C Client cannot exit properly in some situation

2010-03-05 Thread Mahadev konar (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12841941#action_12841941
 ] 

Mahadev konar commented on ZOOKEEPER-591:
-

Ben, did you get a chance to take a look at it?

> The C Client cannot exit properly in some situation
> ---
>
> Key: ZOOKEEPER-591
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-591
> Project: Zookeeper
>  Issue Type: Bug
>  Components: c client
>Affects Versions: 3.2.1
> Environment: Linux db-passport-test05.vm 2.6.9_5-4-0-5 #1 SMP Tue Apr 
> 14 15:56:24 CST 2009 x86_64 x86_64 x86_64 GNU/Linux 
>Reporter: Qian Ye
>Assignee: Benjamin Reed
>Priority: Critical
> Fix For: 3.3.0
>
>
> The following code produces a situation where the C client cannot exit 
> properly:
> #include <stdio.h>
> #include "include/zookeeper.h"
>
> void default_zoo_watcher(zhandle_t *zzh, int type, int state, const char *path, void* context){
>     int zrc = 0;
>     struct String_vector str_vec = {0, NULL};
>     printf("in the default_zoo_watcher\n");
>     zrc = zoo_wget_children(zzh, "/mytest", default_zoo_watcher, NULL, &str_vec);
>     printf("zoo_wget_children, error: %d\n", zrc);
>     return;
> }
>
> int main()
> {
>     int zrc = 0;
>     int buff_len = 10;
>     char buff[10] = "hello";
>     char path[512];
>     struct Stat stat;
>     struct String_vector str_vec = {0, NULL};
>     zhandle_t *zh = zookeeper_init("10.81.20.62:2181", NULL, 3, 0, 0, 0);
>     zrc = zoo_create(zh, "/mytest", buff, 10, &ZOO_OPEN_ACL_UNSAFE, 0, path, 512);
>     printf("zoo_create, error: %d\n", zrc);
>     zrc = zoo_wget_children(zh, "/mytest", default_zoo_watcher, NULL, &str_vec);
>     printf("zoo_wget_children, error: %d\n", zrc);
>     zrc = zoo_create(zh, "/mytest/test1", buff, 10, &ZOO_OPEN_ACL_UNSAFE, 0, path, 512);
>     printf("zoo_create, error: %d\n", zrc);
>     zrc = zoo_wget_children(zh, "/mytest", default_zoo_watcher, NULL, &str_vec);
>     printf("zoo_wget_children, error: %d\n", zrc);
>     zrc = zoo_delete(zh, "/mytest/test1", -1);
>     printf("zoo_delete, error: %d\n", zrc);
>     zookeeper_close(zh);
>     return 0;
> }
>
> Running this code can cause the program to hang at zookeeper_close(zh) (line 
> 38). Attaching gdb to the process, I found that the main thread is waiting 
> for the do_completion thread to finish:
> (gdb) bt
> #0  0x00302b806ffb in pthread_join () from /lib64/tls/libpthread.so.0
> #1  0x0040de3b in adaptor_finish (zh=0x515b60) at src/mt_adaptor.c:219
> #2  0x004060ba in zookeeper_close (zh=0x515b60) at 
> src/zookeeper.c:2100
> #3  0x0040220b in main ()
> and the thread that handles zoo_wget_children (in default_zoo_watcher) 
> is waiting on sc->cond:
> (gdb) thread 2
> [Switching to thread 2 (Thread 1094719840 (LWP 25093))]#0  0x00302b8089aa 
> in pthread_cond_wait@@GLIBC_2.3.2 ()
>from /lib64/tls/libpthread.so.0
> (gdb) bt
> #0  0x00302b8089aa in pthread_cond_wait@@GLIBC_2.3.2 () from 
> /lib64/tls/libpthread.so.0
> #1  0x0040d88b in wait_sync_completion (sc=0x5167f0) at 
> src/mt_adaptor.c:82
> #2  0x004082c9 in zoo_wget_children (zh=0x515b60, path=0x40ebc0 
> "/mytest", watcher=0x401fd8 , watcherCtx=Variable 
> "watcherCtx" is not available.)
> at src/zookeeper.c:2884
> #3  0x00402037 in default_zoo_watcher ()
> #4  0x0040d664 in deliverWatchers (zh=0x515b60, type=4, state=3, 
> path=0x515100 "/mytest", list=0x5177d8) at src/zk_hashtable.c:274
> #5  0x00403861 in process_completions (zh=0x515b60) at 
> src/zookeeper.c:1631
> #6  0x0040e1b5 in do_completion (v=Variable "v" is not available.) at 
> src/mt_adaptor.c:333
> #7  0x00302b80610a in start_thread () from /lib64/tls/libpthread.so.0
> #8  0x00302afc6003 in clone () from /lib64/tls/libc.so.6
> #9  0x in ?? ()
> Here, a deadlock occurs.




[jira] Updated: (ZOOKEEPER-602) log all exceptions not caught by ZK threads

2010-03-05 Thread Mahadev konar (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahadev konar updated ZOOKEEPER-602:


Fix Version/s: 3.4.0  (was: 3.3.0)

moving this issue to 3.4

> log all exceptions not caught by ZK threads
> ---
>
> Key: ZOOKEEPER-602
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-602
> Project: Zookeeper
>  Issue Type: Bug
>  Components: java client, server
>Affects Versions: 3.2.1
>Reporter: Patrick Hunt
>Priority: Critical
> Fix For: 3.4.0
>
>
> The Java code should add a ThreadGroup exception handler that logs at ERROR 
> level any uncaught exceptions thrown by Thread run methods.




[jira] Updated: (ZOOKEEPER-517) NIO factory fails to close connections when the number of file handles run out.

2010-03-05 Thread Mahadev konar (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahadev konar updated ZOOKEEPER-517:


Fix Version/s: 3.4.0  (was: 3.3.0)

> NIO factory fails to close connections when the number of file handles run 
> out.
> ---
>
> Key: ZOOKEEPER-517
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-517
> Project: Zookeeper
>  Issue Type: Bug
>Reporter: Mahadev konar
>Priority: Critical
> Fix For: 3.4.0
>
>
> The code in the NIO factory is such that if we fail to accept a connection for 
> some reason (too many open file handles may be one of them), we do not close 
> the connections that are in CLOSE_WAIT. We need to call an explicit close on 
> these sockets. One solution might be to move doIO 
> before accept so that we can still close connections even if we cannot accept 
> new ones.




Re: Status on upcoming ZK 3.3.0 release

2010-03-05 Thread Patrick Hunt
March 10 is rapidly approaching. There aren't any blockers left, my plan 
is to commit any reviewed "patch availables" and push any remaining 
non-blockers into 3.4.0 on Weds as part of cutting the 3.3.0 release 
candidate. If you have something that you want to get into 3.3.0 you 
need to get the patch in now. Final warning. ;-)


Patrick

Patrick Hunt wrote:
Just a reminder, 3.3.0 is coming up fast. I will re-triage the JIRA list 
 sometime next week with an eye towards reducing the list of "fix for 
3.3.0", pushing non-critical, non-resourced issues to the 3.4.0 release. 
So if you have something you want to get into 3.3.0 that's a non-blocker 
please submit a patch asap.


Patrick

Patrick Hunt wrote:
ZK 3.3.0 is currently slated for March 10th. You can see a JIRA level 
overview here:

https://issues.apache.org/jira/browse/ZOOKEEPER/fixforversion/12313976

Mahadev and I did an initial triage of the 3.3.0 JIRAs today. There 
are currently 77 open issues slated for inclusion in this release, vs 
110 already addressed.


What does this mean for you? If there's a JIRA assigned to you or that 
you created that's listed for 3.3.0 please review it. If you don't 
plan on working on it for 3.3.0 reschedule it (3.4.0 or later), if you 
do plan to work on it please do so (sooner == better). If you want to 
get something into 3.3.0 that's not listed for 3.3.0 please submit a 
patch asap. As the 3.3.0 deadline approaches I will continue triaging 
the issues, in particular I will start pushing out non-blocker JIRAs 
that are not actively being worked on.


Thank you,

Patrick


[jira] Created: (ZOOKEEPER-688) explain session expiration better in the docs & faq

2010-03-05 Thread Patrick Hunt (JIRA)
explain session expiration better in the docs & faq
---

 Key: ZOOKEEPER-688
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-688
 Project: Zookeeper
  Issue Type: Bug
  Components: documentation
Reporter: Patrick Hunt
Assignee: Benjamin Reed
Priority: Critical
 Fix For: 3.3.0


We are not clear enough (and the diagram we do have seems misleading) on _when_ 
session expirations are generated. In particular the fact that you only get 
expirations when the client is connected to the cluster, not when disconnected.

we need to detail:

1) when do you get expiration
2) what is the sequence of events that the watcher sees, from disco state, to 
getting the expiration (say the expiration happens when the client is disco, 
what do you see in the watcher while you are getting reconnected)
3) we need to give some examples of how to test this. We should be explicit 
that "pulling the network cable" on the client will not show expiration since 
the client will not be reconnected.
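
The three points above can be sketched as a small model of what a watcher sees during one outage. This is an illustration only, based on the behaviour described in this issue: the enum is hypothetical and merely mirrors the names of ZooKeeper's KeeperState values, and the real client's timing is more involved.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model for illustration; it does not use the ZooKeeper client API.
class SessionExpirationSketch {
    enum State { DISCONNECTED, SYNC_CONNECTED, EXPIRED }

    // Events a watcher would see for one outage.
    // reconnectAfterMs < 0 models a client that never reconnects (cable pulled for good).
    static List<State> eventsSeen(long sessionTimeoutMs, long reconnectAfterMs) {
        List<State> events = new ArrayList<>();
        events.add(State.DISCONNECTED);           // raised locally when the connection drops
        if (reconnectAfterMs < 0) {
            return events;                        // never reconnected, so no expiration is delivered
        }
        // Only after reconnecting can the cluster tell the client whether its session survived.
        events.add(reconnectAfterMs > sessionTimeoutMs ? State.EXPIRED : State.SYNC_CONNECTED);
        return events;
    }
}
```

To observe an expiration in a test, the client therefore has to get back to the cluster after the session timeout has elapsed; leaving the cable unplugged only ever yields the disconnected event.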





[jira] Commented: (ZOOKEEPER-640) make build.xml more configurable to ease packaging for linux distros

2010-03-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12841813#action_12841813
 ] 

Hudson commented on ZOOKEEPER-640:
--

Integrated in ZooKeeper-trunk #715 (See 
[http://hudson.zones.apache.org/hudson/job/ZooKeeper-trunk/715/])
. make build.xml more configurable to ease packaging for linux distros 
(phunt via mahadev)


> make build.xml more configurable to ease packaging for linux distros
> 
>
> Key: ZOOKEEPER-640
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-640
> Project: Zookeeper
>  Issue Type: Improvement
>  Components: build
>Affects Versions: 3.2.1, 3.2.2
>Reporter: Thomas Koch
>Assignee: Patrick Hunt
> Fix For: 3.3.0
>
> Attachments: ZOOKEEPER-640.patch
>
>   Original Estimate: 0.25h
>  Remaining Estimate: 0.25h
>
> Hi,
> I started packaging Zookeeper for Debian[1][2]. Thereby I had a problem 
> excluding contrib/rest from the build without patching the upstream tarball. 
> Could you please add some properties to your build.xml that allow me to 
> (de)select contribs? In the example below I can easily override the 
> properties:
> 
>   
>   
>   
>   dir="." 
>includes="${contribfilesetincludes}"
>excludes="${contribfilesetexcludes}"
>/>
>   
> 
>  
> 
>   
> Could you please also add a line to project.classpath:
>   
>   
> For Debian I may not compile based on the jar files in lib but must use the 
> jars already in Debian.
> [1] http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=561947
> [2] http://git.debian.org/?p=pkg-java/zookeeper.git
> Thank you!




[jira] Commented: (ZOOKEEPER-687) LENonterminatetest fails on some machines.

2010-03-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12841814#action_12841814
 ] 

Hudson commented on ZOOKEEPER-687:
--

Integrated in ZooKeeper-trunk #715 (See 
[http://hudson.zones.apache.org/hudson/job/ZooKeeper-trunk/715/])
. LENonterminatetest fails on some machines. (mahadev)


> LENonterminatetest fails on some machines.
> --
>
> Key: ZOOKEEPER-687
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-687
> Project: Zookeeper
>  Issue Type: Bug
>Reporter: Mahadev konar
>Assignee: Mahadev konar
> Fix For: 3.3.0
>
> Attachments: ZOOKEEPER-687.patch, ZOOKEEPER-687.patch
>
>
> LENonterminateTest fails with the following error:
> {noformat}
> 2010-03-04 20:26:32,347 - INFO  [Thread-0:leaderelect...@155] - Server 
> address: 0.0.0.0/0.0.0.0:11223
> 2010-03-04 20:26:32,348 - WARN  [Thread-0:leaderelect...@195] - Ignoring 
> exception while looking for leader
> java.io.IOException: Network is unreachable
>   at java.net.PlainDatagramSocketImpl.send(Native Method)
>   at java.net.DatagramSocket.send(DatagramSocket.java:612)
>   at 
> org.apache.zookeeper.server.quorum.LeaderElection.lookForLeader(LeaderElection.java:169)
>   at 
> org.apache.zookeeper.test.LENonTerminateTest$LEThread.run(LENonTerminateTest.java:83)
> {noformat}




[jira] Commented: (ZOOKEEPER-681) Minor doc issue re unset maxClientCnxns

2010-03-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12841811#action_12841811
 ] 

Hudson commented on ZOOKEEPER-681:
--

Integrated in ZooKeeper-trunk #715 (See 
[http://hudson.zones.apache.org/hudson/job/ZooKeeper-trunk/715/])
. Minor doc issue re unset maxClientCnxns (phunt via mahadev)


> Minor doc issue re unset maxClientCnxns
> ---
>
> Key: ZOOKEEPER-681
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-681
> Project: Zookeeper
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 3.2.2
>Reporter: Vegard B. Havdal
>Assignee: Patrick Hunt
>Priority: Blocker
> Fix For: 3.3.0
>
> Attachments: ZOOKEEPER-681.patch
>
>
> Just a small issue, the doc says that "Setting this to 0 or omitting it 
> entirely removes the limit on concurrent connections.", but we ran without 
> this setting, and saw: WARN  
> [NIOServerCxn.Factory:2181:nioservercnxn$fact...@226] - Too many connections 
> from /10.76.251.190 - max is 10
> Bug in doc possibly?
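
The behaviour reported above can be sketched as a small Python helper that 
reads zoo.cfg-style text and returns the limit that would actually apply. This 
is not ZooKeeper code: the function name and the parsing are hypothetical, and 
the default of 10 for an omitted setting is an assumption taken from the 
"max is 10" log line in the report.

```python
# Assumption: an omitted maxClientCnxns falls back to 10, as the
# "max is 10" warning in the report suggests.
ASSUMED_DEFAULT_MAX_CLIENT_CNXNS = 10


def effective_max_client_cnxns(cfg_text):
    """Return the connection limit implied by the config text.

    Returns None when the limit is disabled (explicitly set to 0).
    """
    value = ASSUMED_DEFAULT_MAX_CLIENT_CNXNS
    for raw in cfg_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, val = line.partition("=")
        if sep and key.strip() == "maxClientCnxns":
            value = int(val.strip())
    return None if value == 0 else value


if __name__ == "__main__":
    # Setting omitted: the assumed default applies, not "no limit".
    print(effective_max_client_cnxns("tickTime=2000\nclientPort=2181\n"))
    # Explicit 0: the limit is removed.
    print(effective_max_client_cnxns("clientPort=2181\nmaxClientCnxns=0\n"))
```

Under that assumption, only an explicit 0 removes the limit, which matches 
what the reporter saw and not what the doc sentence says.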




[jira] Commented: (ZOOKEEPER-579) zkpython needs more test coverage for ACL code paths

2010-03-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12841812#action_12841812
 ] 

Hudson commented on ZOOKEEPER-579:
--

Integrated in ZooKeeper-trunk #715 (See 
[http://hudson.zones.apache.org/hudson/job/ZooKeeper-trunk/715/])
. zkpython needs more test coverage for ACL code paths (henry via mahadev)


> zkpython needs more test coverage for ACL code paths
> 
>
> Key: ZOOKEEPER-579
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-579
> Project: Zookeeper
>  Issue Type: Improvement
>  Components: contrib-bindings
>Affects Versions: 3.2.1
>Reporter: Henry Robinson
>Assignee: Henry Robinson
> Fix For: 3.3.0
>
> Attachments: zookeeper-579.patch, zookeeper-579.patch
>
>
> zkpython's tests don't do a good enough job of exercising the ACL code paths. 
> A few new tests that confirm that setACL and friends are working correctly 
> are needed. 
