[jira] Created: (ZOOKEEPER-362) Issues with FLENewEpochTest
Issues with FLENewEpochTest
---
Key: ZOOKEEPER-362
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-362
Project: Zookeeper
Issue Type: Bug
Affects Versions: 3.1.1
Reporter: Flavio Paiva Junqueira
Fix For: 3.2.0

I have been able to identify two reasons that cause FLENewEpochTest to fail:
1- There is a race condition that is triggered when two peers try to establish a connection to each other for leader election. Basically, if they start at roughly the same time, the server with the highest id will try to open two connections. The two competing connections cause one notification message to be lost. This message happens to be critical for this two-process scenario;
2- The code to shut down a peer is not working well with the unit tests. For this particular unit test, we need to be able to shut down a peer completely to check the situation the test tries to reproduce. However, it seems that in some runs timing causes the other peers to believe it is still alive, and they end up electing it. This peer, however, eventually shuts down and leader election fails.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Created: (ZOOKEEPER-363) NullPointerException when reading/recovering from ledgers with 1 entry
NullPointerException when reading/recovering from ledgers with 1 entry --- Key: ZOOKEEPER-363 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-363 Project: Zookeeper Issue Type: Bug Components: contrib-bookkeeper Affects Versions: 3.1.1 Reporter: Luca Telloli Priority: Minor Getting a NullPointerException when reading from a ledger with 1 entry that has not been properly closed. Patch attached
[jira] Updated: (ZOOKEEPER-363) NullPointerException when reading/recovering from ledgers with 1 entry
[ https://issues.apache.org/jira/browse/ZOOKEEPER-363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luca Telloli updated ZOOKEEPER-363: --- Attachment: ZOOKEEPER-363.patch renamed patch previous patch is not a xxx one :)
[jira] Updated: (ZOOKEEPER-363) NullPointerException when reading/recovering from ledgers with 1 entry
[ https://issues.apache.org/jira/browse/ZOOKEEPER-363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luca Telloli updated ZOOKEEPER-363: --- Attachment: ZOOKEEPER-XXX.patch patch attached
[jira] Commented: (ZOOKEEPER-363) NullPointerException when reading/recovering from ledgers with 1 entry
[ https://issues.apache.org/jira/browse/ZOOKEEPER-363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12695333#action_12695333 ] Flavio Paiva Junqueira commented on ZOOKEEPER-363: -- The bug is triggered when the last hint is zero.
[jira] Updated: (ZOOKEEPER-363) NullPointerException when reading/recovering from ledgers with 1 entry
[ https://issues.apache.org/jira/browse/ZOOKEEPER-363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Flavio Paiva Junqueira updated ZOOKEEPER-363: - Assignee: Flavio Paiva Junqueira
Build failed in Hudson: ZooKeeper-trunk #270
See http://hudson.zones.apache.org/hudson/job/ZooKeeper-trunk/270/changes
Changes:
[mahadev] ZOOKEEPER-60. Get cppunit tests running as part of Hudson CI. (girish via mahadev)
--
[...truncated 55836 lines...]
[junit] 2009-04-03 12:15:37,169 - INFO [main:finalrequestproces...@268] - shutdown of request processor complete
[junit] 2009-04-03 12:15:37,169 - INFO [ProcessThread:-1:preprequestproces...@111] - PrepRequestProcessor exited loop!
[junit] 2009-04-03 12:15:37,169 - INFO [SyncThread:0:syncrequestproces...@119] - SyncRequestProcessor exited!
[junit] 2009-04-03 12:15:37,268 - INFO [main:clientb...@306] - STARTING server
[junit] 2009-04-03 12:15:37,268 - INFO [main:zookeeperser...@160] - Created server
[junit] 2009-04-03 12:15:37,269 - INFO [main:files...@71] - Reading snapshot http://hudson.zones.apache.org/hudson/job/ZooKeeper-trunk/ws/trunk/build/test/tmp/test1541817150363981717.junit.dir/version-2/snapshot.0
[junit] 2009-04-03 12:15:37,270 - INFO [main:filetxnsnap...@198] - Snapshotting: 3
[junit] 2009-04-03 12:15:37,272 - INFO [NIOServerCxn.Factory:33221:nioserverc...@635] - Processing stat command from /127.0.0.1:35997
[junit] 2009-04-03 12:15:37,272 - WARN [NIOServerCxn.Factory:33221:nioserverc...@431] - Exception causing close of session 0x0 due to java.io.IOException: Responded to info probe
[junit] 2009-04-03 12:15:37,273 - INFO [NIOServerCxn.Factory:33221:nioserverc...@766] - closing session:0x0 NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/127.0.0.1:33221 remote=/127.0.0.1:35997]
[junit] 2009-04-03 12:15:38,466 - INFO [main-SendThread:clientcnxn$sendthr...@800] - Attempting connection to server /127.0.0.1:33221
[junit] 2009-04-03 12:15:38,466 - INFO [main-SendThread:clientcnxn$sendthr...@716] - Priming connection to java.nio.channels.SocketChannel[connected local=/127.0.0.1:35998 remote=/127.0.0.1:33221]
[junit] 2009-04-03 12:15:38,466 - INFO [main-SendThread:clientcnxn$sendthr...@868] - Server connection successful
[junit] 2009-04-03 12:15:38,466 - INFO [NIOServerCxn.Factory:33221:nioserverc...@517] - Connected to /127.0.0.1:35998 lastZxid 3
[junit] 2009-04-03 12:15:38,467 - INFO [NIOServerCxn.Factory:33221:nioserverc...@895] - Finished init of 0x1206be7d6c0 valid:true
[junit] 2009-04-03 12:15:38,467 - INFO [NIOServerCxn.Factory:33221:nioserverc...@545] - Renewing session 0x1206be7d6c0
[junit] 2009-04-03 12:15:39,000 - INFO [SessionTracker:sessiontrackeri...@142] - SessionTrackerImpl exited loop!
[junit] 2009-04-03 12:15:39,000 - INFO [SessionTracker:sessiontrackeri...@142] - SessionTrackerImpl exited loop!
[junit] 2009-04-03 12:16:12,481 - INFO [main:clientb...@300] - STOPPING server
[junit] 2009-04-03 12:16:12,481 - INFO [main:nioserverc...@766] - closing session:0x1206be7d6c0 NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/127.0.0.1:33221 remote=/127.0.0.1:35998]
[junit] 2009-04-03 12:16:12,482 - WARN [main-SendThread:clientcnxn$sendthr...@898] - Exception closing session 0x1206be7d6c0 to sun.nio.ch.selectionkeyi...@68cb6b
[junit] java.io.IOException: Read error rc = -1 java.nio.DirectByteBuffer[pos=0 lim=4 cap=4]
[junit] at org.apache.zookeeper.ClientCnxn$SendThread.doIO(ClientCnxn.java:632)
[junit] at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:876)
[junit] 2009-04-03 12:16:12,482 - INFO [NIOServerCxn.Factory:33221:nioservercnxn$fact...@177] - NIOServerCnxn factory exited run method
[junit] 2009-04-03 12:16:12,483 - INFO [main:finalrequestproces...@268] - shutdown of request processor complete
[junit] 2009-04-03 12:16:12,483 - INFO [ProcessThread:-1:preprequestproces...@111] - PrepRequestProcessor exited loop!
[junit] 2009-04-03 12:16:12,483 - INFO [SyncThread:0:syncrequestproces...@119] - SyncRequestProcessor exited!
[junit] 2009-04-03 12:16:12,582 - INFO [main:clientb...@306] - STARTING server
[junit] 2009-04-03 12:16:12,583 - INFO [main:zookeeperser...@160] - Created server
[junit] 2009-04-03 12:16:12,584 - INFO [main:files...@71] - Reading snapshot http://hudson.zones.apache.org/hudson/job/ZooKeeper-trunk/ws/trunk/build/test/tmp/test1541817150363981717.junit.dir/version-2/snapshot.3
[junit] 2009-04-03 12:16:12,586 - INFO [main:filetxnsnap...@198] - Snapshotting: 5
[junit] 2009-04-03 12:16:12,588 - INFO [NIOServerCxn.Factory:33221:nioserverc...@635] - Processing stat command from /127.0.0.1:36000
[junit] 2009-04-03 12:16:12,588 - WARN [NIOServerCxn.Factory:33221:nioserverc...@431] - Exception causing close of session 0x0 due to java.io.IOException: Responded to info probe
[junit] 2009-04-03 12:16:12,589 - INFO [NIOServerCxn.Factory:33221:nioserverc...@766] - closing session:0x0 NIOServerCnxn: java.nio.channels.SocketChannel[connected local=
[jira] Updated: (ZOOKEEPER-362) Issues with FLENewEpochTest
[ https://issues.apache.org/jira/browse/ZOOKEEPER-362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Flavio Paiva Junqueira updated ZOOKEEPER-362: - Attachment: ZOOKEEPER-362.patch
This patch fixes the problem in the description. More concretely, it does the following:
1- It synchronizes QuorumCnxManager::connectOne so that there are no competing connections to the same server;
2- It doesn't remove an existing connection in QuorumCnxManager::receiveConnection when winning the challenge;
3- It eliminates the second definition of "ss" in QuorumCnxManager::Listener. This was a pretty silly bug (my fault of course);
4- It adds a deadline to semaphores in FLENewEpochTest so that it doesn't wait indefinitely;
5- If thread 0 finishes before thread 1, then thread 1 initiates a new round after waiting for 1s. This is what happens in a real deployment, as a follower gives up on its elected leader if the elected leader takes too long to acknowledge its leadership. As we don't run the follower/leader part of the code in this test, moving to the next round doesn't happen automatically.
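Items 1 and 4 of the ZOOKEEPER-362 patch description (synchronizing QuorumCnxManager::connectOne, and giving the semaphores in FLENewEpochTest a deadline) might look roughly like the sketch below. This is not the attached patch: the class ElectionSketch, its fields, and the helper awaitWithDeadline are hypothetical names made up for illustration. Only the synchronized keyword and java.util.concurrent.Semaphore.tryAcquire(long, TimeUnit) are standard Java.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a connection manager; not ZooKeeper's QuorumCnxManager.
final class ElectionSketch {
    // Server id -> placeholder for an open connection (assumption: one per peer).
    private final Map<Long, String> connections = new HashMap<>();
    private int opened = 0;

    // synchronized makes the check-then-open step atomic, so two threads
    // cannot both decide to open a connection to the same server.
    synchronized boolean connectOne(long sid) {
        if (connections.containsKey(sid)) {
            return false; // a connection to this server already exists
        }
        connections.put(sid, "conn-" + sid);
        opened++;
        return true;
    }

    synchronized int openedCount() {
        return opened;
    }

    // Waits for a permit, but gives up after timeoutMs instead of blocking forever.
    static boolean awaitWithDeadline(Semaphore s, long timeoutMs) {
        try {
            return s.tryAcquire(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        ElectionSketch mgr = new ElectionSketch();
        Thread a = new Thread(() -> mgr.connectOne(1L));
        Thread b = new Thread(() -> mgr.connectOne(1L));
        a.start();
        b.start();
        a.join();
        b.join();
        System.out.println("connections opened: " + mgr.openedCount()); // 1

        Semaphore neverReleased = new Semaphore(0);
        System.out.println("acquired: " + awaitWithDeadline(neverReleased, 100)); // false
    }
}
```

With a deadline, a test thread whose peer never signals returns false and can fail the test explicitly rather than hanging the build.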
[jira] Updated: (ZOOKEEPER-362) Issues with FLENewEpochTest
[ https://issues.apache.org/jira/browse/ZOOKEEPER-362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Flavio Paiva Junqueira updated ZOOKEEPER-362: - Status: Patch Available (was: Open)
[jira] Updated: (ZOOKEEPER-363) NPE when recovering ledger with no hint
[ https://issues.apache.org/jira/browse/ZOOKEEPER-363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Flavio Paiva Junqueira updated ZOOKEEPER-363: -
Description: When recovering a ledger, LedgerRecoveryMonitor currently starts from the entry preceding the hint. If the hint is zero, this causes an access outside the bounds of the bookie array in QuorumEngine, leading to the mentioned NPE.
was: Getting a NullPointerException when reading from a ledger with 1 entry that has not been properly closed. Patch attached
Priority: Major (was: Minor)
Fix Version/s: 3.2.0
Summary: NPE when recovering ledger with no hint (was: NullPointerException when reading/recovering from ledgers with 1 entry)
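The boundary case in the ZOOKEEPER-363 description can be illustrated with a small sketch. This is not the attached patch and does not use the BookKeeper classes: RecoveryStartSketch and its methods are hypothetical, and the modulo mapping from entry id to bookie index is only an assumption made for the example. It shows how "the entry preceding the hint" becomes -1 when the hint is zero, and how clamping the start to zero avoids an invalid index.

```java
// Hypothetical illustration only; not LedgerRecoveryMonitor or QuorumEngine code.
final class RecoveryStartSketch {
    // "Entry preceding the hint", with no guard: yields -1 for hint == 0.
    static long unclampedStart(long hint) {
        return hint - 1;
    }

    // Same computation, but never below the first valid entry id (0).
    static long clampedStart(long hint) {
        return Math.max(0L, hint - 1);
    }

    // Assumed mapping from an entry id to a slot in a bookie array.
    // In Java, -1 % n is -1, which is not a valid array index.
    static int bookieIndex(long entryId, int bookieCount) {
        return (int) (entryId % bookieCount);
    }

    public static void main(String[] args) {
        String[] bookies = {"bookie-0", "bookie-1", "bookie-2"};
        long hint = 0L;

        // Unguarded start maps to an invalid index.
        System.out.println(bookieIndex(unclampedStart(hint), bookies.length)); // -1

        // Clamped start maps to a valid slot.
        System.out.println(bookies[bookieIndex(clampedStart(hint), bookies.length)]); // bookie-0
    }
}
```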
[jira] Updated: (ZOOKEEPER-363) NPE when recovering ledger with no hint
[ https://issues.apache.org/jira/browse/ZOOKEEPER-363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Flavio Paiva Junqueira updated ZOOKEEPER-363: - Attachment: ZOOKEEPER-363.patch This patch also includes a new unit test.
[jira] Updated: (ZOOKEEPER-363) NPE when recovering ledger with no hint
[ https://issues.apache.org/jira/browse/ZOOKEEPER-363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Flavio Paiva Junqueira updated ZOOKEEPER-363: - Attachment: (was: ZOOKEEPER-XXX.patch)
[jira] Updated: (ZOOKEEPER-363) NPE when recovering ledger with no hint
[ https://issues.apache.org/jira/browse/ZOOKEEPER-363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Flavio Paiva Junqueira updated ZOOKEEPER-363: - Status: Patch Available (was: Open)
[jira] Commented: (ZOOKEEPER-362) Issues with FLENewEpochTest
[ https://issues.apache.org/jira/browse/ZOOKEEPER-362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12695456#action_12695456 ] Hadoop QA commented on ZOOKEEPER-362: - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12404543/ZOOKEEPER-362.patch against trunk revision 761433. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-vesta.apache.org/12/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-vesta.apache.org/12/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-vesta.apache.org/12/console This message is automatically generated.
[jira] Commented: (ZOOKEEPER-362) Issues with FLENewEpochTest
[ https://issues.apache.org/jira/browse/ZOOKEEPER-362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12695464#action_12695464 ] Benjamin Reed commented on ZOOKEEPER-362: - looks good. can you review the log calls to make sure that they should be info and error?
[jira] Updated: (ZOOKEEPER-362) Issues with FLENewEpochTest
[ https://issues.apache.org/jira/browse/ZOOKEEPER-362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Flavio Paiva Junqueira updated ZOOKEEPER-362: - Attachment: ZOOKEEPER-362.patch Thanks, Ben. I've fixed the log calls in this new patch.
[jira] Updated: (ZOOKEEPER-362) Issues with FLENewEpochTest
[ https://issues.apache.org/jira/browse/ZOOKEEPER-362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Flavio Paiva Junqueira updated ZOOKEEPER-362: - Status: Patch Available (was: Open) Re-submitting...
[jira] Updated: (ZOOKEEPER-362) Issues with FLENewEpochTest
[ https://issues.apache.org/jira/browse/ZOOKEEPER-362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Flavio Paiva Junqueira updated ZOOKEEPER-362: - Status: Open (was: Patch Available)
[jira] Commented: (ZOOKEEPER-363) NPE when recovering ledger with no hint
[ https://issues.apache.org/jira/browse/ZOOKEEPER-363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12695473#action_12695473 ] Hadoop QA commented on ZOOKEEPER-363: - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12404558/ZOOKEEPER-363.patch against trunk revision 761433. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-vesta.apache.org/13/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-vesta.apache.org/13/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-vesta.apache.org/13/console This message is automatically generated.
[jira] Commented: (ZOOKEEPER-362) Issues with FLENewEpochTest
[ https://issues.apache.org/jira/browse/ZOOKEEPER-362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12695477#action_12695477 ] Hadoop QA commented on ZOOKEEPER-362: - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12404562/ZOOKEEPER-362.patch against trunk revision 761433. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-vesta.apache.org/14/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-vesta.apache.org/14/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-vesta.apache.org/14/console This message is automatically generated.
[jira] Commented: (ZOOKEEPER-361) integrate cppunit testing as part of hudson patch process.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12695493#action_12695493 ] Mahadev konar commented on ZOOKEEPER-361: - I don't have access to vesta... Can you grant me access so that I can run the tests on the Hudson machines? > integrate cppunit testing as part of hudson patch process. > -- > > Key: ZOOKEEPER-361 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-361 > Project: Zookeeper > Issue Type: New Feature > Components: build >Affects Versions: 3.0.0, 3.0.1, 3.1.0, 3.1.1 >Reporter: Mahadev konar >Assignee: Giridharan Kesavan > Attachments: zk-361.patch > > > we need to test the c tests as part of our hudson patch testing process. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (ZOOKEEPER-362) Issues with FLENewEpochTest
[ https://issues.apache.org/jira/browse/ZOOKEEPER-362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt reassigned ZOOKEEPER-362: -- Assignee: Flavio Paiva Junqueira > Issues with FLENewEpochTest > --- > > Key: ZOOKEEPER-362 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-362 > Project: Zookeeper > Issue Type: Bug >Affects Versions: 3.1.1 >Reporter: Flavio Paiva Junqueira >Assignee: Flavio Paiva Junqueira > Fix For: 3.2.0 > > Attachments: ZOOKEEPER-362.patch, ZOOKEEPER-362.patch > > > I have been able to identify two reasons that cause FLENewEpochTest to fail: > 1- There is a race condition that is triggered when two peers try to > establish a connection to each other for leader election. Basically, if they > start roughly at the same time, the server with highest id will try to open > two connections. The two competing connections will lead to one notification > message to be lost. This message happens to be critical for this two process > scenario; > 2- The code to shut down a peer is not working well with the unit tests. For > this particular unit test, we need to be able to shut down a peer completely > to check the situation the test tries to reproduce. However, it seems that in > some runs timing causes the other peers to believe it is still alive, and end > up electing it. This peer, however, eventually shuts down and leader election > fails. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (ZOOKEEPER-364) command line interface for zookeeper.
command line interface for zookeeper. - Key: ZOOKEEPER-364 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-364 Project: Zookeeper Issue Type: New Feature Reporter: Mahadev konar currently we have a shell based interface for zookeeper (which again isnt well published). we should have a cli based interface for zookeeper. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-364) command line interface for zookeeper.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mahadev konar updated ZOOKEEPER-364: Description: currently we have a shell based interface for zookeeper (which again isnt well published). we should have a wee published cli based interface for zookeeper. (was: currently we have a shell based interface for zookeeper (which again isnt well published). we should have a cli based interface for zookeeper.) > command line interface for zookeeper. > - > > Key: ZOOKEEPER-364 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-364 > Project: Zookeeper > Issue Type: New Feature >Affects Versions: 3.0.0, 3.0.1, 3.1.0, 3.1.1 >Reporter: Mahadev konar > Fix For: 3.2.0 > > > currently we have a shell based interface for zookeeper (which again isnt > well published). we should have a wee published cli based interface for > zookeeper. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-364) command line interface for zookeeper.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mahadev konar updated ZOOKEEPER-364: Description: currently we have a shell based interface for zookeeper (which again isnt well published). we should have a well published cli based interface for zookeeper. (was: currently we have a shell based interface for zookeeper (which again isnt well published). we should have a wee published cli based interface for zookeeper.) > command line interface for zookeeper. > - > > Key: ZOOKEEPER-364 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-364 > Project: Zookeeper > Issue Type: New Feature >Affects Versions: 3.0.0, 3.0.1, 3.1.0, 3.1.1 >Reporter: Mahadev konar > Fix For: 3.2.0 > > > currently we have a shell based interface for zookeeper (which again isnt > well published). we should have a well published cli based interface for > zookeeper. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-364) command line interface for zookeeper.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mahadev konar updated ZOOKEEPER-364: Affects Version/s: 3.0.0 3.0.1 3.1.0 3.1.1 Fix Version/s: 3.2.0 > command line interface for zookeeper. > - > > Key: ZOOKEEPER-364 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-364 > Project: Zookeeper > Issue Type: New Feature >Affects Versions: 3.0.0, 3.0.1, 3.1.0, 3.1.1 >Reporter: Mahadev konar > Fix For: 3.2.0 > > > currently we have a shell based interface for zookeeper (which again isnt > well published). we should have a well published cli based interface for > zookeeper. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (ZOOKEEPER-364) command line interface for zookeeper.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12695515#action_12695515 ] Patrick Hunt commented on ZOOKEEPER-364: ZOOKEEPER-36 allows for REST based tools to be developed. however this is implemented as a proxy, rather than a core part of the server. so I don't think we should rely on rest for implementing clients (at least default CLI), even though things like python work really well (there's an example tree dumper as part of 36) Also implementing in Java seems to result in significant overhead - ie starting a jvm for each command execution. Perhaps C? Similar to the cli.c we currently have, but a set of command line tools rather than a shell. I'm not sure c would be my first choice, but it would be nice from the perspective of exercising/testing our c binding. Perhaps Perl? Chris has done a great job with the perl binding... however that would put a requirement on external toolset not shipped with the release (cpan). Perhaps some other lang, like python? Basically this forces us to implement python bindings on top of the c intf. So the benefits of exercising c, plus we get python binding, plus the benefits of implementing the commands using a scripting lang? > command line interface for zookeeper. > - > > Key: ZOOKEEPER-364 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-364 > Project: Zookeeper > Issue Type: New Feature >Affects Versions: 3.0.0, 3.0.1, 3.1.0, 3.1.1 >Reporter: Mahadev konar > Fix For: 3.2.0 > > > currently we have a shell based interface for zookeeper (which again isnt > well published). we should have a well published cli based interface for > zookeeper. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (ZOOKEEPER-348) Creating node with path ending in "/" with sequence flag set
[ https://issues.apache.org/jira/browse/ZOOKEEPER-348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt reassigned ZOOKEEPER-348: -- Assignee: Patrick Hunt > Creating node with path ending in "/" with sequence flag set > > > Key: ZOOKEEPER-348 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-348 > Project: Zookeeper > Issue Type: Bug > Components: c client >Affects Versions: 3.1.0, 3.1.1 >Reporter: Jeff Terrace >Assignee: Patrick Hunt >Priority: Minor > Fix For: 3.2.0 > > > In 3.0.1, I could create a sequence node like this: > /nodes/001 > like this: > string path = "/nodes/"; > string value = "data"; > int rc = zoo_acreate(zh, path.c_str(), value.c_str(), value.length(), > &ZOO_OPEN_ACL_UNSAFE, ZOO_EPHEMERAL | ZOO_SEQUENCE, &czoo_created, &where); > In 3.1.1, this fails with error -8 (ZBADARGUMENTS). > Adding something after the "/" in the path makes the code work fine: > string path = "/nodes/n"; > I assume something is checking if the path ends in "/" but not checking the > sequence flag. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (ZOOKEEPER-355) make validatePath non public in Zookeeper client api.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt reassigned ZOOKEEPER-355: -- Assignee: Patrick Hunt > make validatePath non public in Zookeeper client api. > -- > > Key: ZOOKEEPER-355 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-355 > Project: Zookeeper > Issue Type: Bug >Affects Versions: 3.1.0, 3.1.1 >Reporter: Mahadev konar >Assignee: Patrick Hunt > Fix For: 3.2.0 > > > make validatePath non public in Zookeeper client api. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (ZOOKEEPER-30) Hooks for atomic broadcast protocol
[ https://issues.apache.org/jira/browse/ZOOKEEPER-30?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12695589#action_12695589 ]

Andrew Carman commented on ZOOKEEPER-30:

We'd be happy to show you what we've got, but we don't think we can deliver it as a patch. We've deleted a large number of files, touched every file in zab, zab/quorum, and zab/persistence, and changed a lot of the jute generated code. We're looking at a way to get you public read access to our repository, but until then is there some other way we could get it to you?

We talked with Jean-Luc today and we all thought it might be a good idea for us to come back up to LinkedIn at the end of the semester to present our work to them. Would the Zookeeper team be available to sit in on our presentation and possibly do a code review (like we did last fall, but reversed) on May 11, 12, or 13th?

Lastly, we think we've come up with a solution to the returning zxid's problem. Instead of returning zxid's when you propose, we'll return a ZabTxnCookie object which can be used to identify which proposals came from yourself and which came from other nodes. There will be an almost unique local id for each proposal, and when it is committed, it will also get the zxid, which can be used by the application layer as a unique id. We propose the signature below. Any comments or suggestions?

{code:title=ZabTxnCookie.java|borderStyle=solid}
/**
 * An identifier for transactions that should be opaque to the user but useful
 * for comparing if two transactions are the same or not. By design it is used
 * by systems implementing the Zab and ZabCallback interfaces because a
 * ZabTxnCookie will be returned when you make a proposal (and a sync) and
 * then passed when commit is called so that the client can match their
 * proposals with commits.
 */
public class ZabTxnCookie {
    /**
     * A unique identifier for each server, this is assigned by the config
     * files when Zab is being set up.
     */
    private long serverId;

    /**
     * The zxid assigned by the leader to a committed proposal. This will only
     * exist on committed proposals once they are passed to deliver. This CAN
     * be used as a unique identifier for each proposal.
     */
    private long zxid;

    /**
     * A probably unique identifier for each proposal. Its most significant
     * 32-bits are the bottom 32-bits of the system time in milliseconds when
     * the node starts up (so it's reset each time the server goes down). The
     * bottom 32-bits are just a counter that's incremented on each proposal.
     * So this number will not be unique if the server goes down and starts up
     * exactly n*2^32 milliseconds after the first time (n>0).
     */
    private long localId;

    public boolean equals(ZabTxnCookie other);

    /**
     * Returns a unique identifier for this proposal, however the identifier
     * is only valid for proposals that have been committed. So this method
     * should only be called once a transaction is delivered to you, never
     * just after making a proposal. This identifier is guaranteed to be
     * sequentially increasing and unique even across server failures.
     *
     * @return A unique identifier for this proposal if it has been committed,
     *         otherwise this number is invalid.
     */
    public long getUniqueId();
}
{code}

> Hooks for atomic broadcast protocol
> ---
>
> Key: ZOOKEEPER-30
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-30
> Project: Zookeeper
> Issue Type: New Feature
> Components: quorum
> Reporter: Patrick Hunt
> Assignee: Mahadev konar
>
> Moved from SourceForge to Apache.
> http://sourceforge.net/tracker/index.php?func=detail&aid=1938788&group_id=209147&atid=1008547

-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
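The localId layout described in the ZabTxnCookie comment above (startup time in the high 32 bits, a per-proposal counter in the low 32 bits) could be built roughly as follows. This is only a sketch under assumptions: the class name LocalIdGenerator and its methods are hypothetical and are not part of the proposed signature, and the startup time is passed in explicitly so the behaviour is easy to check.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper, not part of the proposed ZabTxnCookie API. */
final class LocalIdGenerator {
    // Low 32 bits of the startup time, already shifted into the high word.
    private final long startupBits;
    // Per-proposal counter; starts at zero each time the server starts.
    private final AtomicInteger counter = new AtomicInteger();

    LocalIdGenerator(long startupMillis) {
        this.startupBits = (startupMillis & 0xFFFFFFFFL) << 32;
    }

    /** Returns the next local id: startup bits on top, counter below. */
    long next() {
        // Mask so the int counter is treated as unsigned and never
        // spills into the high 32 bits.
        long count = counter.getAndIncrement() & 0xFFFFFFFFL;
        return startupBits | count;
    }
}
```

Two generators whose startup times differ by exactly 2^32 milliseconds produce identical ids, which is the "probably unique" caveat stated in the javadoc for localId.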
[jira] Updated: (ZOOKEEPER-355) make validatePath non public in Zookeeper client api.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-355: --- Attachment: ZOOKEEPER-355.patch This patch moves the method into common. It also fixes ZOOKEEPER-348 - we now correctly handle path validation when sequence flag is used. c/java and tests for both also updated. > make validatePath non public in Zookeeper client api. > -- > > Key: ZOOKEEPER-355 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-355 > Project: Zookeeper > Issue Type: Bug >Affects Versions: 3.1.0, 3.1.1 >Reporter: Mahadev konar >Assignee: Patrick Hunt > Fix For: 3.2.0 > > Attachments: ZOOKEEPER-355.patch > > > make validatePath non public in Zookeeper client api. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-355) make validatePath non public in Zookeeper client api.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-355: --- Status: Patch Available (was: Open) > make validatePath non public in Zookeeper client api. > -- > > Key: ZOOKEEPER-355 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-355 > Project: Zookeeper > Issue Type: Bug >Affects Versions: 3.1.1, 3.1.0 >Reporter: Mahadev konar >Assignee: Patrick Hunt > Fix For: 3.2.0 > > Attachments: ZOOKEEPER-355.patch > > > make validatePath non public in Zookeeper client api. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-348) Creating node with path ending in "/" with sequence flag set
[ https://issues.apache.org/jira/browse/ZOOKEEPER-348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-348: --- Status: Patch Available (was: Open) the fix for this is included in ZOOKEEPER-355 > Creating node with path ending in "/" with sequence flag set > > > Key: ZOOKEEPER-348 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-348 > Project: Zookeeper > Issue Type: Bug > Components: c client >Affects Versions: 3.1.1, 3.1.0 >Reporter: Jeff Terrace >Assignee: Patrick Hunt >Priority: Minor > Fix For: 3.2.0 > > > In 3.0.1, I could create a sequence node like this: > /nodes/001 > like this: > string path = "/nodes/"; > string value = "data"; > int rc = zoo_acreate(zh, path.c_str(), value.c_str(), value.length(), > &ZOO_OPEN_ACL_UNSAFE, ZOO_EPHEMERAL | ZOO_SEQUENCE, &czoo_created, &where); > In 3.1.1, this fails with error -8 (ZBADARGUMENTS). > Adding something after the "/" in the path makes the code work fine: > string path = "/nodes/n"; > I assume something is checking if the path ends in "/" but not checking the > sequence flag. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
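The behaviour reported in ZOOKEEPER-348 above (a path ending in "/" is rejected even though the sequence flag will append a suffix) can be illustrated with a small check. This is a minimal sketch, not the actual ZOOKEEPER-355 patch: the class SequencePathCheck and its method are hypothetical, and it assumes the fix amounts to validating the path as it will look once a sequence suffix has been appended. Real path validation covers more cases than the few shown here.

```java
/** Hypothetical sketch; not the code from the ZOOKEEPER-355 patch. */
final class SequencePathCheck {
    /**
     * Returns true if the path is acceptable. When sequential is true,
     * the path is checked as if a sequence suffix were already appended,
     * so "/nodes/" is accepted for sequential nodes only.
     */
    static boolean isValid(String path, boolean sequential) {
        if (path == null || path.isEmpty() || path.charAt(0) != '/') {
            return false;
        }
        // Placeholder suffix standing in for the generated sequence number.
        String effective = sequential ? path + "1" : path;
        if (effective.length() == 1) {
            return true; // the root path "/"
        }
        return !effective.endsWith("/") && !effective.contains("//");
    }
}
```

With this check, "/nodes/" passes when the sequence flag is set and fails without it, while "/nodes/n" passes in both cases.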
[jira] Commented: (ZOOKEEPER-362) Issues with FLENewEpochTest
[ https://issues.apache.org/jira/browse/ZOOKEEPER-362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12695608#action_12695608 ] Benjamin Reed commented on ZOOKEEPER-362: - +1 ready to commit. > Issues with FLENewEpochTest > --- > > Key: ZOOKEEPER-362 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-362 > Project: Zookeeper > Issue Type: Bug >Affects Versions: 3.1.1 >Reporter: Flavio Paiva Junqueira >Assignee: Flavio Paiva Junqueira > Fix For: 3.2.0 > > Attachments: ZOOKEEPER-362.patch, ZOOKEEPER-362.patch > > > I have been able to identify two reasons that cause FLENewEpochTest to fail: > 1- There is a race condition that is triggered when two peers try to > establish a connection to each other for leader election. Basically, if they > start roughly at the same time, the server with highest id will try to open > two connections. The two competing connections will lead to one notification > message to be lost. This message happens to be critical for this two process > scenario; > 2- The code to shut down a peer is not working well with the unit tests. For > this particular unit test, we need to be able to shut down a peer completely > to check the situation the test tries to reproduce. However, it seems that in > some runs timing causes the other peers to believe it is still alive, and end > up electing it. This peer, however, eventually shuts down and leader election > fails. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (ZOOKEEPER-360) WeakHashMap in Bookie.java causes NPE
[ https://issues.apache.org/jira/browse/ZOOKEEPER-360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12695607#action_12695607 ] Mahadev konar commented on ZOOKEEPER-360: - I just committed this. thanks flavio.. > WeakHashMap in Bookie.java causes NPE > - > > Key: ZOOKEEPER-360 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-360 > Project: Zookeeper > Issue Type: Bug > Components: contrib-bookkeeper >Affects Versions: 3.1.1 >Reporter: Flavio Paiva Junqueira >Assignee: Flavio Paiva Junqueira > Fix For: 3.2.0 > > Attachments: ZOOKEEPER-BOOKKEEPER-360.patch > > > We need a strong reference to prevent a key in masterKeys on Bookie.java to > be garbage collected. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-360) WeakHashMap in Bookie.java causes NPE
[ https://issues.apache.org/jira/browse/ZOOKEEPER-360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mahadev konar updated ZOOKEEPER-360: Resolution: Fixed Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) > WeakHashMap in Bookie.java causes NPE > - > > Key: ZOOKEEPER-360 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-360 > Project: Zookeeper > Issue Type: Bug > Components: contrib-bookkeeper >Affects Versions: 3.1.1 >Reporter: Flavio Paiva Junqueira >Assignee: Flavio Paiva Junqueira > Fix For: 3.2.0 > > Attachments: ZOOKEEPER-BOOKKEEPER-360.patch > > > We need a strong reference to prevent a key in masterKeys on Bookie.java to > be garbage collected. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-362) Issues with FLENewEpochTest
[ https://issues.apache.org/jira/browse/ZOOKEEPER-362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Reed updated ZOOKEEPER-362: Hadoop Flags: [Reviewed] > Issues with FLENewEpochTest > --- > > Key: ZOOKEEPER-362 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-362 > Project: Zookeeper > Issue Type: Bug >Affects Versions: 3.1.1 >Reporter: Flavio Paiva Junqueira >Assignee: Flavio Paiva Junqueira > Fix For: 3.2.0 > > Attachments: ZOOKEEPER-362.patch, ZOOKEEPER-362.patch > > > I have been able to identify two reasons that cause FLENewEpochTest to fail: > 1- There is a race condition that is triggered when two peers try to > establish a connection to each other for leader election. Basically, if they > start roughly at the same time, the server with highest id will try to open > two connections. The two competing connections will lead to one notification > message to be lost. This message happens to be critical for this two process > scenario; > 2- The code to shut down a peer is not working well with the unit tests. For > this particular unit test, we need to be able to shut down a peer completely > to check the situation the test tries to reproduce. However, it seems that in > some runs timing causes the other peers to believe it is still alive, and end > up electing it. This peer, however, eventually shuts down and leader election > fails. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-363) NPE when recovering ledger with no hint
[ https://issues.apache.org/jira/browse/ZOOKEEPER-363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-363: --- Hadoop Flags: [Reviewed] +1, looks good > NPE when recovering ledger with no hint > > > Key: ZOOKEEPER-363 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-363 > Project: Zookeeper > Issue Type: Bug > Components: contrib-bookkeeper >Affects Versions: 3.1.1 >Reporter: Luca Telloli >Assignee: Flavio Paiva Junqueira > Fix For: 3.2.0 > > Attachments: ZOOKEEPER-363.patch, ZOOKEEPER-363.patch > > > When recovering a ledger, LedgerRecoveryMonitor currently starts from the > entry preceding the hint. If the hint is zero, this causes an access out > of the bounds of the bookie array in QuorumEngine, leading to the mentioned > NPE. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (ZOOKEEPER-365) javadoc is wrong for setLast in LedgerHandle
javadoc is wrong for setLast in LedgerHandle Key: ZOOKEEPER-365 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-365 Project: Zookeeper Issue Type: Bug Components: contrib-bookkeeper Affects Versions: 3.2.0 Reporter: Patrick Hunt Priority: Minor Fix For: 3.2.0

Note: the javadoc is wrong here:

/**
 * Returns the last entry identifier submitted and increments it.
 * @return long
 */
long setLast(long last){

Also, it would be great to have javadoc for the LedgerRecoveryMonitor getNextHint method. I was reviewing this code and it would have been helpful to know what to expect of this method (possible return values, etc.).

-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (ZOOKEEPER-30) Hooks for atomic broadcast protocol
[ https://issues.apache.org/jira/browse/ZOOKEEPER-30?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12695613#action_12695613 ] Andrew Carman commented on ZOOKEEPER-30: We now have public read access to our codebase: https://svn.cs.hmc.edu/svn/linkedin08/zab-multibranch/ Feel free to look around. It's still quite fluid as we implement the final few features. > Hooks for atomic broadcast protocol > --- > > Key: ZOOKEEPER-30 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-30 > Project: Zookeeper > Issue Type: New Feature > Components: quorum >Reporter: Patrick Hunt >Assignee: Mahadev konar > > Moved from SourceForge to Apache. > http://sourceforge.net/tracker/index.php?func=detail&aid=1938788&group_id=209147&atid=1008547 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (ZOOKEEPER-366) Session timeout detection can go wrong if the leader system time changes
Session timeout detection can go wrong if the leader system time changes Key: ZOOKEEPER-366 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-366 Project: Zookeeper Issue Type: Bug Reporter: Benjamin Reed The leader tracks session expirations by calculating when each session will time out and then periodically checking, based on the current time, which sessions need to be timed out. This works great as long as the leader's clock progresses at a steady pace. The problem comes when there are big changes in the clock (on the order of a session timeout), by NTP for example. If time gets adjusted forward, all the sessions could time out immediately. If time goes backward, sessions that should time out may take a lot longer to actually expire. This is really just a leader issue. The easiest way to deal with this is to have the leader relinquish leadership if it detects a big jump forward in time. When a new leader gets elected, it will recalculate the timeouts of active sessions. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
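The detection step suggested in ZOOKEEPER-366 above (notice a big forward jump in the leader's clock during the periodic expiry check) might look roughly like this. It is a sketch under assumptions, not ZooKeeper code: the class ClockJumpDetector, the tolerance parameter, and the idea of passing wall-clock readings in as arguments are all hypothetical choices made for illustration.

```java
/** Hypothetical sketch; names and thresholds are assumptions. */
final class ClockJumpDetector {
    private final long tickMillis;      // expected interval between checks
    private final long toleranceMillis; // slack allowed before a tick counts as a jump
    private long lastWallClock;         // wall-clock reading at the previous check

    ClockJumpDetector(long tickMillis, long toleranceMillis, long startWallClock) {
        this.tickMillis = tickMillis;
        this.toleranceMillis = toleranceMillis;
        this.lastWallClock = startWallClock;
    }

    /**
     * Call once per expiry tick with the current wall-clock time.
     * Returns true if far more time appears to have passed than one tick.
     */
    boolean jumpedForward(long nowWallClock) {
        long elapsed = nowWallClock - lastWallClock;
        lastWallClock = nowWallClock;
        return elapsed > tickMillis + toleranceMillis;
    }
}
```

A leader could call jumpedForward on each expiry pass and step down when it returns true. Note that a long pause of the process itself would look the same as a clock adjustment to this check, so the tolerance would need to be chosen with that in mind.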
[jira] Updated: (ZOOKEEPER-362) Issues with FLENewEpochTest
[ https://issues.apache.org/jira/browse/ZOOKEEPER-362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mahadev konar updated ZOOKEEPER-362: Resolution: Fixed Status: Resolved (was: Patch Available) +1 to the patch. i just committed this. thanks flavio. > Issues with FLENewEpochTest > --- > > Key: ZOOKEEPER-362 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-362 > Project: Zookeeper > Issue Type: Bug >Affects Versions: 3.1.1 >Reporter: Flavio Paiva Junqueira >Assignee: Flavio Paiva Junqueira > Fix For: 3.2.0 > > Attachments: ZOOKEEPER-362.patch, ZOOKEEPER-362.patch > > > I have been able to identify two reasons that cause FLENewEpochTest to fail: > 1- There is a race condition that is triggered when two peers try to > establish a connection to each other for leader election. Basically, if they > start roughly at the same time, the server with highest id will try to open > two connections. The two competing connections will lead to one notification > message to be lost. This message happens to be critical for this two process > scenario; > 2- The code to shut down a peer is not working well with the unit tests. For > this particular unit test, we need to be able to shut down a peer completely > to check the situation the test tries to reproduce. However, it seems that in > some runs timing causes the other peers to believe it is still alive, and end > up electing it. This peer, however, eventually shuts down and leader election > fails. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (ZOOKEEPER-355) make validatePath non public in Zookeeper client api.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12695618#action_12695618 ] Hadoop QA commented on ZOOKEEPER-355: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12404602/ZOOKEEPER-355.patch against trunk revision 761811. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 6 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-vesta.apache.org/15/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-vesta.apache.org/15/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-vesta.apache.org/15/console This message is automatically generated. > make validatePath non public in Zookeeper client api. > -- > > Key: ZOOKEEPER-355 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-355 > Project: Zookeeper > Issue Type: Bug >Affects Versions: 3.1.0, 3.1.1 >Reporter: Mahadev konar >Assignee: Patrick Hunt > Fix For: 3.2.0 > > Attachments: ZOOKEEPER-355.patch > > > make validatePath non public in Zookeeper client api. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (ZOOKEEPER-348) Creating node with path ending in "/" with sequence flag set
[ https://issues.apache.org/jira/browse/ZOOKEEPER-348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12695619#action_12695619 ] Hadoop QA commented on ZOOKEEPER-348: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org against trunk revision 761816. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no tests are needed for this patch. -1 patch. The patch command could not apply the patch. Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-vesta.apache.org/16/console This message is automatically generated. > Creating node with path ending in "/" with sequence flag set > > > Key: ZOOKEEPER-348 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-348 > Project: Zookeeper > Issue Type: Bug > Components: c client >Affects Versions: 3.1.0, 3.1.1 >Reporter: Jeff Terrace >Assignee: Patrick Hunt >Priority: Minor > Fix For: 3.2.0 > > > In 3.0.1, I could create a sequence node like this: > /nodes/001 > like this: > string path = "/nodes/"; > string value = "data"; > int rc = zoo_acreate(zh, path.c_str(), value.c_str(), value.length(), > &ZOO_OPEN_ACL_UNSAFE, ZOO_EPHEMERAL | ZOO_SEQUENCE, &czoo_created, &where); > In 3.1.1, this fails with error -8 (ZBADARGUMENTS). > Adding something after the "/" in the path makes the code work fine: > string path = "/nodes/n"; > I assume something is checking if the path ends in "/" but not checking the > sequence flag. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (ZOOKEEPER-30) Hooks for atomic broadcast protocol
[ https://issues.apache.org/jira/browse/ZOOKEEPER-30?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12695620#action_12695620 ]

Patrick Hunt commented on ZOOKEEPER-30:

There are really 2 reasons why we need you to submit this as a patch if you want the changes included in future releases of Apache ZooKeeper:

1) We need the code to be submitted through JIRA for legal reasons. In particular, when you submit the changes you need to check off the box that says: Grant license to ASF for inclusion in ASF works (as per the Apache License §5) Contributions intended for inclusion in ASF products (eg. patches, code) must be licensed to ASF under the terms of the Apache License. Other attachments (eg. log dumps, test cases) need not be. You can submit multiple patches, as well as a script/description of how to apply. Here's an example: https://issues.apache.org/jira/browse/ZOOKEEPER-234?focusedCommentId=12663566&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12663566

2) We don't know your changes as well as you do. How are we going to apply them if you can't?

We are very interested to review/include your changes. We'd be happy to help with any advice/support.

> Hooks for atomic broadcast protocol > --- > > Key: ZOOKEEPER-30 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-30 > Project: Zookeeper > Issue Type: New Feature > Components: quorum >Reporter: Patrick Hunt >Assignee: Mahadev konar > > Moved from SourceForge to Apache. > http://sourceforge.net/tracker/index.php?func=detail&aid=1938788&group_id=209147&atid=1008547 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (ZOOKEEPER-355) make validatePath non public in Zookeeper client api.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12695621#action_12695621 ] Patrick Hunt commented on ZOOKEEPER-355: the failed test is fletestepoch, which is unrelated to my changes and known to be failing. Please review/commit this. > make validatePath non public in Zookeeper client api. > -- > > Key: ZOOKEEPER-355 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-355 > Project: Zookeeper > Issue Type: Bug >Affects Versions: 3.1.0, 3.1.1 >Reporter: Mahadev konar >Assignee: Patrick Hunt > Fix For: 3.2.0 > > Attachments: ZOOKEEPER-355.patch > > > make validatePath non public in Zookeeper client api. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-355) make validatePath non public in Zookeeper client api.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-355: --- Status: Open (was: Patch Available) > make validatePath non public in Zookeeper client api. > -- > > Key: ZOOKEEPER-355 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-355 > Project: Zookeeper > Issue Type: Bug >Affects Versions: 3.1.1, 3.1.0 >Reporter: Mahadev konar >Assignee: Patrick Hunt > Fix For: 3.2.0 > > Attachments: ZOOKEEPER-355.patch > > > make validatePath non public in Zookeeper client api. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-355) make validatePath non public in Zookeeper client api.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-355: --- Status: Patch Available (was: Open) resubmitting - mahadev says fleepochtest should be fixed now. > make validatePath non public in Zookeeper client api. > -- > > Key: ZOOKEEPER-355 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-355 > Project: Zookeeper > Issue Type: Bug >Affects Versions: 3.1.1, 3.1.0 >Reporter: Mahadev konar >Assignee: Patrick Hunt > Fix For: 3.2.0 > > Attachments: ZOOKEEPER-355.patch > > > make validatePath non public in Zookeeper client api. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (ZOOKEEPER-355) make validatePath non public in Zookeeper client api.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12695628#action_12695628 ] Hadoop QA commented on ZOOKEEPER-355: - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12404602/ZOOKEEPER-355.patch against trunk revision 761816. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 6 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-vesta.apache.org/17/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-vesta.apache.org/17/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-vesta.apache.org/17/console This message is automatically generated. > make validatePath non public in Zookeeper client api. > -- > > Key: ZOOKEEPER-355 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-355 > Project: Zookeeper > Issue Type: Bug >Affects Versions: 3.1.0, 3.1.1 >Reporter: Mahadev konar >Assignee: Patrick Hunt > Fix For: 3.2.0 > > Attachments: ZOOKEEPER-355.patch > > > make validatePath non public in Zookeeper client api. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (ZOOKEEPER-30) Hooks for atomic broadcast protocol
[ https://issues.apache.org/jira/browse/ZOOKEEPER-30?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12695631#action_12695631 ] Mahadev konar commented on ZOOKEEPER-30: i andrew, as pat pointed out, we would not be able to merge an external branch without a code grant as we have in patch submissions. would it be possible for you guys to break up the patch like - 1) patch for changes in persistence 2) patch for changes in quorum something like that? if not creating a single patch is fine... We would like to include your changes in Zookeeper but it would be difficult for us to find bandwidth to review an external repository. Also it would be great if you can include the list of changes (concretely) you have made for Zab on this jira. Also, we should be able to meet with you later in may.. we can discuss that outside of this jira... > Hooks for atomic broadcast protocol > --- > > Key: ZOOKEEPER-30 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-30 > Project: Zookeeper > Issue Type: New Feature > Components: quorum >Reporter: Patrick Hunt >Assignee: Mahadev konar > > Moved from SourceForge to Apache. > http://sourceforge.net/tracker/index.php?func=detail&aid=1938788&group_id=209147&atid=1008547 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Issue Comment Edited: (ZOOKEEPER-30) Hooks for atomic broadcast protocol
[ https://issues.apache.org/jira/browse/ZOOKEEPER-30?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12695631#action_12695631 ] Mahadev konar edited comment on ZOOKEEPER-30 at 4/3/09 3:53 PM: hi andrew, as pat pointed out, we would not be able to merge an external branch without a code grant as we have in patch submissions. would it be possible for you guys to break up the patch like - 1) patch for changes in persistence 2) patch for changes in quorum something like that? if not creating a single patch is fine... We would like to include your changes in Zookeeper but it would be difficult for us to find bandwidth to review an external repository. Also it would be great if you can include the list of changes (concretely) you have made for Zab on this jira. Also, we should be able to meet with you later in may.. we can discuss that outside of this jira... was (Author: mahadev): i andrew, as pat pointed out, we would not be able to merge an external branch without a code grant as we have in patch submissions. would it be possible for you guys to break up the patch like - 1) patch for changes in persistence 2) patch for changes in quorum something like that? if not creating a single patch is fine... We would like to include your changes in Zookeeper but it would be difficult for us to find bandwidth to review an external repository. Also it would be great if you can include the list of changes (concretely) you have made for Zab on this jira. Also, we should be able to meet with you later in may.. we can discuss that outside of this jira... > Hooks for atomic broadcast protocol > --- > > Key: ZOOKEEPER-30 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-30 > Project: Zookeeper > Issue Type: New Feature > Components: quorum >Reporter: Patrick Hunt >Assignee: Mahadev konar > > Moved from SourceForge to Apache. 
> http://sourceforge.net/tracker/index.php?func=detail&aid=1938788&group_id=209147&atid=1008547 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-30) Hooks for atomic broadcast protocol
[ https://issues.apache.org/jira/browse/ZOOKEEPER-30?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Farmer updated ZOOKEEPER-30: --- Attachment: zab.diff > Hooks for atomic broadcast protocol > --- > > Key: ZOOKEEPER-30 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-30 > Project: Zookeeper > Issue Type: New Feature > Components: quorum >Reporter: Patrick Hunt >Assignee: Mahadev konar > Attachments: zab.diff > > > Moved from SourceForge to Apache. > http://sourceforge.net/tracker/index.php?func=detail&aid=1938788&group_id=209147&atid=1008547 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (ZOOKEEPER-30) Hooks for atomic broadcast protocol
[ https://issues.apache.org/jira/browse/ZOOKEEPER-30?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12695645#action_12695645 ] Andrew Farmer commented on ZOOKEEPER-30: Other Andrew here. (I'm also on the HMC clinic team.) Basically, the issue is that a lot of the changes we made have been kind of "patch-unfriendly" - we've moved and renamed a lot of files, and that can't really be reflected well by a patch file. (We tried generating a straight patch between our repository and yours, and it ended up being something like 5 MB.) With that all in mind, though, I'm attaching a REALLY ROUGH patch that simply adds our current version of Zab, as well as its respective tests, to the current SVN trunk revision of Zookeeper. Hopefully this should resolve the legal issues. What it doesn't do is: 1) It doesn't make Zookeeper use Zab for anything. As a result, there's a lot of duplicated code now - Zookeeper *will* need to be modified significantly to run against the Zab API. 2) It also doesn't port in some of the changes you folks have made to code that's within Zab's ambit. (What's included is basically everything that doesn't involve either clients or the data tree: leader election, proposal handling, and logging/persistence.) 3) Finally, it's not quite complete. We're still working on implementing syncs, as well as doing some further tests. Hopefully this is enough to start taking a look at, though... > Hooks for atomic broadcast protocol > --- > > Key: ZOOKEEPER-30 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-30 > Project: Zookeeper > Issue Type: New Feature > Components: quorum >Reporter: Patrick Hunt >Assignee: Mahadev konar > Attachments: zab.diff > > > Moved from SourceForge to Apache. > http://sourceforge.net/tracker/index.php?func=detail&aid=1938788&group_id=209147&atid=1008547 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Issue Comment Edited: (ZOOKEEPER-30) Hooks for atomic broadcast protocol
[ https://issues.apache.org/jira/browse/ZOOKEEPER-30?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12695645#action_12695645 ] Andrew Farmer edited comment on ZOOKEEPER-30 at 4/3/09 5:01 PM: Other Andrew here. (I'm also on the HMC clinic team.) Basically, the issue is that a lot of the changes we made have been kind of "patch-unfriendly" - we've moved and renamed a lot of files, and that can't really be reflected well by a patch file. (We tried generating a straight patch between our repository and yours, and it ended up being something like 5 MB.) With that all in mind, though, I'm attaching a REALLY ROUGH patch that simply adds our current version of Zab, as well as its respective tests, to the current SVN trunk revision of Zookeeper. Hopefully this should resolve the legal issues. What it doesn't do is: 1) It doesn't make Zookeeper use Zab for anything. As a result, there's a lot of duplicated code now - Zookeeper *will* need to be modified significantly to run against the Zab API. All it does is add a bunch of code to the source tree. 2) It also doesn't port in some of the changes you folks have made to code that's within Zab's ambit. (What's included is basically everything that doesn't involve either clients or the data tree: leader election, proposal handling, and logging/persistence.) 3) Finally, it's not quite complete. We're still working on implementing syncs, as well as doing some further tests. Hopefully this is enough to start taking a look at, though... we'll keep you updated. was (Author: andfarm): Other Andrew here. (I'm also on the HMC clinic team.) Basically, the issue is that a lot of the changes we made have been kind of "patch-unfriendly" - we've moved and renamed a lot of files, and that can't really be reflected well by a patch file. (We tried generating a straight patch between our repository and yours, and it ended up being something like 5 MB.) 
With that all in mind, though, I'm attaching a REALLY ROUGH patch that simply adds our current version of Zab, as well as its respective tests, to the current SVN trunk revision of Zookeeper. Hopefully this should resolve the legal issues. What it doesn't do is: 1) It doesn't make Zookeeper use Zab for anything. As a result, there's a lot of duplicated code now - Zookeeper *will* need to be modified significantly to run against the Zab API. 2) It also doesn't port in some of the changes you folks have made to code that's within Zab's ambit. (What's included is basically everything that doesn't involve either clients or the data tree: leader election, proposal handling, and logging/persistence.) 3) Finally, it's not quite complete. We're still working on implementing syncs, as well as doing some further tests. Hopefully this is enough to start taking a look at, though... > Hooks for atomic broadcast protocol > --- > > Key: ZOOKEEPER-30 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-30 > Project: Zookeeper > Issue Type: New Feature > Components: quorum >Reporter: Patrick Hunt >Assignee: Mahadev konar > Attachments: zab.diff > > > Moved from SourceForge to Apache. > http://sourceforge.net/tracker/index.php?func=detail&aid=1938788&group_id=209147&atid=1008547 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-367) RecoveryTest failure - "unreasonable length" IOException
[ https://issues.apache.org/jira/browse/ZOOKEEPER-367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-367: --- Attachment: rec.tar.gz TEST-org.apache.zookeeper.test.RecoveryTest.txt > RecoveryTest failure - "unreasonable length" IOException > > > Key: ZOOKEEPER-367 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-367 > Project: Zookeeper > Issue Type: Bug > Components: server >Affects Versions: 3.2.0 > Environment: ubuntu 8.10 intrepid ibex, jvm 1.6.0_10 >Reporter: Patrick Hunt >Priority: Critical > Fix For: 3.2.0 > > Attachments: rec.tar.gz, > TEST-org.apache.zookeeper.test.RecoveryTest.txt > > > during local testing I received the attached recoverytest failure -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (ZOOKEEPER-367) RecoveryTest failure - "unreasonable length" IOException
RecoveryTest failure - "unreasonable length" IOException Key: ZOOKEEPER-367 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-367 Project: Zookeeper Issue Type: Bug Components: server Affects Versions: 3.2.0 Environment: ubuntu 8.10 intrepid ibex, jvm 1.6.0_10 Reporter: Patrick Hunt Priority: Critical Fix For: 3.2.0 Attachments: rec.tar.gz, TEST-org.apache.zookeeper.test.RecoveryTest.txt during local testing I received the attached recoverytest failure -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (ZOOKEEPER-364) command line interface for zookeeper.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12695668#action_12695668 ] Chris Darroch commented on ZOOKEEPER-364: - Just FYI on the Perl side of things, CPAN is just a distribution channel -- the Perl module could easily be donated to the ASF and shipped as part of ZooKeeper, if there was interest. Since Perl users tend to look at CPAN first, though, I wanted to put it there as well. Things like mod_perl are generally available from both apache.org and cpan.org but maintained by the ASF. Since most Linux/Unix folks will have Perl installed by default -- and Net::ZooKeeper really is only going to be simple to build on Linux/Unix anyway, because of the pthread requirement in the C API -- it's not hard to build a command-line interface; in fact, pretty simple. Ephemerals, sequences, watches, and ACLs all are supported by the module. That said, personally, I think lots of people will want a Python binding as well -- I just happened to need a Perl one first. I caution that building it took more of my time than I expected; getting things like watches to work took a number of tries. Because the multi-threaded C API runs two private threads in the background to handle IO (especially pings), you can't allow them to just arbitrarily make callbacks upon a watch event notification into Perl code -- that could fry the interpreter's state over in the "main" thread. And you want to be sure that if the user decides to abandon some higher-level object that represents the watch (e.g., lets it go out of scope before the event notification occurs) then when the watch event does come in you haven't thrown away the private structure the callback is expecting to update. There may be a niftier way to handle this with Python since Python supports threading better than Perl 5.x, by using the single-threaded stub adapter and doing all the calls to zookeeper_interest(), zookeeper_process(), etc. 
from the Python module, i.e., implementing the event loops in Python threads spawned by the module. Still, it's not a cakewalk, I suspect. (I was kidding about the Parrot implementation, but I'm also intrigued by the possibility that 5 years from now these sorts of conversations might be moot, if one could write a module in Parrot code that implemented the event loop and was then usable with Parrot-compiled versions of Python, Ruby, Perl, etc. When time permits (ha!) I've been meaning to pull down Parrot 1.0 and peek at its concurrency and extension mechanisms.) > command line interface for zookeeper. > - > > Key: ZOOKEEPER-364 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-364 > Project: Zookeeper > Issue Type: New Feature >Affects Versions: 3.0.0, 3.0.1, 3.1.0, 3.1.1 >Reporter: Mahadev konar > Fix For: 3.2.0 > > > currently we have a shell based interface for zookeeper (which again isnt > well published). we should have a well published cli based interface for > zookeeper. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
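The two hazards described in the comment above (callbacks firing on the client's private IO threads, and events arriving for a watch object the user has already dropped) can be sketched as follows. This is only an illustration of the general technique, not how Net::ZooKeeper or any existing binding is implemented; `WatchDispatcher` and all of its method names are hypothetical. The background thread only enqueues events, and user callbacks run when the main thread drains the queue.

```python
import queue
import threading


class WatchDispatcher:
    """Hands watch events from a background IO thread to the main thread."""

    def __init__(self):
        self._events = queue.Queue()   # thread-safe hand-off point
        self._watches = {}             # watch id -> user callback
        self._lock = threading.Lock()
        self._next_id = 0

    def register(self, callback):
        """Main thread: remember a callback and return its watch id."""
        with self._lock:
            watch_id = self._next_id
            self._next_id += 1
            self._watches[watch_id] = callback
        return watch_id

    def abandon(self, watch_id):
        """Main thread: the user dropped the watch; late events are ignored."""
        with self._lock:
            self._watches.pop(watch_id, None)

    def post(self, watch_id, event):
        """IO thread: never call user code here, only enqueue the event."""
        self._events.put((watch_id, event))

    def drain(self):
        """Main thread: run callbacks for pending events; return the count run."""
        delivered = 0
        while True:
            try:
                watch_id, event = self._events.get_nowait()
            except queue.Empty:
                return delivered
            with self._lock:
                # Watches are one-shot, so remove the entry on delivery.
                callback = self._watches.pop(watch_id, None)
            if callback is not None:
                callback(event)
                delivered += 1
```

Because the registry is keyed by an id rather than by a pointer to the user's object, an event for an abandoned watch simply finds no entry and is discarded.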
[jira] Commented: (ZOOKEEPER-36) REST access to ZooKeeper
[ https://issues.apache.org/jira/browse/ZOOKEEPER-36?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12695669#action_12695669 ] Chris Darroch commented on ZOOKEEPER-36: Another option might be to expand on the mod_shmap and mod_socache_zookeeper httpd modules I wrote a while back. The latter maintains a ZooKeeper client connection for each httpd child process -- these are shared across all HTTP requests handled by the process, so (as with the code attached to this issue, I think) ephemeral nodes aren't supported, nor are ACLs, watches, etc. The code is available under the Apache license at http://people.apache.org/~chrisd/projects/shared_map/. The shared-map module can harness a variety of "small object cache" providers to various parts of the URL namespace and then perform GET/PUT/DELETE against them. For the mod_socache_zookeeper provider these map to zoo_get(), zoo_set()/zoo_create(), and zoo_delete(). Nodes are created automatically when a PUT is made for a non-extant node. I need to refactor mod_socache_zookeeper and create a mod_zookeeper which deals with the business of starting/stopping ZooKeeper connections for each httpd child process, something like mod_dbd does for SQL DB connections. That will allow other modules to then acquire the ZK connection and make zoo_*() requests directly; mod_socache_zookeeper and mod_slotmem_zookeeper (yet to be written) then just devolve into the business of mapping URLs to specific ZK calls. For a REST-style interface that supported things like ACLs, sequences, stat data, etc. one could write a separate module (mod_zookeeper_rest or whatever) which supports a more complex mapping than is available through just the socache or slotmem APIs. However the REST interface is implemented, it would be nice, I think, to use HEAD -> zoo_exists(), GET -> zoo_get(), PUT -> zoo_set()/zoo_create(), and DELETE -> zoo_delete(). 
There's such a natural mapping of HTTP methods to ZK methods that it would seem to call out for use, as opposed to using bags of CGI arguments to POST requests or what have you. > REST access to ZooKeeper > > > Key: ZOOKEEPER-36 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-36 > Project: Zookeeper > Issue Type: New Feature > Components: contrib >Reporter: Patrick Hunt >Assignee: Patrick Hunt > Fix For: 3.2.0 > > Attachments: rest_2.tar.gz > > > Moved from SourceForge to Apache. > http://sourceforge.net/tracker/index.php?func=detail&aid=1961763&group_id=209147&atid=1008547 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
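The method mapping proposed in the comment above can be written down as a small lookup. This is a sketch only: the function name `zk_call_for` and the `node_exists` parameter are hypothetical, and the returned strings are just the names of the C API calls mentioned in the comment, not actual invocations.

```python
# HTTP methods that map to exactly one ZooKeeper call.
ZK_CALLS = {
    "HEAD": "zoo_exists",
    "GET": "zoo_get",
    "DELETE": "zoo_delete",
}


def zk_call_for(method: str, node_exists: bool) -> str:
    """Return the name of the ZooKeeper call an HTTP method would map to."""
    method = method.upper()
    if method == "PUT":
        # PUT updates an existing node, or creates it when it is missing.
        return "zoo_set" if node_exists else "zoo_create"
    try:
        return ZK_CALLS[method]
    except KeyError:
        raise ValueError(f"unsupported HTTP method: {method}") from None
```

PUT is the only method that needs to know whether the node already exists, which mirrors the note that nodes are created automatically on a PUT to a non-extant node.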
[jira] Commented: (ZOOKEEPER-36) REST access to ZooKeeper
[ https://issues.apache.org/jira/browse/ZOOKEEPER-36?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12695670#action_12695670 ] Chris Darroch commented on ZOOKEEPER-36: Sorry, just to throw in a couple of additional thoughts; using httpd modules means all you need is a conventional Apache httpd instance. Requests look like (using the current minimalistic shmap/socache modules):
{noformat}
GET /node1/node2 HTTP/1.0

DELETE /node1/node2 HTTP/1.0

PUT /node1/node3 HTTP/1.0
Content-Length: 5

12345
{noformat}
> REST access to ZooKeeper > > > Key: ZOOKEEPER-36 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-36 > Project: Zookeeper > Issue Type: New Feature > Components: contrib >Reporter: Patrick Hunt >Assignee: Patrick Hunt > Fix For: 3.2.0 > > Attachments: rest_2.tar.gz > > > Moved from SourceForge to Apache. > http://sourceforge.net/tracker/index.php?func=detail&aid=1961763&group_id=209147&atid=1008547 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.