[jira] Issue Comment Edited: (ZOOKEEPER-234) Eliminate using statics to initialize the server. Should allow server to be more embeddable in OSGi environments.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12664125#action_12664125 ]

fpj edited comment on ZOOKEEPER-234 at 1/15/09 6:59 AM:
--------------------------------------------------------

+1. I have followed the steps Patrick describes, omitting of course the commit step. I have actually been able to connect using jconsole remotely, but in this case I had to use more parameters. Here is the command I used (a programmatic equivalent of the jconsole connection is sketched after this message):

java -cp .:./zookeeper-dev.jar:/usr/local/apache-log4j-1.2.15/log4j-1.2.15.jar \
    -Dlog4j.configuration=log4j_console.properties \
    -Dcom.sun.management.jmxremote \
    -Dcom.sun.management.jmxremote.port=12122 \
    -Dcom.sun.management.jmxremote.local.only=false \
    -Dcom.sun.management.jmxremote.authenticate=false \
    -Dcom.sun.management.jmxremote.ssl=false \
    org.apache.zookeeper.server.quorum.QuorumPeerMain zoo.cfg

I'll open a jira to document these options upon Pat's request.

was (Author: fpj):
The same comment, except that the earlier version of the command omitted -Dcom.sun.management.jmxremote.ssl=false.

Eliminate using statics to initialize the server. Should allow server to be more embeddable in OSGi environments.
------------------------------------------------------------------------------------------------------------------

                 Key: ZOOKEEPER-234
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-234
             Project: Zookeeper
          Issue Type: Improvement
          Components: server
            Reporter: Hiram Chirino
            Assignee: Patrick Hunt
             Fix For: 3.1.0
         Attachments: ZOOKEEPER-234_step1.patch, ZOOKEEPER-234_step3.patch

Patrick requested I open this issue in this [email thread|http://n2.nabble.com/ActiveMQ-is-now-using-ZooKeeper-td1573272.html]. The main culprit I've noticed is:

{code}
ServerStats.registerAsConcrete();
{code}

But there may be others.
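For reference, the endpoint those flags expose can also be reached programmatically through the standard JMX remote API. The following is a minimal sketch, assuming the server above runs on localhost with port 12122 as in the command; the class name and the printed attribute are illustrative, not part of ZooKeeper.

{code}
import javax.management.MBeanServerConnection;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

// Hypothetical probe class, not part of ZooKeeper; it demonstrates the
// standard JMX remote API against the endpoint the flags above expose.
public class JmxProbe {
    public static void main(String[] args) throws Exception {
        // jconsole uses the same service URL form: the host plus the value
        // of com.sun.management.jmxremote.port (12122 in the command above).
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://localhost:12122/jmxrmi");
        JMXConnector connector = JMXConnectorFactory.connect(url);
        try {
            MBeanServerConnection mbsc = connector.getMBeanServerConnection();
            // Count the registered MBeans, which include the ZooKeeper
            // beans inspected with jconsole in the comment above.
            System.out.println("MBeans registered: " + mbsc.getMBeanCount());
        } finally {
            connector.close();
        }
    }
}
{code}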
[jira] Created: (ZOOKEEPER-275) Bug in FastLeaderElection
Bug in FastLeaderElection
-------------------------

                 Key: ZOOKEEPER-275
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-275
             Project: Zookeeper
          Issue Type: Bug
          Components: leaderElection
            Reporter: Flavio Paiva Junqueira
            Assignee: Flavio Paiva Junqueira

I found an execution in which leader election does not make progress. Here is the problematic scenario:
- We have an ensemble of 3 servers, and we start only 2;
- We let them elect a leader, and then crash the one with the lowest id, say S_1 (call the other S_2);
- We restart the crashed server.

Upon restarting S_1, S_2 has its logical clock more advanced, and S_1 has its logical clock set to 1. Once S_1 receives a notification from S_2, it notices that it is in the wrong round and advances its logical clock to the same value as S_2. The problem comes exactly at this point, because in the current code S_1 resets its vote to its initial vote (its own id and zxid). Since S_2 has already notified S_1, it won't do it again, and we are stuck.

The patch I'm submitting fixes this problem by setting the vote of S_1 to the one received if it satisfies the total order predicate: received zxid is higher, or received zxid is the same and received id is higher (see the sketch after this message).

Related to this problem, I noticed that by trying to avoid unnecessary notification duplicates, there could be scenarios in which a server fails before electing a leader and restarts before leader election succeeds. This could happen, for example, when there aren't enough servers available and one of the available ones crashes and restarts. I fixed this problem in the attached patch by allowing a server to send a new batch of notifications if at least one outgoing queue of pending notifications is empty. This is ok because we space out consecutive batches of notifications.
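A minimal sketch of the total order predicate described above; the class, method, and parameter names are illustrative, not the actual FastLeaderElection code:

{code}
// Illustrative sketch of the total order predicate from the report above;
// the names are invented for this example.
public final class TotalOrder {
    // True when the received (id, zxid) pair should replace the current
    // vote: the received zxid is strictly higher, or the zxids are equal
    // and the received server id is higher.
    public static boolean supersedes(long receivedId, long receivedZxid,
                                     long currentId, long currentZxid) {
        return (receivedZxid > currentZxid)
                || (receivedZxid == currentZxid && receivedId > currentId);
    }
}
{code}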
[jira] Updated: (ZOOKEEPER-275) Bug in FastLeaderElection
[ https://issues.apache.org/jira/browse/ZOOKEEPER-275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Flavio Paiva Junqueira updated ZOOKEEPER-275:
---------------------------------------------
    Attachment: ZOOKEEPER-275.patch

Patch for the problems described. It includes a unit test for the first case. The second case is difficult, and I still don't have a good idea of how to write a unit test for it. In particular, I haven't been able to crash and restart a peer in a unit test, because when I kill the listener of QuorumCnxManager and try to create another instance, it complains that the port is in use. I tried using setReuseAddress(true) before binding, but it still doesn't work (see the bind-ordering sketch after this message).

Bug in FastLeaderElection
-------------------------

                 Key: ZOOKEEPER-275
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-275
             Project: Zookeeper
          Issue Type: Bug
          Components: leaderElection
            Reporter: Flavio Paiva Junqueira
            Assignee: Flavio Paiva Junqueira
         Attachments: ZOOKEEPER-275.patch
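For context, a minimal sketch of the bind ordering that setReuseAddress requires (the port number here is illustrative, not taken from the issue):

{code}
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Minimal illustration of the setReuseAddress(true) ordering mentioned
// above: the option only takes effect on an unbound socket, so the
// no-argument constructor is used and bind() is called explicitly.
public class ReuseAddressExample {
    public static void main(String[] args) throws Exception {
        ServerSocket ss = new ServerSocket();   // created unbound
        ss.setReuseAddress(true);               // must precede bind()
        ss.bind(new InetSocketAddress(3888));   // illustrative election port
        System.out.println("Bound to " + ss.getLocalSocketAddress());
        ss.close();
    }
}
{code}

Note that SO_REUSEADDR only permits rebinding over a socket lingering in TIME_WAIT; if the old listener still holds its socket open when the new instance binds, the "port in use" error occurs regardless, which may be what the test is hitting.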
[jira] Updated: (ZOOKEEPER-273) Zookeeper c client build should not depend on CPPUNIT
[ https://issues.apache.org/jira/browse/ZOOKEEPER-273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Runping Qi updated ZOOKEEPER-273:
---------------------------------
    Attachment: patch_zookeeper_273.txt

Zookeeper c client build should not depend on CPPUNIT
-----------------------------------------------------

                 Key: ZOOKEEPER-273
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-273
             Project: Zookeeper
          Issue Type: Bug
          Components: c client
            Reporter: Runping Qi
         Attachments: patch_zookeeper_273.txt

One should be able to build the Zookeeper C client libs on a machine without a CPPUNIT installation. A simple fix is to remove the following line from configure.ac:

AM_PATH_CPPUNIT(1.10.2)
[jira] Updated: (ZOOKEEPER-234) Eliminate using statics to initialize the server. Should allow server to be more embeddable in OSGi environments.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Hunt updated ZOOKEEPER-234:
-----------------------------------
    Resolution: Fixed
  Hadoop Flags: [Reviewed]
        Status: Resolved  (was: Patch Available)

Committed revision 734847.

Eliminate using statics to initialize the server. Should allow server to be more embeddable in OSGi environments.
------------------------------------------------------------------------------------------------------------------

                 Key: ZOOKEEPER-234
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-234
             Project: Zookeeper
          Issue Type: Improvement
          Components: server
            Reporter: Hiram Chirino
            Assignee: Patrick Hunt
             Fix For: 3.1.0
         Attachments: ZOOKEEPER-234_step1.patch, ZOOKEEPER-234_step3.patch
[jira] Updated: (ZOOKEEPER-259) cleanup the logging levels used (use the correct level) and messages generated
[ https://issues.apache.org/jira/browse/ZOOKEEPER-259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benjamin Reed updated ZOOKEEPER-259:
------------------------------------
    Attachment: ZOOKEEPER-259.patch

Fixed to work with the latest code.

cleanup the logging levels used (use the correct level) and messages generated
-------------------------------------------------------------------------------

                 Key: ZOOKEEPER-259
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-259
             Project: Zookeeper
          Issue Type: Improvement
          Components: c client, java client, server
            Reporter: Patrick Hunt
            Assignee: Patrick Hunt
            Priority: Minor
             Fix For: 3.1.0
         Attachments: ZOOKEEPER-259.patch, ZOOKEEPER-259.patch, ZOOKEEPER-259.patch

Cleanup logging:
- make sure logging uses the correct level, especially error and warn
- make sure the messages are meaningful (especially fix the "fixmsg" logs)
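As a rough illustration of the level discipline this cleanup targets, here is a minimal log4j sketch; the class and all messages are invented for the example, not taken from the patch:

{code}
import org.apache.log4j.Logger;

// Invented example of the logging-level discipline the issue describes;
// none of these messages come from the actual ZOOKEEPER-259 patch.
public class LoggingLevels {
    private static final Logger LOG = Logger.getLogger(LoggingLevels.class);

    void examples(Exception e) {
        LOG.error("Unable to load database on disk", e);   // unrecoverable failure
        LOG.warn("Connection broken, re-establishing", e); // degraded but handled
        LOG.info("Snapshot written in 120 ms");            // normal lifecycle event
        LOG.debug("Received ping from follower 2");        // diagnostics only
    }
}
{code}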
[jira] Updated: (ZOOKEEPER-259) cleanup the logging levels used (use the correct level) and messages generated
[ https://issues.apache.org/jira/browse/ZOOKEEPER-259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benjamin Reed updated ZOOKEEPER-259:
------------------------------------
    Resolution: Fixed
  Hadoop Flags: [Reviewed]
        Status: Resolved  (was: Patch Available)

Committed revision 734857.

cleanup the logging levels used (use the correct level) and messages generated
-------------------------------------------------------------------------------

                 Key: ZOOKEEPER-259
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-259
             Project: Zookeeper
          Issue Type: Improvement
          Components: c client, java client, server
            Reporter: Patrick Hunt
            Assignee: Patrick Hunt
            Priority: Minor
             Fix For: 3.1.0
         Attachments: ZOOKEEPER-259.patch, ZOOKEEPER-259.patch, ZOOKEEPER-259.patch
Delaying 3.2 release by 2 to 3 weeks?
Hi all,

I needed to get quotas into zookeeper 3.2.0 and wanted to see if delaying the release by 2-3 weeks is ok with everyone. Here is the jira for it:

http://issues.apache.org/jira/browse/ZOOKEEPER-231

Please respond if you have any issues with the delay.

thanks
mahadev
Re: Delaying 3.1 release by 2 to 3 weeks?
That was release 3.1 and not 3.2 :)

mahadev

On 1/15/09 4:26 PM, Mahadev Konar <maha...@yahoo-inc.com> wrote:

> Hi all,
> I needed to get quotas in zookeeper 3.2.0 and wanted to see if delaying
> the release by 2-3 weeks is ok with everyone? Here is the jira for it -
> http://issues.apache.org/jira/browse/ZOOKEEPER-231
> Please respond if you have any issues with the delay.
> thanks
> mahadev
[jira] Updated: (ZOOKEEPER-268) toString on jute-generated objects can cause NPE
[ https://issues.apache.org/jira/browse/ZOOKEEPER-268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benjamin Reed updated ZOOKEEPER-268:
------------------------------------
  Hadoop Flags: [Reviewed]

+1

toString on jute-generated objects can cause NPE
------------------------------------------------

                 Key: ZOOKEEPER-268
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-268
             Project: Zookeeper
          Issue Type: Bug
          Components: java client, server
    Affects Versions: 3.0.0, 3.0.1
            Reporter: Patrick Hunt
            Assignee: Patrick Hunt
             Fix For: 3.1.0
         Attachments: ZOOKEEPER-268.patch

Jute is still causing problems with toString operations on generated code; we need to review/clean up the toCSV code.

From user Kevin Burton:

Creating this node with this ACL:
  Created /foo
  setAcl /foo world:anyone:w
causes the exception included below. It's an infinite loop, so it's just called over and over again, filling my console. I'm just doing an exists( path, true ); ... setting a watch still causes the problem.

java.lang.NullPointerException
        at org.apache.jute.Utils.toCSVBuffer(Utils.java:234)
        at org.apache.jute.CsvOutputArchive.writeBuffer(CsvOutputArchive.java:101)
        at org.apache.zookeeper.proto.GetDataResponse.toString(GetDataResponse.java:48)
        at java.lang.String.valueOf(String.java:2827)
        at java.lang.StringBuilder.append(StringBuilder.java:115)
        at org.apache.zookeeper.ClientCnxn$Packet.toString(ClientCnxn.java:230)
        at java.lang.String.valueOf(String.java:2827)
        at java.lang.StringBuilder.append(StringBuilder.java:115)
        at org.apache.zookeeper.ClientCnxn$SendThread.readResponse(ClientCnxn.java:586)
        at org.apache.zookeeper.ClientCnxn$SendThread.doIO(ClientCnxn.java:626)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:852)
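The trace shows the NPE originating while formatting a buffer field; below is a minimal sketch of the kind of null guard that would prevent it, assuming such a field can legitimately be unset. The signature is invented for illustration; the real org.apache.jute.Utils.toCSVBuffer operates on jute's own buffer type, and this is not the actual ZOOKEEPER-268 patch.

{code}
// Illustrative only: a standalone CSV-style hex formatter with the null
// guard that the failing formatting path apparently lacked. The signature
// is invented; it is not the real org.apache.jute.Utils method.
public final class CsvBufferFormat {
    public static String toCSVBuffer(byte[] data) {
        StringBuilder sb = new StringBuilder("#");
        if (data == null) {
            return sb.toString(); // tolerate unset fields instead of throwing
        }
        for (byte b : data) {
            sb.append(Integer.toHexString(b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toCSVBuffer(null));                         // "#"
        System.out.println(toCSVBuffer(new byte[] { 0x0a, (byte) 0xff })); // "#aff"
    }
}
{code}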
[jira] Updated: (ZOOKEEPER-275) Bug in FastLeaderElection
[ https://issues.apache.org/jira/browse/ZOOKEEPER-275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Flavio Paiva Junqueira updated ZOOKEEPER-275:
---------------------------------------------
    Fix Version/s: 3.1.0

Bug in FastLeaderElection
-------------------------

                 Key: ZOOKEEPER-275
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-275
             Project: Zookeeper
          Issue Type: Bug
          Components: leaderElection
            Reporter: Flavio Paiva Junqueira
            Assignee: Flavio Paiva Junqueira
             Fix For: 3.1.0
         Attachments: ZOOKEEPER-275.patch