[jira] Issue Comment Edited: (ZOOKEEPER-234) Eliminate using statics to initialize the server. Should allow server to be more embeddable in OSGi environments.

2009-01-15 Thread Flavio Paiva Junqueira (JIRA)

[ https://issues.apache.org/jira/browse/ZOOKEEPER-234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12664125#action_12664125 ]

fpj edited comment on ZOOKEEPER-234 at 1/15/09 6:59 AM:
---

+1. I have followed the steps Patrick describes, omitting of course the commit 
step. I have actually been able to connect using jconsole remotely, but in this 
case I had to use more parameters. Here is the command I used:

{code}
java -cp .:./zookeeper-dev.jar:/usr/local/apache-log4j-1.2.15/log4j-1.2.15.jar \
  -Dlog4j.configuration=log4j_console.properties \
  -Dcom.sun.management.jmxremote \
  -Dcom.sun.management.jmxremote.port=12122 \
  -Dcom.sun.management.jmxremote.local.only=false \
  -Dcom.sun.management.jmxremote.authenticate=false \
  -Dcom.sun.management.jmxremote.ssl=false \
  org.apache.zookeeper.server.quorum.QuorumPeerMain zoo.cfg
{code}

I'll open a jira to document these options upon Pat's request.
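To verify such a JMX endpoint programmatically rather than with jconsole, something along the following lines works. This is a sketch: for self-containedness it queries the local platform MBeanServer, and the remote-connection URL shown in the comment (mirroring port 12122 from the command above) is illustrative.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class JmxCheck {
    public static void main(String[] args) throws Exception {
        // For a remote server started with the flags above, one would use:
        //   JMXServiceURL url = new JMXServiceURL(
        //       "service:jmx:rmi:///jndi/rmi://<host>:12122/jmxrmi");
        //   MBeanServerConnection conn =
        //       JMXConnectorFactory.connect(url).getMBeanServerConnection();
        // Locally, the platform MBeanServer is enough to demonstrate a query:
        MBeanServer conn = ManagementFactory.getPlatformMBeanServer();
        ObjectName runtime = new ObjectName("java.lang:type=Runtime");
        System.out.println(conn.isRegistered(runtime));
    }
}
```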

  was (Author: fpj):
+1. I have followed the steps Patrick describes, omitting of course the 
commit step. I have actually been able to connect using jconsole remotely, but 
in this case I had to use more parameters. Here is the command I used:

java -cp .:./zookeeper-dev.jar:/usr/local/apache-log4j-1.2.15/log4j-1.2.15.jar 
-Dlog4j.configuration=log4j_console.properties -Dcom.sun.management.jmxremote 
-Dcom.sun.management.jmxremote.port=12122 
-Dcom.sun.management.jmxremote.local.only=false 
-Dcom.sun.management.jmxremote.authenticate=false 
org.apache.zookeeper.server.quorum.QuorumPeerMain zoo.cfg 

I'll open a jira to document these options upon Pat's request.
  
 Eliminate using statics to initialize the server.  Should allow server to be 
 more embeddable in OSGi environments.
 

 Key: ZOOKEEPER-234
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-234
 Project: Zookeeper
  Issue Type: Improvement
  Components: server
Reporter: Hiram Chirino
Assignee: Patrick Hunt
 Fix For: 3.1.0

 Attachments: ZOOKEEPER-234_step1.patch, ZOOKEEPER-234_step3.patch


 Patrick requested that I open this issue in this [email 
 thread|http://n2.nabble.com/ActiveMQ-is-now-using-ZooKeeper-td1573272.html]
 The main culprit I've noticed is:
 {code}
 ServerStats.registerAsConcrete();
 {code}
 But there may be others.
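The general shape of the de-staticizing change requested here can be sketched as follows. This is purely illustrative (the class and method names are invented, not the actual patch): each embedded server instance owns its own stats object instead of registering into shared static state.

```java
// Illustrative sketch: replace static singleton registration with an
// instance held by each server, so multiple embedded servers (e.g. in
// OSGi) don't share mutable static state. Not the actual patch.
public class StatsDemo {
    // Before (problematic): ServerStats.registerAsConcrete() populated
    // a static field shared by every server in the JVM.

    static class ServerStats {
        private long packetsSent;
        void incrementPacketsSent() { packetsSent++; }
        long getPacketsSent() { return packetsSent; }
    }

    static class EmbeddedServer {
        // After: each server owns its stats instance.
        private final ServerStats stats = new ServerStats();
        ServerStats serverStats() { return stats; }
    }

    public static void main(String[] args) {
        EmbeddedServer a = new EmbeddedServer();
        EmbeddedServer b = new EmbeddedServer();
        a.serverStats().incrementPacketsSent();
        // b's stats are unaffected: no shared static state.
        System.out.println(a.serverStats().getPacketsSent());
        System.out.println(b.serverStats().getPacketsSent());
    }
}
```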

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (ZOOKEEPER-275) Bug in FastLeaderElection

2009-01-15 Thread Flavio Paiva Junqueira (JIRA)
Bug in FastLeaderElection
-

 Key: ZOOKEEPER-275
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-275
 Project: Zookeeper
  Issue Type: Bug
  Components: leaderElection
Reporter: Flavio Paiva Junqueira
Assignee: Flavio Paiva Junqueira


I found an execution in which leader election does not make progress. Here is 
the problematic scenario:

- We have an ensemble of 3 servers, and we start only 2;
- We let them elect a leader, and then crash the one with lowest id, say S_1 
(call the other S_2);
- We restart the crashed server.

Upon restarting, S_1 has its logical clock set to 1, while S_2's logical clock 
is more advanced. Once S_1 receives a notification from S_2, it notices that 
it is in the wrong round and advances its logical clock to the same value as 
S_2's. Now, the problem arises exactly at this point, because in the current 
code S_1 resets its vote to its initial vote (its own id and zxid). Since S_2 
has already notified S_1, it won't do so again, and we are stuck. The patch 
I'm submitting fixes this problem by setting the vote of S_1 to the one 
received if it satisfies the total order predicate (the received zxid is 
higher, or the received zxid is the same and the received id is higher).

Related to this problem, I noticed that, in trying to avoid unnecessary 
duplicate notifications, we could create scenarios in which a server fails 
before a leader is elected and restarts before leader election succeeds. This 
could happen, for example, when there aren't enough servers available and one 
of the available ones crashes and restarts. I fixed this problem in the 
attached patch by allowing a server to send a new batch of notifications if 
at least one outgoing queue of pending notifications is empty. This is ok 
because we space out consecutive batches of notifications. 
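The total order predicate described above can be sketched as follows. This is a minimal illustration; the class, method name, and signature are assumptions for clarity, not the actual FastLeaderElection code.

```java
// Minimal sketch of the total-order predicate on (id, zxid) votes.
// Names and signature are illustrative, not ZooKeeper's actual code.
public class VoteOrder {
    // Returns true if the received vote (newId, newZxid) should
    // supersede the current vote (curId, curZxid).
    static boolean totalOrderPredicate(long newId, long newZxid,
                                       long curId, long curZxid) {
        // Received zxid is higher, or zxids are equal and received id is higher.
        return (newZxid > curZxid)
                || ((newZxid == curZxid) && (newId > curId));
    }

    public static void main(String[] args) {
        System.out.println(totalOrderPredicate(1, 5, 2, 4)); // higher zxid wins
        System.out.println(totalOrderPredicate(2, 4, 1, 4)); // equal zxid, higher id wins
        System.out.println(totalOrderPredicate(1, 4, 2, 4)); // equal zxid, lower id loses
    }
}
```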




[jira] Updated: (ZOOKEEPER-275) Bug in FastLeaderElection

2009-01-15 Thread Flavio Paiva Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Paiva Junqueira updated ZOOKEEPER-275:
-

Attachment: ZOOKEEPER-275.patch

Patch for the problems described. Includes a unit test for the first case. The 
second case is difficult and I still don't have a good idea of how to write a 
unit test for it. In particular, I haven't been able to crash and restart a 
peer in a unit test because when I kill the listener of QuorumCnxManager and 
try to create another instance, it complains that the port is in use. I tried 
using setReuseAddress(true) before binding, but it still doesn't work. 
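For reference, SO_REUSEADDR only takes effect when it is set on a still-unbound socket, before bind() is called; setting it on an already-bound ServerSocket has no effect on that binding. A minimal sketch with plain java.net (not the QuorumCnxManager code):

```java
import java.net.InetSocketAddress;
import java.net.ServerSocket;

public class ReuseAddr {
    public static void main(String[] args) throws Exception {
        // Create the socket unbound so the option can be set before bind().
        ServerSocket first = new ServerSocket();
        first.setReuseAddress(true);            // must precede bind()
        first.bind(new InetSocketAddress(0));   // bind to an ephemeral port
        int port = first.getLocalPort();
        first.close();

        // Rebind the same port immediately; with SO_REUSEADDR set before
        // bind(), this succeeds even if the address lingers in TIME_WAIT.
        ServerSocket second = new ServerSocket();
        second.setReuseAddress(true);
        second.bind(new InetSocketAddress(port));
        System.out.println(second.isBound());
        second.close();
    }
}
```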

 Bug in FastLeaderElection
 -

 Key: ZOOKEEPER-275
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-275
 Project: Zookeeper
  Issue Type: Bug
  Components: leaderElection
Reporter: Flavio Paiva Junqueira
Assignee: Flavio Paiva Junqueira
 Attachments: ZOOKEEPER-275.patch


 I found an execution in which leader election does not make progress. Here is 
 the problematic scenario:
 - We have an ensemble of 3 servers, and we start only 2;
 - We let them elect a leader, and then crash the one with lowest id, say S_1 
 (call the other S_2);
 - We restart the crashed server.
 Upon restarting, S_1 has its logical clock set to 1, while S_2's logical 
 clock is more advanced. Once S_1 receives a notification from S_2, it 
 notices that it is in the wrong round and advances its logical clock to the 
 same value as S_2's. Now, the problem arises exactly at this point, because 
 in the current code S_1 resets its vote to its initial vote (its own id and 
 zxid). Since S_2 has already notified S_1, it won't do so again, and we are 
 stuck. The patch I'm submitting fixes this problem by setting the vote of 
 S_1 to the one received if it satisfies the total order predicate (the 
 received zxid is higher, or the received zxid is the same and the received 
 id is higher).
 Related to this problem, I noticed that, in trying to avoid unnecessary 
 duplicate notifications, we could create scenarios in which a server fails 
 before a leader is elected and restarts before leader election succeeds. 
 This could happen, for example, when there aren't enough servers available 
 and one of the available ones crashes and restarts. I fixed this problem in 
 the attached patch by allowing a server to send a new batch of notifications 
 if at least one outgoing queue of pending notifications is empty. This is ok 
 because we space out consecutive batches of notifications. 




[jira] Updated: (ZOOKEEPER-273) Zookeeper c client build should not depend on CPPUNIT

2009-01-15 Thread Runping Qi (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Runping Qi updated ZOOKEEPER-273:
-

Attachment: patch_zookeeper_273.txt

 Zookeeper c client build should not depend on CPPUNIT
 -

 Key: ZOOKEEPER-273
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-273
 Project: Zookeeper
  Issue Type: Bug
  Components: c client
Reporter: Runping Qi
 Attachments: patch_zookeeper_273.txt


 One should be able to build the ZooKeeper C client libs on a machine without 
 a CppUnit installation.
 A simple fix is to remove the following line from configure.ac:
 AM_PATH_CPPUNIT(1.10.2)
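One way to apply that fix is the sketch below. It is a demonstration on a scratch configure.ac (the file contents here are invented, not ZooKeeper's actual configure.ac), assuming the macro appears on its own line.

```shell
# Demonstration on a scratch file: remove the CppUnit check so the
# C client can build on machines without CppUnit installed.
printf 'AC_INIT([zookeeper], [3.0.0])\nAM_PATH_CPPUNIT(1.10.2)\nAC_OUTPUT\n' > configure.ac
sed -i.bak '/AM_PATH_CPPUNIT/d' configure.ac
cat configure.ac
# Afterwards, regenerate the configure script, e.g. with: autoreconf -if
```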




[jira] Updated: (ZOOKEEPER-234) Eliminate using statics to initialize the server. Should allow server to be more embeddable in OSGi environments.

2009-01-15 Thread Patrick Hunt (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Hunt updated ZOOKEEPER-234:
---

  Resolution: Fixed
Hadoop Flags: [Reviewed]
  Status: Resolved  (was: Patch Available)

Committed revision 734847.

 Eliminate using statics to initialize the server.  Should allow server to be 
 more embeddable in OSGi environments.
 

 Key: ZOOKEEPER-234
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-234
 Project: Zookeeper
  Issue Type: Improvement
  Components: server
Reporter: Hiram Chirino
Assignee: Patrick Hunt
 Fix For: 3.1.0

 Attachments: ZOOKEEPER-234_step1.patch, ZOOKEEPER-234_step3.patch


 Patrick requested that I open this issue in this [email 
 thread|http://n2.nabble.com/ActiveMQ-is-now-using-ZooKeeper-td1573272.html]
 The main culprit I've noticed is:
 {code}
 ServerStats.registerAsConcrete();
 {code}
 But there may be others.




[jira] Updated: (ZOOKEEPER-259) cleanup the logging levels used (use the correct level) and messages generated

2009-01-15 Thread Benjamin Reed (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Reed updated ZOOKEEPER-259:


Attachment: ZOOKEEPER-259.patch

Fixed to work with latest code.

 cleanup the logging levels used (use the correct level) and messages generated
 --

 Key: ZOOKEEPER-259
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-259
 Project: Zookeeper
  Issue Type: Improvement
  Components: c client, java client, server
Reporter: Patrick Hunt
Assignee: Patrick Hunt
Priority: Minor
 Fix For: 3.1.0

 Attachments: ZOOKEEPER-259.patch, ZOOKEEPER-259.patch, 
 ZOOKEEPER-259.patch


 Cleanup logging:
 - make sure logging uses the correct level, especially ERROR and WARN
 - make sure the messages are meaningful (especially fix the fixmsg logs)
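The level cleanup described above amounts to matching severity to the situation: reserve ERROR for unrecoverable conditions and use WARN for conditions the server survives. A minimal illustration of the distinction (using java.util.logging for self-containedness, whereas ZooKeeper itself uses log4j; the method names are invented):

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class LevelsDemo {
    private static final Logger LOG =
            Logger.getLogger(LevelsDemo.class.getName());

    static void handleRetryableFailure(Exception e) {
        // Recoverable condition: WARNING (log4j WARN), not an error level.
        LOG.log(Level.WARNING, "connection lost, will retry", e);
    }

    static void handleFatalFailure(Exception e) {
        // Unrecoverable condition: SEVERE (log4j ERROR/FATAL).
        LOG.log(Level.SEVERE, "cannot recover, shutting down", e);
    }

    public static void main(String[] args) {
        handleRetryableFailure(new RuntimeException("timeout"));
        handleFatalFailure(new RuntimeException("corrupt snapshot"));
        System.out.println("done");
    }
}
```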




[jira] Updated: (ZOOKEEPER-259) cleanup the logging levels used (use the correct level) and messages generated

2009-01-15 Thread Benjamin Reed (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Reed updated ZOOKEEPER-259:


  Resolution: Fixed
Hadoop Flags: [Reviewed]
  Status: Resolved  (was: Patch Available)

Committed revision 734857. 

 cleanup the logging levels used (use the correct level) and messages generated
 --

 Key: ZOOKEEPER-259
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-259
 Project: Zookeeper
  Issue Type: Improvement
  Components: c client, java client, server
Reporter: Patrick Hunt
Assignee: Patrick Hunt
Priority: Minor
 Fix For: 3.1.0

 Attachments: ZOOKEEPER-259.patch, ZOOKEEPER-259.patch, 
 ZOOKEEPER-259.patch


 Cleanup logging:
 - make sure logging uses the correct level, especially ERROR and WARN
 - make sure the messages are meaningful (especially fix the fixmsg logs)




Delaying 3.2 release by 2 to 3 weeks?

2009-01-15 Thread Mahadev Konar
Hi all,
  I needed to get quotas in zookeeper 3.2.0 and wanted to see if delaying
the release by 2-3 weeks is ok with everyone?
Here is the jira for it -

http://issues.apache.org/jira/browse/ZOOKEEPER-231

Please respond if you have any issues with the delay.

thanks
mahadev




Re: Delaying 3.1 release by 2 to 3 weeks?

2009-01-15 Thread Mahadev Konar
That was release 3.1 and not 3.2 :)

mahadev


On 1/15/09 4:26 PM, Mahadev Konar maha...@yahoo-inc.com wrote:

 Hi all,
   I needed to get quotas in zookeeper 3.2.0 and wanted to see if delaying
 the release by 2-3 weeks is ok with everyone?
 Here is the jira for it -
 
 http://issues.apache.org/jira/browse/ZOOKEEPER-231
 
 Please respond if you have any issues with the delay.
 
 thanks
 mahadev
 
 



[jira] Updated: (ZOOKEEPER-268) toString on jute-generated objects can cause NPE

2009-01-15 Thread Benjamin Reed (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Reed updated ZOOKEEPER-268:


Hadoop Flags: [Reviewed]

+1

 toString on jute-generated objects can cause NPE
 

 Key: ZOOKEEPER-268
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-268
 Project: Zookeeper
  Issue Type: Bug
  Components: java client, server
Affects Versions: 3.0.0, 3.0.1
Reporter: Patrick Hunt
Assignee: Patrick Hunt
 Fix For: 3.1.0

 Attachments: ZOOKEEPER-268.patch


 Jute is still causing problems with toString operations on generated code; 
 we need to review/clean up the toCSV code.
 From user Kevin Burton:
 -
 Creating this node with this ACL:
 Created /foo
 setAcl /foo world:anyone:w
 Causes the exception included below.
 It's an infinite loop so it's just called over and over again filling my
 console.
 I'm just doing an exists( path, true ); ... setting a watch still causes the
 problem.
 {code}
 java.lang.NullPointerException
   at org.apache.jute.Utils.toCSVBuffer(Utils.java:234)
   at org.apache.jute.CsvOutputArchive.writeBuffer(CsvOutputArchive.java:101)
   at org.apache.zookeeper.proto.GetDataResponse.toString(GetDataResponse.java:48)
   at java.lang.String.valueOf(String.java:2827)
   at java.lang.StringBuilder.append(StringBuilder.java:115)
   at org.apache.zookeeper.ClientCnxn$Packet.toString(ClientCnxn.java:230)
   at java.lang.String.valueOf(String.java:2827)
   at java.lang.StringBuilder.append(StringBuilder.java:115)
   at org.apache.zookeeper.ClientCnxn$SendThread.readResponse(ClientCnxn.java:586)
   at org.apache.zookeeper.ClientCnxn$SendThread.doIO(ClientCnxn.java:626)
   at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:852)
 java.lang.NullPointerException
   ... (the same trace repeats)
 {code}
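The trace points at a buffer field that is null when the CSV serializer tries to render it. A null guard of the shape below would avoid the NPE; this is a hedged sketch of the idea only, with invented names and encoding, not the actual Utils.toCSVBuffer fix.

```java
// Sketch of a null-safe CSV buffer writer (illustrative only; the real
// code lives in org.apache.jute.Utils.toCSVBuffer).
public class CsvBuffer {
    static String toCSVBuffer(byte[] buf) {
        if (buf == null) {
            return "";  // emit an empty field instead of throwing an NPE
        }
        // '#' marks a hex-encoded buffer in this sketch's CSV dialect.
        StringBuilder sb = new StringBuilder("#");
        for (byte b : buf) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toCSVBuffer(null).isEmpty());
        System.out.println(toCSVBuffer(new byte[] {0x0f, (byte) 0xa0}));
    }
}
```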




[jira] Updated: (ZOOKEEPER-275) Bug in FastLeaderElection

2009-01-15 Thread Flavio Paiva Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Paiva Junqueira updated ZOOKEEPER-275:
-

Fix Version/s: 3.1.0

 Bug in FastLeaderElection
 -

 Key: ZOOKEEPER-275
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-275
 Project: Zookeeper
  Issue Type: Bug
  Components: leaderElection
Reporter: Flavio Paiva Junqueira
Assignee: Flavio Paiva Junqueira
 Fix For: 3.1.0

 Attachments: ZOOKEEPER-275.patch


 I found an execution in which leader election does not make progress. Here is 
 the problematic scenario:
 - We have an ensemble of 3 servers, and we start only 2;
 - We let them elect a leader, and then crash the one with lowest id, say S_1 
 (call the other S_2);
 - We restart the crashed server.
 Upon restarting, S_1 has its logical clock set to 1, while S_2's logical 
 clock is more advanced. Once S_1 receives a notification from S_2, it 
 notices that it is in the wrong round and advances its logical clock to the 
 same value as S_2's. Now, the problem arises exactly at this point, because 
 in the current code S_1 resets its vote to its initial vote (its own id and 
 zxid). Since S_2 has already notified S_1, it won't do so again, and we are 
 stuck. The patch I'm submitting fixes this problem by setting the vote of 
 S_1 to the one received if it satisfies the total order predicate (the 
 received zxid is higher, or the received zxid is the same and the received 
 id is higher).
 Related to this problem, I noticed that, in trying to avoid unnecessary 
 duplicate notifications, we could create scenarios in which a server fails 
 before a leader is elected and restarts before leader election succeeds. 
 This could happen, for example, when there aren't enough servers available 
 and one of the available ones crashes and restarts. I fixed this problem in 
 the attached patch by allowing a server to send a new batch of notifications 
 if at least one outgoing queue of pending notifications is empty. This is ok 
 because we space out consecutive batches of notifications. 
