Ruby client binding for ZooKeeper now available.
Sorry if this is a dup for those of you following me on Twitter (http://twitter.com/phunt), but I wanted to let you know that Twitter (the company) has contributed a Ruby client binding for ZooKeeper. You can learn more about the gem here: http://bit.ly/b9VB6k
Regards,
Patrick
[jira] Assigned: (ZOOKEEPER-552) add performance benchmark/docs for synchronous operations
[ https://issues.apache.org/jira/browse/ZOOKEEPER-552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Patrick Hunt reassigned ZOOKEEPER-552:
    Assignee: Benjamin Reed
Ben, any chance you can include this in the standard performance benchmarking you do as part of the release?
add performance benchmark/docs for synchronous operations
    Key: ZOOKEEPER-552
    URL: https://issues.apache.org/jira/browse/ZOOKEEPER-552
    Project: Zookeeper
    Issue Type: Task
    Components: c client, java client, server
    Reporter: Patrick Hunt
    Assignee: Benjamin Reed
    Fix For: 3.3.0
We currently benchmark async operations, but not sync. It would be good to benchmark sync operations so that users know what to expect and what the tradeoffs are. Also, afaik we currently only benchmark the java client; we should benchmark the c client as well (if only to ensure perf is comparable).
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-515) Zookeeper quorum didn't provide service when restart after an Out of memory crash
[ https://issues.apache.org/jira/browse/ZOOKEEPER-515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Patrick Hunt updated ZOOKEEPER-515:
    Fix Version/s: 3.4.0 (was: 3.3.0)
Zookeeper quorum didn't provide service when restart after an Out of memory crash
    Key: ZOOKEEPER-515
    URL: https://issues.apache.org/jira/browse/ZOOKEEPER-515
    Project: Zookeeper
    Issue Type: Bug
    Components: server
    Affects Versions: 3.2.0
    Environment: Linux 2.6.9-52bs-4core #2 SMP Wed Jan 16 14:44:08 EST 2008 x86_64 x86_64 x86_64 GNU/Linux; Jdk: 1.6.0_14
    Reporter: Qian Ye
    Fix For: 3.4.0
The Zookeeper quorum, containing 5 servers, didn't provide service when restarted after an out-of-memory crash. It happened as follows:
1. We built a Zookeeper quorum of 5 servers, say 1, 3, 4, 5, 6 (there is no 2), and 6 was the leader.
2. We created 18 threads on 6 different servers to set and get data from a znode in Zookeeper at the same time. The size of the data is 1MB. The test threads did their job as fast as possible, with no pause between two operations, and they repeated the setting and getting 4000 times.
3. The Zookeeper leader crashed about 10 mins after the test threads started.
The leader printed out the log: 2009-08-25 12:00:12,301 - WARN [NIOServerCxn.Factory:2181:nioserverc...@497] - Exception causing close of session 0x523 4223c2dc00b5 due to java.io.IOException: Read error 2009-08-25 12:00:12,318 - WARN [NIOServerCxn.Factory:2181:nioserverc...@497] - Exception causing close of session 0x523 4223c2dc00b6 due to java.io.IOException: Read error 2009-08-25 12:03:44,086 - WARN [NIOServerCxn.Factory:2181:nioserverc...@497] - Exception causing close of session 0x523 4223c2dc00b8 due to java.io.IOException: Read error 2009-08-25 12:04:53,757 - WARN [NIOServerCxn.Factory:2181:nioserverc...@497] - Exception causing close of session 0x523 4223c2dc00b7 due to java.io.IOException: Read error 2009-08-25 12:15:45,151 - FATAL [SyncThread:0:syncrequestproces...@131] - Severe unrecoverable error, exiting java.lang.OutOfMemoryError: Java heap space at java.util.Arrays.copyOf(Arrays.java:2786) at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:71) at java.io.DataOutputStream.writeInt(DataOutputStream.java:180) at org.apache.jute.BinaryOutputArchive.writeInt(BinaryOutputArchive.java:55) at org.apache.zookeeper.txn.SetDataTxn.serialize(SetDataTxn.java:42) at org.apache.zookeeper.server.persistence.Util.marshallTxnEntry(Util.java:262) at org.apache.zookeeper.server.persistence.FileTxnLog.append(FileTxnLog.java:154) at org.apache.zookeeper.server.persistence.FileTxnSnapLog.append(FileTxnSnapLog.java:268) at org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:100) It is clear that the leader ran out of memory. 
then the server 4 was down almost at the same time, and printed out the log: 2009-08-25 12:15:45,995 - ERROR [FollowerRequestProcessor:3:followerrequestproces...@91] - Unexpected exception causing exit java.net.SocketException: Connection reset at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:96) at java.net.SocketOutputStream.write(SocketOutputStream.java:136) at java.io.BufferedOutputStream.write(BufferedOutputStream.java:105) at java.io.DataOutputStream.write(DataOutputStream.java:90) at java.io.FilterOutputStream.write(FilterOutputStream.java:80) at org.apache.jute.BinaryOutputArchive.writeBuffer(BinaryOutputArchive.java:119) at org.apache.zookeeper.server.quorum.QuorumPacket.serialize(QuorumPacket.java:51) at org.apache.jute.BinaryOutputArchive.writeRecord(BinaryOutputArchive.java:123) at org.apache.zookeeper.server.quorum.Follower.writePacket(Follower.java:97) at org.apache.zookeeper.server.quorum.Follower.request(Follower.java:399) at org.apache.zookeeper.server.quorum.FollowerRequestProcessor.run(FollowerRequestProcessor.java:86) 2009-08-25 12:15:45,996 - WARN [NIOServerCxn.Factory:2181:nioserverc...@497] - Exception causing close of session 0x423 4ab894330075 due to java.net.SocketException: Broken pipe 2009-08-25 12:15:45,996 - FATAL [SyncThread:3:syncrequestproces...@131] - Severe unrecoverable error, exiting java.net.SocketException: Broken pipe at java.net.SocketOutputStream.socketWrite0(Native Method) at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92) at java.net.SocketOutputStream.write(SocketOutputStream.java:136) at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65) at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
[jira] Updated: (ZOOKEEPER-474) add compile, test, and improved package targets to zkperl build.xml
[ https://issues.apache.org/jira/browse/ZOOKEEPER-474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-474: --- Fix Version/s: (was: 3.3.0) 3.4.0 add compile, test, and improved package targets to zkperl build.xml --- Key: ZOOKEEPER-474 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-474 Project: Zookeeper Issue Type: Improvement Components: contrib Affects Versions: 3.2.0 Reporter: Chris Darroch Assignee: Chris Darroch Fix For: 3.4.0 Attachments: zk474_testout.txt, ZOOKEEPER-474.patch This patch adds compile and test targets to the zkperl build.xml, and tweaks the package target a little to use the manifest file. For me, ant compile, ant test, and ant clean all work (from scratch, in each case) when using Ant in the local src/contrib/zkperl directory. Further, ant package in the top-level directory seems to continue to build zkperl along with everything else, and leaves out the build.xml and t/zkServer.sh files, which is appropriate. From what I can see, the top-level build.xml doesn't actually invoke the test-contrib target, so I'm not sure if there's a way to integrate the zkperl tests into the main hudson automated test process, but that would be ideal, if at all possible. I feel like I've seen comments to the effect that the zkpython tests are run automatically, but I'm not sure if that's actually true or not. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-575) remove System.exit calls to make the server more container friendly
[ https://issues.apache.org/jira/browse/ZOOKEEPER-575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-575: --- Fix Version/s: (was: 3.3.0) 3.4.0 remove System.exit calls to make the server more container friendly --- Key: ZOOKEEPER-575 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-575 Project: Zookeeper Issue Type: Improvement Components: server Reporter: Patrick Hunt Fix For: 3.4.0 There are a handful of places left in the code that still use System.exit, we should remove these to make the server more container friendly. There are some legitimate places for the exits - in *Main.java for example should be fine - these are the command line main routines. Containers should be embedding code that runs just below this layer (or we should refactor so that it would). The tricky bit is ensuring the server shuts down in case of an unrecoverable error occurring, afaik these are the locations where we still have sys exit calls. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-636) configure.ac has instructions which override the contents of CFLAGS and CXXFLAGS.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-636: --- Fix Version/s: (was: 3.3.0) 3.4.0 configure.ac has instructions which override the contents of CFLAGS and CXXFLAGS. - Key: ZOOKEEPER-636 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-636 Project: Zookeeper Issue Type: Improvement Components: build, c client Affects Versions: 3.2.1 Reporter: Maxim P. Dementiev Assignee: Maxim P. Dementiev Fix For: 3.4.0 The information mustn't be overridden. The template like «CFLAGS=$CFLAGS -some-option» should be used. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-642) exceeded deadline by N ms floods logs
[ https://issues.apache.org/jira/browse/ZOOKEEPER-642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Patrick Hunt updated ZOOKEEPER-642:
    Fix Version/s: 3.4.0 (was: 3.3.0)
exceeded deadline by N ms floods logs
    Key: ZOOKEEPER-642
    URL: https://issues.apache.org/jira/browse/ZOOKEEPER-642
    Project: Zookeeper
    Issue Type: Bug
    Components: c client
    Affects Versions: 3.2.1
    Environment: virtualized linux - ec2 - ubuntu
    Reporter: Dale Johnson
    Fix For: 3.4.0
More important zookeeper warnings are drowned out by the following message, which is logged several times per minute:
    2010-01-12 17:39:57,227:22317(0x4147eb90):zoo_w...@zookeeper_interest@1335: Exceeded deadline by 13ms
Perhaps this is an issue with the way virtualized systems manage gettimeofday results? Maybe the current 10ms threshold could be pushed up a bit. I notice that 95% of the messages are below 50ms. Is there an obvious configuration change that I can make to fix this? Config file below:
    # The number of milliseconds of each tick
    tickTime=2000
    # The number of ticks that the initial
    # synchronization phase can take
    initLimit=10
    # The number of ticks that can pass between
    # sending a request and getting an acknowledgement
    syncLimit=5
    # the directory where the snapshot is stored.
    dataDir=/mnt/zookeeper
    # the port at which the clients will connect
    clientPort=2181
    server.1=hbase.1:2888:3888
    server.2=hbase.2:2888:3888
    server.3=hbase.3:2888:3888
    server.4=hbase.4:2888:3888
    server.5=hbase.5:2888:3888
[jira] Updated: (ZOOKEEPER-666) Unsafe publication in client API
[ https://issues.apache.org/jira/browse/ZOOKEEPER-666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Patrick Hunt updated ZOOKEEPER-666:
    Fix Version/s: 3.4.0 (was: 3.3.0)
Unsafe publication in client API
    Key: ZOOKEEPER-666
    URL: https://issues.apache.org/jira/browse/ZOOKEEPER-666
    Project: Zookeeper
    Issue Type: Bug
    Components: java client
    Affects Versions: 3.2.2
    Reporter: Martin Traverso
    Fix For: 3.4.0
The following code may result in a data race due to unsafe publication of a reference to this. The call to cnxn.start() spawns threads that have access to the partially-constructed reference to the ZooKeeper object. See http://www.ibm.com/developerworks/java/library/j-jtp0618.html for some background info.
{noformat}
public ZooKeeper(String connectString, int sessionTimeout, Watcher watcher)
        throws IOException {
    ...
    cnxn = new ClientCnxn(connectString, sessionTimeout, this, watchManager);
    cnxn.start();
}
{noformat}
The obvious fix is to move the call to cnxn.start() into a separate start() method.
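The construct-then-start idea proposed in ZOOKEEPER-666 can be sketched as follows. This is a Python analogy, not the Java client: in Python the hazard shows up as a thread reading attributes that the constructor has not assigned yet, rather than as a Java memory-model data race. The class Client and its members are hypothetical names used only for illustration.

```python
import threading


class Client:
    """Hypothetical stand-in for a client object that owns a worker thread."""

    def __init__(self, watcher):
        self.seen = []
        # The thread object is created here but NOT started, so no running
        # code can observe `self` while __init__ is still assigning fields.
        self._worker = threading.Thread(target=self._run)
        self.watcher = watcher  # assigned after the thread object exists

    def start(self):
        """Start the worker only after construction has finished."""
        self._worker.start()
        return self

    def _run(self):
        # Every field set in __init__ already exists by the time this runs.
        self.seen.append(self.watcher)

    def join(self):
        self._worker.join()
```

A caller would write `client = Client(watcher).start()`. The cost of this design is that callers must remember the extra start() call, which is the tradeoff the issue accepts in exchange for never exposing a half-built object.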
[jira] Updated: (ZOOKEEPER-526) Change the access pattern in the DataNode from direct access to the use of getters and setters
[ https://issues.apache.org/jira/browse/ZOOKEEPER-526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-526: --- Fix Version/s: (was: 3.3.0) 3.4.0 Change the access pattern in the DataNode from direct access to the use of getters and setters -- Key: ZOOKEEPER-526 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-526 Project: Zookeeper Issue Type: Improvement Components: server Reporter: Erik Holstad Priority: Minor Fix For: 3.4.0 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-525) Changing children Set<String> to Set<byte[]> in DataNode
[ https://issues.apache.org/jira/browse/ZOOKEEPER-525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Patrick Hunt updated ZOOKEEPER-525:
    Fix Version/s: 3.4.0 (was: 3.3.0)
Changing children Set<String> to Set<byte[]> in DataNode
    Key: ZOOKEEPER-525
    URL: https://issues.apache.org/jira/browse/ZOOKEEPER-525
    Project: Zookeeper
    Issue Type: Improvement
    Components: server
    Reporter: Erik Holstad
    Priority: Minor
    Fix For: 3.4.0
On a 64-bit system, every String instance carries an overhead of 48B compared to using byte[], which seems unnecessary.
[jira] Updated: (ZOOKEEPER-271) Better command line parsing in ZookeeperMain.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Patrick Hunt updated ZOOKEEPER-271:
    Fix Version/s: 3.4.0 (was: 3.3.0)
Better command line parsing in ZookeeperMain.
    Key: ZOOKEEPER-271
    URL: https://issues.apache.org/jira/browse/ZOOKEEPER-271
    Project: Zookeeper
    Issue Type: Improvement
    Components: java client
    Affects Versions: 3.0.0, 3.0.1
    Reporter: Mahadev konar
    Priority: Minor
    Fix For: 3.4.0
The command line parsing in zookeepermain is very basic. We should use a standard CLI parsing library (commons-cli?) or something similar to improve it. This will remove the scattered parsing code that we have in zookeepermain and give us much better command line parsing.
[jira] Updated: (ZOOKEEPER-616) Provide a function to parse out the name and the sequence number from a zknode path
[ https://issues.apache.org/jira/browse/ZOOKEEPER-616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-616: --- Fix Version/s: (was: 3.3.0) 3.4.0 Provide a function to parse out the name and the sequence number from a zknode path --- Key: ZOOKEEPER-616 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-616 Project: Zookeeper Issue Type: New Feature Components: c client, java client Affects Versions: 3.0.0, 3.0.1, 3.1.0, 3.1.1, 3.2.0, 3.2.1 Reporter: Avery Ching Priority: Minor Fix For: 3.4.0 Given a zookeeper path and knowing it was created with the SEQUENCE flag, it would be nice to be able to get the sequence number and the name. Currently, it is not documented how many bytes the sequence number uses in the path (Mahadev told me 10 for 3.1.1 for example), and having a function to retrieve this data would hide the actual number of bytes used and provide the useful functionality for users. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
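A minimal sketch of the kind of helper ZOOKEEPER-616 asks for, written in Python rather than as part of either client. It assumes the sequence suffix is exactly 10 decimal digits, which is the figure quoted above for 3.1.1 and not a documented guarantee. The function name split_sequence_path and its digits parameter are hypothetical.

```python
def split_sequence_path(path, digits=10):
    """Split a sequential znode path into (name, sequence number).

    Assumes exactly `digits` decimal digits were appended to the node name;
    10 is the value reported for 3.1.1, not a documented constant.
    """
    node = path.rsplit("/", 1)[-1]  # last path component
    suffix = node[-digits:]
    if len(suffix) != digits or not (suffix.isascii() and suffix.isdigit()):
        raise ValueError("no %d-digit sequence suffix in %r" % (digits, path))
    return node[:-digits], int(suffix)
```

Keeping the width in one parameter is the point of the request: callers never hard-code the number of digits, so a change in the server's format would only touch this helper.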
[jira] Updated: (ZOOKEEPER-262) unnecessarily complex reentrant zookeeper_close() logic
[ https://issues.apache.org/jira/browse/ZOOKEEPER-262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-262: --- Fix Version/s: (was: 3.3.0) 3.4.0 unnecesssarily complex reentrant zookeeper_close() logic Key: ZOOKEEPER-262 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-262 Project: Zookeeper Issue Type: Improvement Components: c client Affects Versions: 3.0.0, 3.0.1, 3.1.0, 3.2.0, 4.0.0 Reporter: Chris Darroch Priority: Minor Fix For: 3.4.0 Attachments: ZOOKEEPER-262.patch, ZOOKEEPER-262.patch, zookeeper-close.patch While working on a wrapper for the C API I puzzled over the problem of how to determine when the multi-threaded adaptor's IO and completion threads had exited. Looking at the code in api_epilog() and adaptor_finish() it seemed clear that any thread could be the last one out the door, and whichever was last would turn out the lights by calling zookeeper_close(). However, on further examination I found that in fact, the close_requested flag guards entry to zookeeper_close() in api_epilog(), and close_requested can only be set non-zero within zookeeper_close(). Thus, only the user's main thread can invoke zookeeper_close() and kick off the shutdown process. When that happens, zookeeper_close() then invokes adaptor_finish() and returns ZOK immediately afterward. Since adaptor_finish() is only called in this one context, it means all the code in that function to check pthread_self() and call pthread_detach() if the current thread is the IO or completion thread is redundant. The adaptor_finish() function always signals and then waits to join with the IO and completion threads because it can only be called by the user's main thread. After joining with the two internal threads, adaptor_finish() calls api_epilog(), which might seem like a trivial final action. 
However, this is actually where all the work gets done, because in this one case, api_epilog() sees a non-zero close_requested flag value and invokes zookeeper_close(). Note that zookeeper_close() is already on the stack; this is a re-entrant invocation. This time around, zookeeper_close() skips the call to adaptor_finish() -- assuming the reference count has been properly decremented to zero! -- and does the actual final cleanup steps, including deallocating the zh structure. Fortunately, none of the callers on the stack (api_epilog(), adaptor_finish(), and the first zookeeper_close()) touches zh after this. This all works OK, and in particular, the fact that I can be certain that the IO and completion threads have exited after zookeeper_close() returns is great. So too is the fact that those threads can't invoke zookeeper_close() without my knowing about it. However, the actual mechanics of the shutdown seem unnecessarily complex. I'd be worried a bit about a new maintainer looking at adaptor_finish() and reasonably concluding that it can be called by any thread, including the IO and completion ones. Or thinking that the zh handle can still be used after that innocuous-looking call to adaptor_finish() in zookeeper_close() -- the one that actually causes all the work to be done and the handle to be deallocated! I'll attach a patch which I think simplifies the code a bit and makes the shutdown mechanics a little more clear, and might prevent unintentional errors in the future. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-463) C++ tests can't be built on Mac OS using XCode command line tools
[ https://issues.apache.org/jira/browse/ZOOKEEPER-463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-463: --- Fix Version/s: (was: 3.3.0) 3.4.0 C++ tests can't be built on Mac OS using XCode command line tools - Key: ZOOKEEPER-463 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-463 Project: Zookeeper Issue Type: Bug Components: tests Affects Versions: 3.2.0 Environment: Using latest XCode 3.1.3. [apache-zookeeper/bin]$ ld -v @(#)PROGRAM:ld PROJECT:ld64-85.2.1 Reporter: Henry Robinson Priority: Minor Fix For: 3.4.0 --wrap is an unsupported command line flag for ld on Mac OS. The cppunit tests therefore won't build. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-545) investigate use of realtime gc as the recommended default for server vm
[ https://issues.apache.org/jira/browse/ZOOKEEPER-545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Patrick Hunt updated ZOOKEEPER-545:
    Fix Version/s: 3.4.0 (was: 3.3.0)
Still not 100% sure on this one, pushing to 3.4.0.
investigate use of realtime gc as the recommended default for server vm
    Key: ZOOKEEPER-545
    URL: https://issues.apache.org/jira/browse/ZOOKEEPER-545
    Project: Zookeeper
    Issue Type: Improvement
    Components: server
    Reporter: Patrick Hunt
    Priority: Critical
    Fix For: 3.4.0
We currently don't recommend that people use the realtime gc when running the server; we probably should. Before we do so we need to verify that it works.
We should make it the default for all our tests.
Concurrent vs g2 or whatever it's called (new in 1.6_15 or something?)
Update all scripts to specify this option.
Update documentation to include this option, and add a section in the dev/ops docs detailing its benefits (in particular the latency effects of gc).
Also, the -server option? Any benefit for us to recommend this as well?
[jira] Commented: (ZOOKEEPER-569) Failure of elected leader can lead to never-ending leader election
[ https://issues.apache.org/jira/browse/ZOOKEEPER-569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12835970#action_12835970 ]
Patrick Hunt commented on ZOOKEEPER-569:
Henry, there are two patches; please highlight which one the reviewer should review. thx
Failure of elected leader can lead to never-ending leader election
    Key: ZOOKEEPER-569
    URL: https://issues.apache.org/jira/browse/ZOOKEEPER-569
    Project: Zookeeper
    Issue Type: Bug
    Reporter: Henry Robinson
    Assignee: Henry Robinson
    Fix For: 3.3.0
    Attachments: zookeeper-569.patch, ZOOKEEPER-569.patch, zookeeper-569.patch, zookeeper-569.patch, zookeeper-569.patch, zookeeper-569.patch
It is possible for basic LeaderElection to enter a situation where it never terminates. As an example, consider a three node cluster A, B and C.
1. In the first round, A votes for A, B votes for B and C votes for C.
2. Since C > B > A, all nodes resolve to vote for C in the second round as there is no first round winner.
3. A, B vote for C, but C fails.
4. C is not elected because neither A nor B hear from it, and so votes for it are discarded.
5. A and B never reset their votes, despite not hearing from C, so continue to vote for it ad infinitum.
Step 5 is the bug. If A and B reset their votes to themselves in the case where the heard-from vote set is empty, leader election will continue. I do not know if this affects running ZK clusters, as it is possible that the out-of-band failure detection protocols may cause leader election to be restarted anyhow, but I've certainly seen this in tests. I have a trivial patch which fixes it, but it needs a test (and tests for race conditions are hard to write!)
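The five steps in the ZOOKEEPER-569 description can be sketched as a toy model. This is not ZooKeeper's LeaderElection code: it only mirrors the rules stated in the report (votes for silent nodes are discarded, the highest id wins a tie, a strict majority of the cluster is needed), and run_election with all of its parameters is a hypothetical name chosen for illustration.

```python
def run_election(alive, votes, cluster_size, reset_when_empty, max_rounds=10):
    """Toy model of the reported scenario; returns the elected node or None.

    alive: set of nodes that still respond; votes: node -> current vote.
    """
    votes = dict(votes)
    for _ in range(max_rounds):
        tally = {}
        for node in sorted(alive):
            candidate = votes[node]
            if candidate in alive:  # discard votes for nodes not heard from
                tally[candidate] = tally.get(candidate, 0) + 1
        if not tally:
            if reset_when_empty:  # proposed fix: vote for yourself again
                votes = {node: node for node in alive}
            continue  # without the fix nothing ever changes (step 5)
        best = max(tally, key=lambda c: (tally[c], c))
        if tally[best] > cluster_size // 2:  # strict majority of the cluster
            return best
        top = max(tally)  # no winner yet: everyone moves to the highest id
        votes = {node: top for node in alive}
    return None
```

With A and B both stuck voting for the failed C, the model returns None when reset_when_empty is False and elects B when it is True, which matches the behaviour the report describes.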
[jira] Updated: (ZOOKEEPER-569) Failure of elected leader can lead to never-ending leader election
[ https://issues.apache.org/jira/browse/ZOOKEEPER-569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Patrick Hunt updated ZOOKEEPER-569:
    Status: Patch Available (was: Open)
Failure of elected leader can lead to never-ending leader election
    Key: ZOOKEEPER-569
    URL: https://issues.apache.org/jira/browse/ZOOKEEPER-569
    Project: Zookeeper
    Issue Type: Bug
    Reporter: Henry Robinson
    Assignee: Henry Robinson
    Fix For: 3.3.0
    Attachments: zookeeper-569.patch, ZOOKEEPER-569.patch, zookeeper-569.patch, zookeeper-569.patch, zookeeper-569.patch, zookeeper-569.patch
It is possible for basic LeaderElection to enter a situation where it never terminates. As an example, consider a three node cluster A, B and C.
1. In the first round, A votes for A, B votes for B and C votes for C.
2. Since C > B > A, all nodes resolve to vote for C in the second round as there is no first round winner.
3. A, B vote for C, but C fails.
4. C is not elected because neither A nor B hear from it, and so votes for it are discarded.
5. A and B never reset their votes, despite not hearing from C, so continue to vote for it ad infinitum.
Step 5 is the bug. If A and B reset their votes to themselves in the case where the heard-from vote set is empty, leader election will continue. I do not know if this affects running ZK clusters, as it is possible that the out-of-band failure detection protocols may cause leader election to be restarted anyhow, but I've certainly seen this in tests. I have a trivial patch which fixes it, but it needs a test (and tests for race conditions are hard to write!)
[jira] Updated: (ZOOKEEPER-543) Tests for ZooKeeper examples
[ https://issues.apache.org/jira/browse/ZOOKEEPER-543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-543: --- Status: Patch Available (was: Open) Tests for ZooKeeper examples Key: ZOOKEEPER-543 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-543 Project: Zookeeper Issue Type: New Feature Components: tests Affects Versions: 3.3.0 Reporter: Steven Cheng Assignee: Steven Cheng Priority: Minor Fix For: 3.3.0 Attachments: ZOOKEEPER-543.patch, ZOOKEEPER-543.patch, ZOOKEEPER-543.patch Initial attempt to create ZooKeeper tests based on the example code on the website. Current plan is to test features used in examples using ZooKeeper calls directly. Another approach would be to make more usable abstractions such as those in src/recipes and test those. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-543) Tests for ZooKeeper examples
[ https://issues.apache.org/jira/browse/ZOOKEEPER-543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-543: --- Status: Open (was: Patch Available) Tests for ZooKeeper examples Key: ZOOKEEPER-543 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-543 Project: Zookeeper Issue Type: New Feature Components: tests Affects Versions: 3.3.0 Reporter: Steven Cheng Assignee: Steven Cheng Priority: Minor Fix For: 3.3.0 Attachments: ZOOKEEPER-543.patch, ZOOKEEPER-543.patch, ZOOKEEPER-543.patch Initial attempt to create ZooKeeper tests based on the example code on the website. Current plan is to test features used in examples using ZooKeeper calls directly. Another approach would be to make more usable abstractions such as those in src/recipes and test those. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (ZOOKEEPER-543) Tests for ZooKeeper examples
[ https://issues.apache.org/jira/browse/ZOOKEEPER-543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12835971#action_12835971 ]
Patrick Hunt commented on ZOOKEEPER-543:
Steven, the only reason I can think of is that the mocks are not set up to handle sequential?
Tests for ZooKeeper examples
    Key: ZOOKEEPER-543
    URL: https://issues.apache.org/jira/browse/ZOOKEEPER-543
    Project: Zookeeper
    Issue Type: New Feature
    Components: tests
    Affects Versions: 3.3.0
    Reporter: Steven Cheng
    Assignee: Steven Cheng
    Priority: Minor
    Fix For: 3.3.0
    Attachments: ZOOKEEPER-543.patch, ZOOKEEPER-543.patch, ZOOKEEPER-543.patch
Initial attempt to create ZooKeeper tests based on the example code on the website. Current plan is to test features used in examples using ZooKeeper calls directly. Another approach would be to make more usable abstractions such as those in src/recipes and test those.
[jira] Commented: (ZOOKEEPER-640) make build.xml more configurable to ease packaging for linux distros
[ https://issues.apache.org/jira/browse/ZOOKEEPER-640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12835974#action_12835974 ]
Patrick Hunt commented on ZOOKEEPER-640:
No update from Thomas - committer please review/commit this asap.
make build.xml more configurable to ease packaging for linux distros
    Key: ZOOKEEPER-640
    URL: https://issues.apache.org/jira/browse/ZOOKEEPER-640
    Project: Zookeeper
    Issue Type: Improvement
    Components: build
    Affects Versions: 3.2.1, 3.2.2
    Reporter: Thomas Koch
    Assignee: Patrick Hunt
    Fix For: 3.3.0
    Attachments: ZOOKEEPER-640.patch
    Original Estimate: 0.25h
    Remaining Estimate: 0.25h
Hi, I started packaging Zookeeper for Debian[1][2]. Thereby I had a problem excluding contrib/rest from the build without patching the upstream tarball. Could you please add some properties to your build.xml that allow me to (de)select contribs? In the example below I can easily override the properties:
    <project name="zookeepercontrib">
      <property name="contribfilesetincludes" value="*/build.xml" />
      <property name="contribfilesetexcludes" value="" />
      <fileset id="contribfileset" dir="." includes="${contribfilesetincludes}" excludes="${contribfilesetexcludes}" />
      <target name="compile">
        <subant target="jar">
          <fileset refid="contribfileset" />
        </subant>
      </target>
Could you please also add a line to project.classpath:
    <path id="project.classpath">
      <fileset dir="${additional.lib.dir}" includes="*.jar"/>
For Debian I may not compile based on the jar files in lib but must use the jars already in Debian.
[1] http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=561947
[2] http://git.debian.org/?p=pkg-java/zookeeper.git
Thank you!
[jira] Updated: (ZOOKEEPER-669) watchedevent tostring should clearly output the state/type/path
[ https://issues.apache.org/jira/browse/ZOOKEEPER-669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-669: --- Status: Patch Available (was: Open) watchedevent tostring should clearly output the state/type/path --- Key: ZOOKEEPER-669 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-669 Project: Zookeeper Issue Type: Bug Affects Versions: 3.2.2, 3.1.2 Reporter: Patrick Hunt Assignee: Patrick Hunt Priority: Critical Fix For: 3.3.0 Attachments: ZOOKEEPER-669.patch the current tostring method is broken -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-669) watchedevent tostring should clearly output the state/type/path
[ https://issues.apache.org/jira/browse/ZOOKEEPER-669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-669: --- Status: Open (was: Patch Available) watchedevent tostring should clearly output the state/type/path --- Key: ZOOKEEPER-669 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-669 Project: Zookeeper Issue Type: Bug Affects Versions: 3.2.2, 3.1.2 Reporter: Patrick Hunt Assignee: Patrick Hunt Priority: Critical Fix For: 3.3.0 Attachments: ZOOKEEPER-669.patch the current tostring method is broken -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-601) allow configuration of session timeout min/max bounds
[ https://issues.apache.org/jira/browse/ZOOKEEPER-601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-601: --- Status: Patch Available (was: Open) allow configuration of session timeout min/max bounds - Key: ZOOKEEPER-601 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-601 Project: Zookeeper Issue Type: Improvement Components: server Affects Versions: 3.2.1 Reporter: Patrick Hunt Assignee: Patrick Hunt Fix For: 3.3.0 Attachments: ZOOKEEPER-601.patch ZK servers currently enforce a min/max boundary on client session timeout relative to the ticktime setting, detailed here: http://hadoop.apache.org/zookeeper/docs/current/zookeeperProgrammers.html#ch_zkSessions In general there are good reasons for this; however, in some cases, in particular with HBase region servers, we have seen a need to allow this bound to be set differently (higher). The Sun JVM can pause for GC for very long times (in some cases we've seen 4 minutes, even with the realtime GC). It would be good to allow this bound to be set via configuration parameters. Note: 4letterword and JMX integration would be needed. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
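To make the bound in ZOOKEEPER-601 concrete, here is a minimal sketch of how a server might clamp a requested session timeout. It assumes the documented defaults of 2x and 20x tickTime; the function name and the two override parameters are hypothetical and are not taken from the attached patch.

```python
def negotiate_session_timeout(requested_ms, tick_time_ms,
                              min_timeout_ms=None, max_timeout_ms=None):
    """Clamp a client's requested session timeout to the server's bounds.

    min_timeout_ms / max_timeout_ms are hypothetical overrides standing in
    for the configuration parameters the issue asks for.
    """
    # Defaults follow the documented bounds: 2x and 20x tickTime.
    low = 2 * tick_time_ms if min_timeout_ms is None else min_timeout_ms
    high = 20 * tick_time_ms if max_timeout_ms is None else max_timeout_ms
    if low > high:
        raise ValueError("minimum session timeout exceeds maximum")
    return max(low, min(requested_ms, high))


# With tickTime=2000 a client asking for 4 minutes is cut down to 40 seconds
# unless the upper bound is raised.
default_bound = negotiate_session_timeout(240000, 2000)
raised_bound = negotiate_session_timeout(240000, 2000, max_timeout_ms=300000)
```

With the default bounds, a region server that pauses for longer than 40 seconds would lose its session; raising the maximum is what lets the 4 minute request through.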
[jira] Updated: (ZOOKEEPER-635) Server supports listening on a specified network address
[ https://issues.apache.org/jira/browse/ZOOKEEPER-635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-635: --- Status: Open (was: Patch Available) Server supports listening on a specified network address Key: ZOOKEEPER-635 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-635 Project: Zookeeper Issue Type: New Feature Components: server Affects Versions: 3.2.1 Reporter: Steve Chu Assignee: Patrick Hunt Fix For: 3.3.0 Attachments: ZOOKEEPER-635.patch, ZOOKEEPER-635.patch, ZOOKEEPER-635.patch The mailing list thread for this issue is located at: http://mail-archives.apache.org/mod_mbox/hadoop-zookeeper-user/200912.mbox/%3c4ac0d28c0912210242g58230a9ds1c55361561c70...@mail.gmail.com%3e I have checked the server-side code, and it seems no such option is provided. This feature is useful when we have more than two network interfaces, one for the Internet and the others for an intranet. We want to run ZooKeeper on our intranet without exposing it to the outside world. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-635) Server supports listening on a specified network address
[ https://issues.apache.org/jira/browse/ZOOKEEPER-635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-635: --- Status: Patch Available (was: Open) Server supports listening on a specified network address Key: ZOOKEEPER-635 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-635 Project: Zookeeper Issue Type: New Feature Components: server Affects Versions: 3.2.1 Reporter: Steve Chu Assignee: Patrick Hunt Fix For: 3.3.0 Attachments: ZOOKEEPER-635.patch, ZOOKEEPER-635.patch, ZOOKEEPER-635.patch The mailing list thread for this issue is located at: http://mail-archives.apache.org/mod_mbox/hadoop-zookeeper-user/200912.mbox/%3c4ac0d28c0912210242g58230a9ds1c55361561c70...@mail.gmail.com%3e I have checked the server-side code, and it seems no such option is provided. This feature is useful when we have more than two network interfaces, one for the Internet and the others for an intranet. We want to run ZooKeeper on our intranet without exposing it to the outside world. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (ZOOKEEPER-640) make build.xml more configurable to ease packaging for linux distros
[ https://issues.apache.org/jira/browse/ZOOKEEPER-640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12835998#action_12835998 ] Thomas Koch commented on ZOOKEEPER-640: --- Thanks for your effort. I'm sorry that I didn't answer earlier. In the end, I did not use the build.xml at all but scripted the build in the debian/rules file. Please cancel this issue.
make build.xml more configurable to ease packaging for linux distros Key: ZOOKEEPER-640 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-640 Project: Zookeeper Issue Type: Improvement Components: build Affects Versions: 3.2.1, 3.2.2 Reporter: Thomas Koch Assignee: Patrick Hunt Fix For: 3.3.0 Attachments: ZOOKEEPER-640.patch Original Estimate: 0.25h Remaining Estimate: 0.25h
Hi, I started packaging Zookeeper for Debian[1][2]. Thereby I had a problem excluding contrib/rest from the build without patching the upstream tarball. Could you please add some properties to your build.xml that allow me to (de)select contribs? In the example below I can easily override the properties:
  <project name="zookeepercontrib">
    <property name="contribfilesetincludes" value="*/build.xml" />
    <property name="contribfilesetexcludes" value="" />
    <fileset id="contribfileset" dir="." includes="${contribfilesetincludes}" excludes="${contribfilesetexcludes}" />
    <target name="compile">
      <subant target="jar">
        <fileset refid="contribfileset" />
      </subant>
    </target>
Could you please also add a line to project.classpath:
  <path id="project.classpath">
    <fileset dir="${additional.lib.dir}" includes="*.jar"/>
For Debian I may not compile based on the jar files in lib but must use the jars already in Debian.
[1] http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=561947
[2] http://git.debian.org/?p=pkg-java/zookeeper.git
Thank you!
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (ZOOKEEPER-640) make build.xml more configurable to ease packaging for linux distros
[ https://issues.apache.org/jira/browse/ZOOKEEPER-640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12836011#action_12836011 ] Patrick Hunt commented on ZOOKEEPER-640: No worries. Seems like a fine idea anyway; committer, please review/commit asap.
make build.xml more configurable to ease packaging for linux distros Key: ZOOKEEPER-640 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-640 Project: Zookeeper Issue Type: Improvement Components: build Affects Versions: 3.2.1, 3.2.2 Reporter: Thomas Koch Assignee: Patrick Hunt Fix For: 3.3.0 Attachments: ZOOKEEPER-640.patch Original Estimate: 0.25h Remaining Estimate: 0.25h
Hi, I started packaging Zookeeper for Debian[1][2]. Thereby I had a problem excluding contrib/rest from the build without patching the upstream tarball. Could you please add some properties to your build.xml that allow me to (de)select contribs? In the example below I can easily override the properties:
  <project name="zookeepercontrib">
    <property name="contribfilesetincludes" value="*/build.xml" />
    <property name="contribfilesetexcludes" value="" />
    <fileset id="contribfileset" dir="." includes="${contribfilesetincludes}" excludes="${contribfilesetexcludes}" />
    <target name="compile">
      <subant target="jar">
        <fileset refid="contribfileset" />
      </subant>
    </target>
Could you please also add a line to project.classpath:
  <path id="project.classpath">
    <fileset dir="${additional.lib.dir}" includes="*.jar"/>
For Debian I may not compile based on the jar files in lib but must use the jars already in Debian.
[1] http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=561947
[2] http://git.debian.org/?p=pkg-java/zookeeper.git
Thank you!
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (ZOOKEEPER-569) Failure of elected leader can lead to never-ending leader election
[ https://issues.apache.org/jira/browse/ZOOKEEPER-569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12836009#action_12836009 ] Henry Robinson commented on ZOOKEEPER-569: -- The most recent patch I submitted is the right patch - it includes Flavio's suggestions.
Failure of elected leader can lead to never-ending leader election -- Key: ZOOKEEPER-569 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-569 Project: Zookeeper Issue Type: Bug Reporter: Henry Robinson Assignee: Henry Robinson Fix For: 3.3.0 Attachments: zookeeper-569.patch, ZOOKEEPER-569.patch, zookeeper-569.patch, zookeeper-569.patch, zookeeper-569.patch, zookeeper-569.patch
It is possible for basic LeaderElection to enter a situation where it never terminates. As an example, consider a three node cluster A, B and C.
1. In the first round, A votes for A, B votes for B and C votes for C
2. Since C > B > A, all nodes resolve to vote for C in the second round as there is no first round winner
3. A, B vote for C, but C fails.
4. C is not elected because neither A nor B hear from it, and so votes for it are discarded
5. A and B never reset their votes, despite not hearing from C, so continue to vote for it ad infinitum.
Step 5 is the bug. If A and B reset their votes to themselves in the case where the heard-from vote set is empty, leader election will continue. I do not know if this affects running ZK clusters, as it is possible that the out-of-band failure detection protocols may cause leader election to be restarted anyhow, but I've certainly seen this in tests. I have a trivial patch which fixes it, but it needs a test (and tests for race conditions are hard to write!)
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
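The scenario in ZOOKEEPER-569 can be sketched with a toy model. This is not the LeaderElection class and not the attached patch; the function name, the reset_on_empty flag and the tie-break by highest id are assumptions made only to illustrate step 5 and the proposed reset.

```python
from collections import Counter


def next_vote(my_id, current_vote, heard_from_votes, reset_on_empty=True):
    """Choose the vote a server casts in the next round (toy model).

    heard_from_votes holds only the votes that survive filtering, i.e. votes
    for candidates that were themselves heard from in this round.
    """
    if not heard_from_votes:
        # Step 5 of the report: with no usable votes the server either keeps
        # its stale vote (the bug) or falls back to itself (the proposed fix).
        return my_id if reset_on_empty else current_vote
    tally = Counter(heard_from_votes)
    # Most votes wins; ties go to the highest id, mirroring "C > B > A".
    return max(tally, key=lambda candidate: (tally[candidate], candidate))


# Round 1: everyone votes for themselves, so the tie resolves to C.
round_two_vote = next_vote("A", "A", ["A", "B", "C"])
# C then fails: votes for C are discarded and the heard-from set is empty.
stuck_vote = next_vote("A", "C", [], reset_on_empty=False)
fixed_vote = next_vote("A", "C", [])
```

In the buggy variant A keeps voting for the dead server C forever; with the reset, A votes for itself again and the election can make progress between A and B.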
[jira] Updated: (ZOOKEEPER-586) c client does not compile under cygwin
[ https://issues.apache.org/jira/browse/ZOOKEEPER-586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-586: --- Attachment: ZOOKEEPER-586.patch Updated patch to latest code - fixed problem with pid variable. c client does not compile under cygwin -- Key: ZOOKEEPER-586 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-586 Project: Zookeeper Issue Type: Bug Components: c client Affects Versions: 3.2.1 Reporter: Patrick Hunt Assignee: Patrick Hunt Fix For: 3.3.0 Attachments: ZOOKEEPER-586.patch, ZOOKEEPER-586.patch, ZOOKEEPER-586.patch the c client fails to compile under cygwin -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-586) c client does not compile under cygwin
[ https://issues.apache.org/jira/browse/ZOOKEEPER-586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-586: --- Status: Patch Available (was: Open) I'm submitting this so that we can get it into the code base. With this latest patch the client compiles under cygwin (1.5) however not all the tests run. I also tested the patch on ubuntu and it worked fine. I think we should commit as it allows the code to compile under cygwin, we can fix the cygwin tests later (also the issue of the new cygwin version, and what we will support going fwd). c client does not compile under cygwin -- Key: ZOOKEEPER-586 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-586 Project: Zookeeper Issue Type: Bug Components: c client Affects Versions: 3.2.1 Reporter: Patrick Hunt Assignee: Patrick Hunt Fix For: 3.3.0 Attachments: ZOOKEEPER-586.patch, ZOOKEEPER-586.patch, ZOOKEEPER-586.patch the c client fails to compile under cygwin -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (ZOOKEEPER-664) BookKeeper API documentation
[ https://issues.apache.org/jira/browse/ZOOKEEPER-664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12836047#action_12836047 ] Hudson commented on ZOOKEEPER-664: -- Integrated in ZooKeeper-trunk #701 (See [http://hudson.zones.apache.org/hudson/job/ZooKeeper-trunk/701/]) BookKeeper API documentation Key: ZOOKEEPER-664 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-664 Project: Zookeeper Issue Type: Improvement Components: contrib-bookkeeper Reporter: Flavio Paiva Junqueira Assignee: Flavio Paiva Junqueira Fix For: 3.3.0 Attachments: ZOOKEEPER-664.patch Review and improve BookKeeper API documentation. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (ZOOKEEPER-674) c client tests fail on cygwin
c client tests fail on cygwin - Key: ZOOKEEPER-674 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-674 Project: Zookeeper Issue Type: Bug Components: c client Affects Versions: 3.2.2 Reporter: Patrick Hunt Fix For: 3.4.0 The c client compiles on cygwin 1.5 after ZOOKEEPER-586 is applied, however not all the tests pass. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (ZOOKEEPER-672) typo nits across documentation
[ https://issues.apache.org/jira/browse/ZOOKEEPER-672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12836045#action_12836045 ] Hudson commented on ZOOKEEPER-672: -- Integrated in ZooKeeper-trunk #701 (See [http://hudson.zones.apache.org/hudson/job/ZooKeeper-trunk/701/]) typo nits across documentation --- Key: ZOOKEEPER-672 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-672 Project: Zookeeper Issue Type: Improvement Components: documentation Affects Versions: 3.2.2 Reporter: Kay Kay Assignee: Kay Kay Fix For: 3.3.0 Attachments: ZOOKEEPER-672.patch some typo nits across the documentation. relevant forrest.xml files fixed. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (ZOOKEEPER-607) improve bookkeeper overview
[ https://issues.apache.org/jira/browse/ZOOKEEPER-607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12836046#action_12836046 ] Hudson commented on ZOOKEEPER-607: -- Integrated in ZooKeeper-trunk #701 (See [http://hudson.zones.apache.org/hudson/job/ZooKeeper-trunk/701/]) improve bookkeeper overview --- Key: ZOOKEEPER-607 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-607 Project: Zookeeper Issue Type: Improvement Components: contrib-bookkeeper Reporter: Benjamin Reed Assignee: Flavio Paiva Junqueira Fix For: 3.3.0 Attachments: bk-overview.jpg, ZOOKEEPER-607.patch, ZOOKEEPER-607.patch, ZOOKEEPER-607.patch, ZOOKEEPER-607.patch, ZOOKEEPER-607.patch fix the overview section in the bookkeeper documentation to introduce the programmer/admin to bookkeeper before giving the details. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
Hudson build is back to normal : ZooKeeper-trunk #701
See http://hudson.zones.apache.org/hudson/job/ZooKeeper-trunk/701/
[jira] Commented: (ZOOKEEPER-665) Add BookKeeper streaming documentation
[ https://issues.apache.org/jira/browse/ZOOKEEPER-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12836050#action_12836050 ] Hudson commented on ZOOKEEPER-665: -- Integrated in ZooKeeper-trunk #701 (See [http://hudson.zones.apache.org/hudson/job/ZooKeeper-trunk/701/]) Add BookKeeper streaming documentation --- Key: ZOOKEEPER-665 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-665 Project: Zookeeper Issue Type: Improvement Components: contrib-bookkeeper Reporter: Flavio Paiva Junqueira Assignee: Flavio Paiva Junqueira Fix For: 3.3.0 Attachments: ZOOKEEPER-665.patch, ZOOKEEPER-665.patch -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (ZOOKEEPER-673) Fix observer documentation regarding leader election
[ https://issues.apache.org/jira/browse/ZOOKEEPER-673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12836048#action_12836048 ] Hudson commented on ZOOKEEPER-673: -- Integrated in ZooKeeper-trunk #701 (See [http://hudson.zones.apache.org/hudson/job/ZooKeeper-trunk/701/]) Fix observer documentation regarding leader election Key: ZOOKEEPER-673 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-673 Project: Zookeeper Issue Type: Bug Reporter: Flavio Paiva Junqueira Assignee: Flavio Paiva Junqueira Fix For: 3.3.0 Attachments: ZOOKEEPER-673.patch We just need to remove the first two paragraphs of Section 2. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (ZOOKEEPER-668) Close method in LedgerInputStream doesn't do anything
[ https://issues.apache.org/jira/browse/ZOOKEEPER-668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12836049#action_12836049 ] Hudson commented on ZOOKEEPER-668: -- Integrated in ZooKeeper-trunk #701 (See [http://hudson.zones.apache.org/hudson/job/ZooKeeper-trunk/701/]) Close method in LedgerInputStream doesn't do anything - Key: ZOOKEEPER-668 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-668 Project: Zookeeper Issue Type: Bug Reporter: Flavio Paiva Junqueira Assignee: Flavio Paiva Junqueira Fix For: 3.3.0 Attachments: ZOOKEEPER-668.patch, ZOOKEEPER-668.patch I think we should remove the close call in LedgerInputStream. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-635) Server supports listening on a specified network address
[ https://issues.apache.org/jira/browse/ZOOKEEPER-635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Reed updated ZOOKEEPER-635: Hadoop Flags: [Reviewed] Server supports listening on a specified network address Key: ZOOKEEPER-635 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-635 Project: Zookeeper Issue Type: New Feature Components: server Affects Versions: 3.2.1 Reporter: Steve Chu Assignee: Patrick Hunt Fix For: 3.3.0 Attachments: ZOOKEEPER-635.patch, ZOOKEEPER-635.patch, ZOOKEEPER-635.patch The mailing list thread for this issue is located at: http://mail-archives.apache.org/mod_mbox/hadoop-zookeeper-user/200912.mbox/%3c4ac0d28c0912210242g58230a9ds1c55361561c70...@mail.gmail.com%3e I have checked the server-side code, and it seems no such option is provided. This feature is useful when we have more than two network interfaces, one for the Internet and the others for an intranet. We want to run ZooKeeper on our intranet without exposing it to the outside world. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
Re: Ruby client binding for ZooKeeper now available.
Are there plans to integrate this work with the Apache project? https://issues.apache.org/jira/browse/ZOOKEEPER-661. On Fri, Feb 19, 2010 at 10:23 AM, Patrick Hunt ph...@apache.org wrote: Sorry if this is a dup for those of you following me on twitter ( http://twitter.com/phunt) but I wanted to let you know that twitter (the company) has contributed a Ruby client binding for ZooKeeper. You can learn more about the gem here: http://bit.ly/b9VB6k Regards, Patrick
[jira] Commented: (ZOOKEEPER-661) Add Ruby bindings
[ https://issues.apache.org/jira/browse/ZOOKEEPER-661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12836061#action_12836061 ] Jeff Hammerbacher commented on ZOOKEEPER-661: - How does this work compare to http://github.com/emaland/zookeeper_client/? Add Ruby bindings - Key: ZOOKEEPER-661 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-661 Project: Zookeeper Issue Type: New Feature Components: contrib-bindings Environment: MRI Ruby 1.9 JRuby 1.4 Reporter: Andrew Reynhout Priority: Minor Add Ruby bindings to the ZooKeeper distribution. Ruby presents special threading difficulties for asynchronous ZK calls (aget, watchers, etc). It looks like the simplest workaround is to patch the ZK C API. Proposed approach will be described in comment. Please use this ticket for discussion and suggestions. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (ZOOKEEPER-603) zkpython should do a better job of freeing memory under error conditions
[ https://issues.apache.org/jira/browse/ZOOKEEPER-603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt reassigned ZOOKEEPER-603: -- Assignee: Henry Robinson Assuming Henry will pick this one up. zkpython should do a better job of freeing memory under error conditions Key: ZOOKEEPER-603 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-603 Project: Zookeeper Issue Type: Bug Components: contrib-bindings Affects Versions: 3.2.1 Reporter: Henry Robinson Assignee: Henry Robinson Fix For: 3.3.0 The general pattern is that the construction of a collection might fail, but the module is not freeing the memory that it has already allocated. Exceptions that are raised during this process aren't always propagated back to the Python side either. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-485) need ops documentation that details supervision of ZK server processes
[ https://issues.apache.org/jira/browse/ZOOKEEPER-485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-485: --- Assignee: Patrick Hunt Status: Patch Available (was: Open) need ops documentation that details supervision of ZK server processes -- Key: ZOOKEEPER-485 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-485 Project: Zookeeper Issue Type: Bug Components: documentation, server Reporter: Patrick Hunt Assignee: Patrick Hunt Fix For: 3.3.0 Attachments: ZOOKEEPER-485.patch We need ops documentation detailing what to do if the ZK server VM fails - by fail I mean the JVM process exits/dies/crashes/etc... In general a supervisor process should be used to start/stop/restart/etc... the ZK server VM. Something like daemontools http://cr.yp.to/daemontools.html could be used, or more simply a wrapper script should monitor the status of the pid and restart if the JVM fails. It's up to the operator: if this is not done automatically then it will have to be done manually, by the operator restarting the ZK server JVM. The inherent behavior of ZK with respect to failures - i.e. that it automatically recovers as long as quorum is maintained - fits into this nicely. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
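As a rough illustration of the "wrapper script" option mentioned in ZOOKEEPER-485, the sketch below restarts a child process whenever it exits. It is only an assumption of what such a wrapper could look like: the function name, the max_restarts cap and the delay are hypothetical, and a real deployment would more likely use daemontools or a similar supervisor.

```python
import subprocess
import sys
import time


def supervise(cmd, max_restarts=None, delay_s=1.0):
    """Run cmd and relaunch it each time it exits; return the launch count.

    max_restarts=None supervises forever; a number caps the restarts so the
    loop can be exercised in a test.
    """
    launches = 0
    while True:
        launches += 1
        exit_code = subprocess.call(cmd)  # blocks until the child exits
        print("child exited with code %d" % exit_code, file=sys.stderr)
        if max_restarts is not None and launches > max_restarts:
            return launches
        time.sleep(delay_s)  # brief pause so a crash loop does not spin


# Stand-in for the server JVM: a child that exits immediately with an error.
failing_child = [sys.executable, "-c", "raise SystemExit(3)"]
launch_count = supervise(failing_child, max_restarts=2, delay_s=0)
```

In practice cmd would be whatever command runs the ZK server JVM in the foreground, with max_restarts left at None.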
[jira] Commented: (ZOOKEEPER-662) Too many CLOSE_WAIT socket state on a server
[ https://issues.apache.org/jira/browse/ZOOKEEPER-662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12836065#action_12836065 ] Patrick Hunt commented on ZOOKEEPER-662: Ah, looking back at the code I see we do have a linger timeout of 2 seconds on those sockets... however it shouldn't result in so many waiting sockets (unless you are trying a large number of connections per second, which doesn't seem to be the case here). Have you seen this happen again? Or just that one time?
Too many CLOSE_WAIT socket state on a server Key: ZOOKEEPER-662 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-662 Project: Zookeeper Issue Type: Bug Components: quorum Affects Versions: 3.2.1 Environment: Linux 2.6.9 Reporter: Qian Ye Fix For: 3.3.0 Attachments: zookeeper.log.2010020105, zookeeper.log.2010020106
I have a zookeeper cluster with 5 servers, zookeeper version 3.2.1. Here is the content of the configuration file, zoo.cfg:
==
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=5
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=2
# the directory where the snapshot is stored.
dataDir=./data/
# the port at which the clients will connect
clientPort=8181
# zookeeper cluster list
server.100=10.23.253.43:8887:
server.101=10.23.150.29:8887:
server.102=10.23.247.141:8887:
server.200=10.65.20.68:8887:
server.201=10.65.27.21:8887:
=
Before the problem happened, server.200 was the leader. Yesterday morning, I found that there were many sockets in the CLOSE_WAIT state on the clientPort (8181); the total was over about 120. Because of these CLOSE_WAIT sockets, server.200 could not accept more connections from the clients. The only thing I could do in this situation was restart server.200, at about 2010-02-01 06:06:35. The related log is attached to the issue.
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
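For readers unfamiliar with the units in the zoo.cfg quoted in ZOOKEEPER-662, initLimit and syncLimit are counted in ticks, so the wall-clock limits follow from tickTime. The sketch below is a hypothetical helper (not part of ZooKeeper) that parses the key=value lines and does that arithmetic; the server.N entries are left out because their ports are truncated in the report.

```python
def parse_zoo_cfg(text):
    """Parse key=value lines, skipping blank lines and '#' comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


# Values copied from the configuration in the report above.
CFG = """\
# The number of milliseconds of each tick
tickTime=2000
initLimit=5
syncLimit=2
dataDir=./data/
clientPort=8181
"""

cfg = parse_zoo_cfg(CFG)
tick_ms = int(cfg["tickTime"])
init_limit_ms = int(cfg["initLimit"]) * tick_ms  # initial sync phase budget
sync_limit_ms = int(cfg["syncLimit"]) * tick_ms  # request/ack budget
```

So this cluster allows 10 seconds for the initial synchronization phase and 4 seconds between a request and its acknowledgement.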
[jira] Commented: (ZOOKEEPER-579) zkpython needs more test coverage for ACL code paths
[ https://issues.apache.org/jira/browse/ZOOKEEPER-579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12836070#action_12836070 ] Hadoop QA commented on ZOOKEEPER-579: - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12434938/zookeeper-579.patch against trunk revision 911716. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 2 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h7.grid.sp2.yahoo.net/66/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h7.grid.sp2.yahoo.net/66/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h7.grid.sp2.yahoo.net/66/console This message is automatically generated. zkpython needs more test coverage for ACL code paths Key: ZOOKEEPER-579 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-579 Project: Zookeeper Issue Type: Improvement Components: contrib-bindings Affects Versions: 3.2.1 Reporter: Henry Robinson Assignee: Henry Robinson Fix For: 3.3.0 Attachments: zookeeper-579.patch zkpython's tests don't do a good enough job of exercising the ACL code paths. A few new tests that confirm that setACL and friends are working correctly are needed. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (ZOOKEEPER-543) Tests for ZooKeeper examples
[ https://issues.apache.org/jira/browse/ZOOKEEPER-543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12836076#action_12836076 ] Hadoop QA commented on ZOOKEEPER-543: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12431894/ZOOKEEPER-543.patch against trunk revision 912052. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 8 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/118/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/118/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/118/console This message is automatically generated. Tests for ZooKeeper examples Key: ZOOKEEPER-543 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-543 Project: Zookeeper Issue Type: New Feature Components: tests Affects Versions: 3.3.0 Reporter: Steven Cheng Assignee: Steven Cheng Priority: Minor Fix For: 3.3.0 Attachments: ZOOKEEPER-543.patch, ZOOKEEPER-543.patch, ZOOKEEPER-543.patch Initial attempt to create ZooKeeper tests based on the example code on the website. Current plan is to test features used in examples using ZooKeeper calls directly. Another approach would be to make more usable abstractions such as those in src/recipes and test those. 
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-524) DBSizeTest is not really testing anything
[ https://issues.apache.org/jira/browse/ZOOKEEPER-524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Reed updated ZOOKEEPER-524: Resolution: Fixed Status: Resolved (was: Patch Available) Committed revision 912052. DBSizeTest is not really testing anything - Key: ZOOKEEPER-524 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-524 Project: Zookeeper Issue Type: Bug Components: server, tests Reporter: Patrick Hunt Assignee: Benjamin Reed Priority: Minor Fix For: 3.3.0 DBSizeTest looks like it should be testing latency, but it doesn't seem to do it (assert is commented out). We need to decide if this test should be fixed, or just dropped. Also note: this test takes 40 seconds on my system. Way too long. Perhaps async create operations should be used to populate the database. I also noticed that data size has a big impact on overall test time (1k vs 5 bytes is something like a 2x time diff for time to run the test). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
Re: Ruby client binding for ZooKeeper now available.
I'd be happy to contribute it. This is a fork of someone else's client, which I forward-ported and added much functionality to. It is slightly incomplete, I will be finishing it soon. The current client is MIT licensed. Is the ASF license compatible? If not I can ask the original author to relicense his code, or rewrite the client. Eric On Feb 19, 2010, at 4:27 PM, Jeff Hammerbacher wrote: Are there plans to integrate this work with the Apache project? https://issues.apache.org/jira/browse/ZOOKEEPER-661. On Fri, Feb 19, 2010 at 10:23 AM, Patrick Hunt ph...@apache.org wrote: Sorry if this is a dup for those of you following me on twitter ( http://twitter.com/phunt) but I wanted to let you know that twitter (the company) has contributed a Ruby client binding for ZooKeeper. You can learn more about the gem here: http://bit.ly/b9VB6k Regards, Patrick
[jira] Commented: (ZOOKEEPER-569) Failure of elected leader can lead to never-ending leader election
[ https://issues.apache.org/jira/browse/ZOOKEEPER-569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12836082#action_12836082 ] Hadoop QA commented on ZOOKEEPER-569: - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12435629/zookeeper-569.patch against trunk revision 912052. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 5 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h7.grid.sp2.yahoo.net/68/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h7.grid.sp2.yahoo.net/68/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h7.grid.sp2.yahoo.net/68/console This message is automatically generated. Failure of elected leader can lead to never-ending leader election -- Key: ZOOKEEPER-569 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-569 Project: Zookeeper Issue Type: Bug Reporter: Henry Robinson Assignee: Henry Robinson Fix For: 3.3.0 Attachments: zookeeper-569.patch, ZOOKEEPER-569.patch, zookeeper-569.patch, zookeeper-569.patch, zookeeper-569.patch, zookeeper-569.patch It is possible for basic LeaderElection to enter a situation where it never terminates. As an example, consider a three node cluster A, B and C. 1. In the first round, A votes for A, B votes for B and C votes for C 2. 
Since C > B > A, all nodes resolve to vote for C in the second round, as there is no first-round winner. 3. A and B vote for C, but C fails. 4. C is not elected because neither A nor B hears from it, so votes for it are discarded. 5. A and B never reset their votes, despite not hearing from C, so they continue to vote for it ad infinitum. Step 5 is the bug. If A and B reset their votes to themselves when the heard-from vote set is empty, leader election will continue. I do not know if this affects running ZK clusters, as it is possible that the out-of-band failure detection protocols may cause leader election to be restarted anyhow, but I've certainly seen this in tests. I have a trivial patch which fixes it, but it needs a test (and tests for race conditions are hard to write!) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
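The scenario above can be replayed in a few lines. What follows is only a toy model under stated assumptions, not the LeaderElection code or the attached patch: nodes A, B and C are represented by the integers 1, 2 and 3, and the function name simulate, its parameters and the reset_on_empty flag are all made up for illustration.

```python
from collections import Counter


def simulate(votes, live, ensemble_size, reset_on_empty, max_rounds=5):
    """Replay simplified election rounds; return the elected node or None.

    votes: voter id -> id it currently votes for (hypothetical model)
    live: set of node ids that still respond
    reset_on_empty: apply the proposed fix (vote for self when no valid vote is left)
    """
    # Only nodes that are still alive keep voting.
    votes = {voter: v for voter, v in votes.items() if voter in live}
    for _ in range(max_rounds):
        # Votes for a node nobody hears from are discarded (step 4).
        tally = Counter(v for v in votes.values() if v in live)
        if not tally:
            if reset_on_empty:
                # Proposed fix: fall back to voting for oneself.
                votes = {voter: voter for voter in votes}
            # Without the fix the stale votes are kept unchanged (step 5).
            continue
        leader, count = max(tally.items(), key=lambda kv: (kv[1], kv[0]))
        if count > ensemble_size // 2:
            return leader
        # No winner yet: everyone moves to the highest-ranked live candidate.
        best = max(tally)
        votes = {voter: best for voter in votes}
    return None


# A=1, B=2, C=3; everyone has resolved to vote for C, then C fails.
stuck = simulate({1: 3, 2: 3, 3: 3}, live={1, 2}, ensemble_size=3, reset_on_empty=False)
fixed = simulate({1: 3, 2: 3, 3: 3}, live={1, 2}, ensemble_size=3, reset_on_empty=True)
```

In this model the run without the reset never leaves the empty-tally state and gives up after max_rounds, while the run with the reset converges on B, the highest-ranked surviving node.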
[jira] Commented: (ZOOKEEPER-635) Server supports listening on a specified network address
[ https://issues.apache.org/jira/browse/ZOOKEEPER-635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12836084#action_12836084 ] Hadoop QA commented on ZOOKEEPER-635: - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12435665/ZOOKEEPER-635.patch against trunk revision 912052. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 44 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/119/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/119/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/119/console This message is automatically generated. Server supports listening on a specified network address Key: ZOOKEEPER-635 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-635 Project: Zookeeper Issue Type: New Feature Components: server Affects Versions: 3.2.1 Reporter: Steve Chu Assignee: Patrick Hunt Fix For: 3.3.0 Attachments: ZOOKEEPER-635.patch, ZOOKEEPER-635.patch, ZOOKEEPER-635.patch The mailing list thread for this issue is at: http://mail-archives.apache.org/mod_mbox/hadoop-zookeeper-user/200912.mbox/%3c4ac0d28c0912210242g58230a9ds1c55361561c70...@mail.gmail.com%3e I have checked the server-side code, and it seems this option is not provided. 
This feature is useful when a host has several network interfaces, one for the Internet and the others for the intranet. We want to run ZooKeeper on our intranet without exposing it to the outside world.
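For anyone unfamiliar with what "listening on a specified network address" means at the socket level, here is a minimal sketch. It is not ZooKeeper code and says nothing about how the patch implements the option; the function name open_listener and its parameters are made up, and 127.0.0.1 merely stands in for an intranet-facing address.

```python
import socket


def open_listener(host, port=0):
    """Return a TCP listening socket bound to a single address.

    Binding to one interface address (rather than the wildcard 0.0.0.0)
    means connections arriving on other interfaces are not accepted.
    A port of 0 asks the OS to pick a free port.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen(5)
    return sock


# Illustrative use: listen on the loopback address only.
listener = open_listener("127.0.0.1")
bound_host, bound_port = listener.getsockname()
listener.close()
```

Binding to the wildcard address instead would accept connections on every interface, which is the exposure the reporter wants to avoid.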
[jira] Commented: (ZOOKEEPER-586) c client does not compile under cygwin
[ https://issues.apache.org/jira/browse/ZOOKEEPER-586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12836094#action_12836094 ] Hadoop QA commented on ZOOKEEPER-586: - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12436397/ZOOKEEPER-586.patch against trunk revision 912052. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 26 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h7.grid.sp2.yahoo.net/73/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h7.grid.sp2.yahoo.net/73/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h7.grid.sp2.yahoo.net/73/console This message is automatically generated. c client does not compile under cygwin -- Key: ZOOKEEPER-586 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-586 Project: Zookeeper Issue Type: Bug Components: c client Affects Versions: 3.2.1 Reporter: Patrick Hunt Assignee: Patrick Hunt Fix For: 3.3.0 Attachments: ZOOKEEPER-586.patch, ZOOKEEPER-586.patch, ZOOKEEPER-586.patch The C client fails to compile under Cygwin.
[jira] Commented: (ZOOKEEPER-669) watchedevent tostring should clearly output the state/type/path
[ https://issues.apache.org/jira/browse/ZOOKEEPER-669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12836095#action_12836095 ] Hadoop QA commented on ZOOKEEPER-669: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12435751/ZOOKEEPER-669.patch against trunk revision 912052. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no tests are needed for this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/121/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/121/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/121/console This message is automatically generated. watchedevent tostring should clearly output the state/type/path --- Key: ZOOKEEPER-669 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-669 Project: Zookeeper Issue Type: Bug Affects Versions: 3.1.2, 3.2.2 Reporter: Patrick Hunt Assignee: Patrick Hunt Priority: Critical Fix For: 3.3.0 Attachments: ZOOKEEPER-669.patch The current toString method is broken.
[jira] Commented: (ZOOKEEPER-485) need ops documentation that details supervision of ZK server processes
[ https://issues.apache.org/jira/browse/ZOOKEEPER-485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12836099#action_12836099 ] Hadoop QA commented on ZOOKEEPER-485: - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12436408/ZOOKEEPER-485.patch against trunk revision 912052. +1 @author. The patch does not contain any @author tags. +0 tests included. The patch appears to be a documentation patch that doesn't require tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h7.grid.sp2.yahoo.net/74/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h7.grid.sp2.yahoo.net/74/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h7.grid.sp2.yahoo.net/74/console This message is automatically generated. need ops documentation that details supervision of ZK server processes -- Key: ZOOKEEPER-485 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-485 Project: Zookeeper Issue Type: Bug Components: documentation, server Reporter: Patrick Hunt Assignee: Patrick Hunt Fix For: 3.3.0 Attachments: ZOOKEEPER-485.patch We need ops documentation detailing what to do if the ZK server VM fails - by "fail" I mean the JVM process exits/dies/crashes/etc. In general, a supervisor process should be used to start/stop/restart/etc. the ZK server JVM. 
Something like daemontools http://cr.yp.to/daemontools.html could be used, or more simply a wrapper script could monitor the status of the pid and restart the JVM if it fails. It's up to the operator: if this is not done automatically, then it will have to be done manually, by an operator restarting the ZK server JVM. The inherent behavior of ZK with respect to failures (i.e., that it automatically recovers as long as quorum is maintained) fits into this nicely.
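To make the wrapper-script idea concrete, here is a rough sketch of a restart loop. It is an illustration only, not something shipped with ZooKeeper or taken from the attached documentation patch: the function name supervise, its parameters and the restart policy (restart only on a non-zero exit, give up after max_restarts) are all assumptions, and cmd is a placeholder for whatever command launches the server JVM in the foreground.

```python
import subprocess
import time


def supervise(cmd, max_restarts=3, delay=1.0):
    """Run cmd and restart it each time it exits with a non-zero status.

    Returns (restarts, last_exit_code). Stops when the process exits
    cleanly or when max_restarts restarts have been used up.
    """
    restarts = 0
    while True:
        # Blocks until the child process exits, then yields its exit code.
        exit_code = subprocess.call(cmd)
        if exit_code == 0 or restarts >= max_restarts:
            return restarts, exit_code
        restarts += 1
        # Brief pause so a crash-looping process does not spin the CPU.
        time.sleep(delay)
```

A production supervisor such as daemontools does considerably more (logging, signal handling, running as a service); the loop above only shows the monitor-and-restart core that the issue asks the ops documentation to describe.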