ZooKeeper ensemble configuration generator
This is currently more of a developer tool, but I thought it might be useful for users as well: a basic ZooKeeper ensemble configuration generator that takes some of the drudge work out of generating configs. I got tired of creating these by hand for the various setups I have (especially when experimenting), so I decided to build on an existing templating system (Python/Cheetah). It's up on GitHub; feel free to check it out and send forks/patches/comments/etc.: http://github.com/phunt/zkconf/tree/master Patrick
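The generator's approach can be sketched with a minimal stand-in (zkconf itself uses Cheetah templates; the template text, host names, and port choices below are illustrative assumptions, not zkconf's actual output):

```python
# Minimal sketch of a template-driven ZooKeeper ensemble config generator.
# zkconf uses Cheetah; string.Template is used here only as a stand-in.
from string import Template

BASE = Template("""tickTime=2000
initLimit=10
syncLimit=5
dataDir=$datadir
clientPort=$client_port
""")

def generate_zoo_cfg(num_servers, datadir="/var/zookeeper", client_port=2181):
    """Render one zoo.cfg body listing every server in the ensemble."""
    cfg = BASE.substitute(datadir=datadir, client_port=client_port)
    # Each participant needs a server.N line with its quorum/election ports.
    for i in range(1, num_servers + 1):
        cfg += "server.%d=host%d:2888:3888\n" % (i, i)
    return cfg

print(generate_zoo_cfg(3))
```

The point of templating the config is that the per-server boilerplate (ports, dataDir, the server.N list) is generated once from a single ensemble description instead of being hand-edited N times.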
Re: BUILDS ARE BACK NORMAL
Hi all, As Giri mentioned, the builds are back to normal and so is the patch process: http://hudson.zones.apache.org/hudson/view/ZooKeeper/job/Zookeeper-Patch-vesta.apache.org/ The patches are being run against hudson, so you DO NOT need to cancel and resubmit patches. Thanks, Mahadev
[jira] Updated: (ZOOKEEPER-483) ZK fataled on me, and ugly
[ https://issues.apache.org/jira/browse/ZOOKEEPER-483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Reed updated ZOOKEEPER-483: Status: Patch Available (was: Open) > ZK fataled on me, and ugly > -- > > Key: ZOOKEEPER-483 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-483 > Project: Zookeeper > Issue Type: Bug >Affects Versions: 3.1.1 >Reporter: ryan rawson >Assignee: Benjamin Reed > Fix For: 3.2.1, 3.3.0 > > Attachments: zklogs.tar.gz, ZOOKEEPER-483.patch, ZOOKEEPER-483.patch > > > here are the part of the log whereby my zookeeper instance crashed, taking 3 > out of 5 down, and thus ruining the quorum for all clients: > 2009-07-23 12:29:06,769 WARN org.apache.zookeeper.server.NIOServerCnxn: > Exception causing close of session 0x52276d1d5161350 due to > java.io.IOException: Read error > 2009-07-23 12:29:00,756 WARN org.apache.zookeeper.server.quorum.Follower: > Exception when following the leader > java.io.EOFException > at java.io.DataInputStream.readInt(DataInputStream.java:375) > at > org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63) > at > org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:65) > at > org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108) > at > org.apache.zookeeper.server.quorum.Follower.readPacket(Follower.java:114) > at > org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:243) > at > org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:494) > 2009-07-23 12:29:06,770 INFO org.apache.zookeeper.server.NIOServerCnxn: > closing session:0x52276d1d5161350 NIOServerCnxn: > java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 > remote=/10.20.20.168:39489] > 2009-07-23 12:29:06,770 INFO org.apache.zookeeper.server.NIOServerCnxn: > closing session:0x12276d15dfb0578 NIOServerCnxn: > java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 > remote=/10.20.20.159:46797] > 2009-07-23 12:29:06,771 
INFO org.apache.zookeeper.server.NIOServerCnxn: > closing session:0x42276d1d3fa013e NIOServerCnxn: > java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 > remote=/10.20.20.153:33998] > 2009-07-23 12:29:06,771 WARN org.apache.zookeeper.server.NIOServerCnxn: > Exception causing close of session 0x52276d1d5160593 due to > java.io.IOException: Read error > 2009-07-23 12:29:06,808 INFO org.apache.zookeeper.server.NIOServerCnxn: > closing session:0x32276d15d2e02bb NIOServerCnxn: > java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 > remote=/10.20.20.158:53758] > 2009-07-23 12:29:06,809 INFO org.apache.zookeeper.server.NIOServerCnxn: > closing session:0x42276d1d3fa13e4 NIOServerCnxn: > java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 > remote=/10.20.20.154:58681] > 2009-07-23 12:29:06,809 INFO org.apache.zookeeper.server.NIOServerCnxn: > closing session:0x22276d15e691382 NIOServerCnxn: > java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 > remote=/10.20.20.162:59967] > 2009-07-23 12:29:06,809 INFO org.apache.zookeeper.server.NIOServerCnxn: > closing session:0x12276d15dfb1354 NIOServerCnxn: > java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 > remote=/10.20.20.163:49957] > 2009-07-23 12:29:06,809 INFO org.apache.zookeeper.server.NIOServerCnxn: > closing session:0x42276d1d3fa13cd NIOServerCnxn: > java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 > remote=/10.20.20.150:34212] > 2009-07-23 12:29:06,809 INFO org.apache.zookeeper.server.NIOServerCnxn: > closing session:0x22276d15e691383 NIOServerCnxn: > java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 > remote=/10.20.20.159:46813] > 2009-07-23 12:29:06,809 INFO org.apache.zookeeper.server.NIOServerCnxn: > closing session:0x12276d15dfb0350 NIOServerCnxn: > java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 > remote=/10.20.20.162:59956] > 2009-07-23 12:29:06,809 INFO 
org.apache.zookeeper.server.NIOServerCnxn: > closing session:0x32276d15d2e139b NIOServerCnxn: > java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 > remote=/10.20.20.156:55138] > 2009-07-23 12:29:06,809 INFO org.apache.zookeeper.server.NIOServerCnxn: > closing session:0x32276d15d2e1398 NIOServerCnxn: > java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 > remote=/10.20.20.167:41257] > 2009-07-23 12:29:06,810 INFO org.apache.zookeeper.server.NIOServerCnxn: > closing session:0x52276d1d5161355 NIOServerCnxn: > java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 > remote=/10.20.20.153:34032] > 2009-07-23 12:29:06,810 INFO org.apache.zookeeper.server.NIOServerCnxn: > closing session:0x52276d1d516
[jira] Commented: (ZOOKEEPER-483) ZK fataled on me, and ugly
[ https://issues.apache.org/jira/browse/ZOOKEEPER-483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12739898#action_12739898 ] Benjamin Reed commented on ZOOKEEPER-483: I've addressed 1) in the attached patch. For 2), we are not eating the IOException; we are actually shutting things down. The bug is actually that we are passing it up to the upper layer, which does not know anything about the follower thread. We need to handle it here.
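Ben's point (handle the stream error in the layer that owns the follower thread, rather than propagating it to code that knows nothing about that thread) can be sketched generically; the classes and names below are an illustrative Python pattern, not ZooKeeper's actual code:

```python
# Illustrative sketch: a follower loop that handles its own I/O failures
# instead of letting them propagate to a caller that can't clean up the
# follower's resources. Names here are hypothetical, not ZooKeeper's.
class Follower:
    def __init__(self, conn):
        self.conn = conn
        self.state = "FOLLOWING"

    def follow_leader(self):
        try:
            while True:
                packet = self.conn.read_packet()  # may raise EOFError/OSError
                self.process(packet)
        except (EOFError, OSError):
            # The connection to the leader died. Handle it HERE: shut the
            # follower down and fall back to leader election, rather than
            # re-raising into a layer that doesn't know about this thread.
            self.shutdown()
            self.state = "LOOKING"

    def process(self, packet):
        pass

    def shutdown(self):
        self.conn = None

class DeadConn:
    """Stand-in for a leader socket that has gone away."""
    def read_packet(self):
        raise EOFError("leader closed the connection")

f = Follower(DeadConn())
f.follow_leader()   # returns cleanly instead of crashing the peer
print(f.state)      # → LOOKING
```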
[jira] Updated: (ZOOKEEPER-483) ZK fataled on me, and ugly
[ https://issues.apache.org/jira/browse/ZOOKEEPER-483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Reed updated ZOOKEEPER-483: Attachment: ZOOKEEPER-483.patch
BUILDS ARE BACK NORMAL
Restarted all the build jobs on hudson; builds are running fine. The build failures were due to "/tmp: File system full, swap space limit exceeded". Thanks, -Giri
build failures on hudson zones
Builds on hudson.zones are failing as the zonestorage for hudson is full. I've sent an email to the ASF infra team about the space issues on hudson zones. Once the issue is resolved I will restart hudson for builds. Thanks, Giri
[jira] Updated: (ZOOKEEPER-498) Unending Leader Elections : WAN configuration
[ https://issues.apache.org/jira/browse/ZOOKEEPER-498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Flavio Paiva Junqueira updated ZOOKEEPER-498: - Status: Patch Available (was: Open) > Unending Leader Elections : WAN configuration > - > > Key: ZOOKEEPER-498 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-498 > Project: Zookeeper > Issue Type: Bug > Components: leaderElection >Affects Versions: 3.2.0 > Environment: Each machine: > CentOS 5.2 64-bit > 2GB ram > java version "1.6.0_13" > Java(TM) SE Runtime Environment (build 1.6.0_13-b03) > Java HotSpot(TM) 64-Bit Server VM (build 11.3-b02, mixed > Network Topology: > DC : central data center > POD(N): remote data center > Zookeeper Topology: > Leaders may be elected only in DC (weight = 1) > Only followers are elected in PODS (weight = 0) >Reporter: Todd Greenwood-Geer >Assignee: Flavio Paiva Junqueira >Priority: Critical > Fix For: 3.2.1, 3.3.0 > > Attachments: dc-zook-logs-01.tar.gz, pod-zook-logs-01.tar.gz, > zk498-test.tar.gz, zoo.cfg, ZOOKEEPER-498.patch > > > In a WAN configuration, ZooKeeper is endlessly electing, terminating, and > re-electing a ZooKeeper leader. The WAN configuration involves two groups, a > central DC group of ZK servers that have a voting weight = 1, and a group of > servers in remote pods with a voting weight of 0. > What we expect to see is leaders elected only in the DC, and the pods to > contain only followers. What we are seeing is a continuous cycling of > leaders. We have seen this consistently with 3.2.0, 3.2.0 + recommended > patches (473, 479, 481, 491), and now release 3.2.1. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-498) Unending Leader Elections : WAN configuration
[ https://issues.apache.org/jira/browse/ZOOKEEPER-498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Flavio Paiva Junqueira updated ZOOKEEPER-498: Attachment: ZOOKEEPER-498.patch I have generated a patch for this issue. I verified that I didn't do the correct checks in ZOOKEEPER-491, so I try to fix that in this patch. I have also modified the test to fix the problem with the failed assertion, and I have inspected the logs to see if it is behaving as expected. I can see no problem at this time with this patch. If someone else is interested in checking it out, please do.
[jira] Commented: (ZOOKEEPER-498) Unending Leader Elections : WAN configuration
[ https://issues.apache.org/jira/browse/ZOOKEEPER-498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12739891#action_12739891 ] Flavio Paiva Junqueira commented on ZOOKEEPER-498: Pat, we have a description of how to configure this in the "Cluster options" section of the Administrator guide. We are missing an example, which is in the source code as you point out.
[jira] Updated: (ZOOKEEPER-490) the java docs for session creation are misleading/incomplete
[ https://issues.apache.org/jira/browse/ZOOKEEPER-490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-490: --- Status: Patch Available (was: Open) > the java docs for session creation are misleading/incomplete > > > Key: ZOOKEEPER-490 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-490 > Project: Zookeeper > Issue Type: Bug >Affects Versions: 3.2.0, 3.1.1 >Reporter: Patrick Hunt >Assignee: Patrick Hunt > Fix For: 3.2.1, 3.3.0 > > Attachments: ZOOKEEPER-490.patch > > > the javadoc for ZooKeeper constructor says: > * The client object will pick an arbitrary server and try to connect to > it. > * If failed, it will try the next one in the list, until a connection is > * established, or all the servers have been tried. > the "or all server tried" phrase is misleading, it should indicate that we > retry until success, con closed, or session expired. > we also need ot mention that connection is async, that constructor returns > immed and you need to look for connection event in watcher -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-490) the java docs for session creation are misleading/incomplete
[ https://issues.apache.org/jira/browse/ZOOKEEPER-490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-490: Attachment: ZOOKEEPER-490.patch This patch updates the javadoc for ZooKeeper construction: it discusses the async nature of connection establishment and thread safety.
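The javadoc point (the constructor returns immediately, the connection is established asynchronously, and the caller must watch for the connection event) can be illustrated with a generic sketch; the `Client` class and event name below are hypothetical, not the real ZooKeeper client API:

```python
# Illustrative pattern: an async-connecting client whose constructor returns
# immediately; callers must wait for the "connected" event from the watcher.
# The Client class here is hypothetical, not the real ZooKeeper API.
import threading

class Client:
    def __init__(self, hosts, watcher):
        self.hosts = hosts
        self.watcher = watcher
        # Connect in the background; the constructor does not block.
        threading.Thread(target=self._connect, daemon=True).start()

    def _connect(self):
        # (real code would try each host in turn until one accepts)
        self.watcher("SyncConnected")

connected = threading.Event()

def watcher(event):
    if event == "SyncConnected":
        connected.set()

c = Client(["host1:2181", "host2:2181"], watcher)
# Wrong: using c immediately may race the background connection.
# Right: block until the watcher reports the session is established.
assert connected.wait(timeout=5)
```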
[jira] Created: (ZOOKEEPER-500) Async methods shouldnt throw exceptions
Async methods shouldnt throw exceptions --- Key: ZOOKEEPER-500 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-500 Project: Zookeeper Issue Type: Improvement Components: contrib-bookkeeper Reporter: Utkarsh Srivastava Async methods like asyncLedgerCreate and Open shouldn't be throwing InterruptedException and BKExceptions. The present method signatures lead to messy application code, since one is forced to have error-handling code in two places: inside the callback, to handle a non-OK return code, and outside, to handle the exceptions thrown by the call. There should be only one way to indicate error conditions, and that should be through a non-OK return code to the callback.
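The single-error-path design Utkarsh describes can be sketched generically; the function name and return codes below are illustrative assumptions, not BookKeeper's actual API:

```python
# Illustrative sketch of callback-only error reporting: the async call never
# raises; every outcome, success or failure, arrives as a return code in the
# callback, so error handling lives in exactly one place.
OK, ERROR = 0, -1

def async_create(name, callback):
    """Hypothetical async API: always returns normally; errors go to the callback."""
    try:
        if not name:
            raise ValueError("empty name")
        handle = {"name": name}
    except ValueError:
        callback(ERROR, None)   # failure: reported via rc, not raised
        return
    callback(OK, handle)        # success uses the same channel

results = []

def on_done(rc, handle):
    # One place to branch on success vs. failure.
    results.append((rc, handle))

async_create("ledger-1", on_done)
async_create("", on_done)
print(results)  # → [(0, {'name': 'ledger-1'}), (-1, None)]
```

Because the call itself never throws, application code needs no try/except around it; the non-OK return code in the callback is the single error channel.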
[jira] Commented: (ZOOKEEPER-498) Unending Leader Elections : WAN configuration
[ https://issues.apache.org/jira/browse/ZOOKEEPER-498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12739796#action_12739796 ] Patrick Hunt commented on ZOOKEEPER-498: There are docs in the source code that provide good low-level detail on the flexible quorum implementation. HOWEVER, there are NO docs in the Ops guide detailing user-level flexible quorum operation. We need to add docs (as part of this fix) to forrest detailing how to operate/troubleshoot/debug flexible quorums.
[jira] Commented: (ZOOKEEPER-498) Unending Leader Elections : WAN configuration
[ https://issues.apache.org/jira/browse/ZOOKEEPER-498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12739789#action_12739789 ] Patrick Hunt commented on ZOOKEEPER-498: Todd, I did see an issue with your config: it's not group.1:1:2:3 but rather group.1=1:2:3 (it should be =, not :). Regardless, even after I fix this it's still not forming a cluster properly; we're still looking.
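For reference, a hierarchical-quorum section of zoo.cfg along the lines the thread describes (weight 1 for DC servers, weight 0 for pod servers) might look like the fragment below; the server ids and the split into two groups are illustrative assumptions, not Todd's actual config:

```ini
# Hypothetical flexible-quorum fragment for zoo.cfg (illustrative server ids).
# Group membership uses '=', with ':' separating the member server ids.
# Group 1: DC servers; group 2: pod servers.
group.1=1:2:3
group.2=4:5
# Only the DC servers carry voting weight, so a leader can only be
# elected in the DC; the pod servers follow with weight 0.
weight.1=1
weight.2=1
weight.3=1
weight.4=0
weight.5=0
```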
[jira] Commented: (ZOOKEEPER-498) Unending Leader Elections : WAN configuration
[ https://issues.apache.org/jira/browse/ZOOKEEPER-498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12739787#action_12739787 ] Patrick Hunt commented on ZOOKEEPER-498: Please fix the following as well: incorrect logging levels are being used in the quorum code, for example: 2009-08-05 15:17:02,733 - ERROR [WorkerSender Thread:quorumcnxmana...@341] - There is a connection for server 1 2009-08-05 15:17:02,753 - ERROR [WorkerSender Thread:quorumcnxmana...@341] - There is a connection for server 2 This is INFO, not ERROR.
[jira] Assigned: (ZOOKEEPER-498) Unending Leader Elections : WAN configuration
[ https://issues.apache.org/jira/browse/ZOOKEEPER-498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt reassigned ZOOKEEPER-498: -- Assignee: Flavio Paiva Junqueira (was: Patrick Hunt) > Unending Leader Elections : WAN configuration > - > > Key: ZOOKEEPER-498 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-498 > Project: Zookeeper > Issue Type: Bug > Components: leaderElection >Affects Versions: 3.2.0 > Environment: Each machine: > CentOS 5.2 64-bit > 2GB ram > java version "1.6.0_13" > Java(TM) SE Runtime Environment (build 1.6.0_13-b03) > Java HotSpot(TM) 64-Bit Server VM (build 11.3-b02, mixed > Network Topology: > DC : central data center > POD(N): remote data center > Zookeeper Topology: > Leaders may be elected only in DC (weight = 1) > Only followers are elected in PODS (weight = 0) >Reporter: Todd Greenwood-Geer >Assignee: Flavio Paiva Junqueira >Priority: Critical > Fix For: 3.2.1, 3.3.0 > > Attachments: dc-zook-logs-01.tar.gz, pod-zook-logs-01.tar.gz, > zk498-test.tar.gz, zoo.cfg > > > In a WAN configuration, ZooKeeper is endlessly electing, terminating, and > re-electing a ZooKeeper leader. The WAN configuration involves two groups, a > central DC group of ZK servers that have a voting weight = 1, and a group of > servers in remote pods with a voting weight of 0. > What we expect to see is leaders elected only in the DC, and the pods to > contain only followers. What we are seeing is a continuous cycling of > leaders. We have seen this consistently with 3.2.0, 3.2.0 + recommended > patches (473, 479, 481, 491), and now release 3.2.1. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-499) electionAlg should default to FLE (3) - regression
[ https://issues.apache.org/jira/browse/ZOOKEEPER-499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-499: --- Status: Patch Available (was: Open) > electionAlg should default to FLE (3) - regression > -- > > Key: ZOOKEEPER-499 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-499 > Project: Zookeeper > Issue Type: Bug > Components: server, tests >Affects Versions: 3.2.0 >Reporter: Patrick Hunt >Assignee: Patrick Hunt >Priority: Blocker > Fix For: 3.2.1, 3.3.0 > > Attachments: ZOOKEEPER-499.patch, ZOOKEEPER-499_br3.2.patch > > > there's a regression in 3.2 - electionAlg is no longer defaulting to 3 > (incorrectly defaults to 0) > also - need to have tests to validate this -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-499) electionAlg should default to FLE (3) - regression
[ https://issues.apache.org/jira/browse/ZOOKEEPER-499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-499: --- Attachment: ZOOKEEPER-499_br3.2.patch ZOOKEEPER-499.patch patches to fix on trunk and branch (br3.2 is the branch patch) this fixes the problem - electionAlg again defaults to 3 it also adds a test to verify fle is used by default it also fixes a test that fails if fle is used (vs algo 0) which is due to a difference in the way jdk exposes unresolved host names when using udp vs tcp. > electionAlg should default to FLE (3) - regression > -- > > Key: ZOOKEEPER-499 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-499 > Project: Zookeeper > Issue Type: Bug > Components: server, tests >Affects Versions: 3.2.0 >Reporter: Patrick Hunt >Assignee: Patrick Hunt >Priority: Blocker > Fix For: 3.2.1, 3.3.0 > > Attachments: ZOOKEEPER-499.patch, ZOOKEEPER-499_br3.2.patch > > > there's a regression in 3.2 - electionAlg is no longer defaulting to 3 > (incorrectly defaults to 0) > also - need to have tests to validate this -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
RE: Optimized WAN ZooKeeper Config : Multi-Ensemble configuration
Mahadev, comments inline: > -Original Message- > From: Mahadev Konar [mailto:maha...@yahoo-inc.com] > Sent: Wednesday, August 05, 2009 1:47 PM > To: zookeeper-dev@hadoop.apache.org > Subject: Re: Optimized WAN ZooKeeper Config : Multi-Ensemble configuration > > Todd, > Comments in line: > > > On 8/5/09 12:10 PM, "Todd Greenwood" wrote: > > > Flavio/Patrick/Mahadev - > > > > Thanks for your support to date. As I understand it, the sticky points > > w/ respect to WAN deployments are: > > > > 1. Leader Election: > > > > Leader elections in the WAN config (pod zk server weight = 0) is a bit > > troublesome (ZOOKEEPER-498) > Yes, until ZOOKEEPER-498 is fixed, you wont be able to use it with groups > and zero weight. > > > > > 2. Network Connectivity Required: > > > > ZooKeeper clients cannot read/write to ZK Servers if the Server does not > > have network connectivity to the quorum. In short, there is a hard > > requirement to have network connectivity in order for the clients to > > access the shared memory graph in ZK. > Yes > > > > > Alternative > > --- > > > > I have seen some discussion about in the past re: multi-ensemble > > solutions. Essentially, put one ensemble in each physical location > > (POD), and another in your DC, and have a fairly simple process > > coordinate synchronizing the various ensembles. If the POD writes can be > > confined to a sub-tree in the master graph, then this should be fairly > > simple. I'm imagining the following: > > > > DC (master) graph: > > /root/pods/1/data/item1 > > /root/pods/1/data/item2 > > /root/pods/1/data/item3 > > /root/pods/2 > > /root/pods/3 > > ...etc > > /root/shared/allpods/readonly/data/item1 > > /root/shared/allpods/readonly/data/item2 > > ...etc > > > > This has the advantage of minimizing cross pod traffic, which could be a > > real perf killer in an WAN. It also provides transacted writes in the > > PODs, even in the disconnected state. 
Clearly, another portion of the > > business logic has to reconcile the DC (master) graph such that each of > > the pods data items are processed, etc. > > > > Does anyone have any experience with this (pitfalls, suggestions, etc.?) > As far as I understand is that you mean that have a master Cluster with > other in a different data center syncing with the master (just a subtree)? > Is that correct? > > If yes, this is what one of our users in Yahoo! Search do. They have a > master cluster and a smaller cluster in a different datacenter and a > brdige > that copies data from the master cluster (only a subtree) to the smaller > one > and keeps them in syncs. > Yes, this is exactly what I'm proposing. With the addition that I'll sync subtrees in both directions, and have a separate process reconcile data from the various pods, like so: #pod1 ensemble /root/a/b #pod2 ensemble /root/a/b #dc ensemble /root/shared/foo/bar # Mapping (modeled after perforce client config) # [ensemble]:[path] [ensemble]:[path] # sync pods to dc [POD1]:/root/... [DC]:/root/pods/POD1/... [POD2]:/root/... [DC]:/root/pods/POD2/... # sync dc to pods [DC]:/root/shared/... [POD1]:/shared/... [DC]:/root/shared/... [POD2]:/shared/... [DC]:/root/shared/... [POD3]:/shared/... Now, for our needs, we'd like the DC data aggregated, so I'll have another process handle aggregating the pod specific data like so: POD Data Aggregator: aggregate data in [DC]:/root/pods/POD(N) to [DC]:/root/aggregated/data. This is just off the top of my head. -Todd > > Thanks > mahadev > > > > -Todd
Re: Optimized WAN ZooKeeper Config : Multi-Ensemble configuration
Todd, Comments in line: On 8/5/09 12:10 PM, "Todd Greenwood" wrote: > Flavio/Patrick/Mahadev - > > Thanks for your support to date. As I understand it, the sticky points > w/ respect to WAN deployments are: > > 1. Leader Election: > > Leader elections in the WAN config (pod zk server weight = 0) is a bit > troublesome (ZOOKEEPER-498) Yes, until ZOOKEEPER-498 is fixed, you won't be able to use it with groups and zero weight. > > 2. Network Connectivity Required: > > ZooKeeper clients cannot read/write to ZK Servers if the Server does not > have network connectivity to the quorum. In short, there is a hard > requirement to have network connectivity in order for the clients to > access the shared memory graph in ZK. Yes > > Alternative > --- > > I have seen some discussion about in the past re: multi-ensemble > solutions. Essentially, put one ensemble in each physical location > (POD), and another in your DC, and have a fairly simple process > coordinate synchronizing the various ensembles. If the POD writes can be > confined to a sub-tree in the master graph, then this should be fairly > simple. I'm imagining the following: > > DC (master) graph: > /root/pods/1/data/item1 > /root/pods/1/data/item2 > /root/pods/1/data/item3 > /root/pods/2 > /root/pods/3 > ...etc > /root/shared/allpods/readonly/data/item1 > /root/shared/allpods/readonly/data/item2 > ...etc > > This has the advantage of minimizing cross pod traffic, which could be a > real perf killer in an WAN. It also provides transacted writes in the > PODs, even in the disconnected state. Clearly, another portion of the > business logic has to reconcile the DC (master) graph such that each of > the pods data items are processed, etc. > > Does anyone have any experience with this (pitfalls, suggestions, etc.?) As far as I understand, you mean having a master cluster with another one in a different data center syncing with the master (just a subtree)? Is that correct? If yes, this is what one of our users in Yahoo!
Search does. They have a master cluster and a smaller cluster in a different datacenter and a bridge that copies data from the master cluster (only a subtree) to the smaller one and keeps them in sync. Thanks mahadev > > -Todd
[jira] Updated: (ZOOKEEPER-499) electionAlg should default to FLE (3) - regression
[ https://issues.apache.org/jira/browse/ZOOKEEPER-499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-499: --- Release Note: workaround in 3.2.0 (this only effects 3.2.0) set electionAlg=3 in server config files. > electionAlg should default to FLE (3) - regression > -- > > Key: ZOOKEEPER-499 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-499 > Project: Zookeeper > Issue Type: Bug > Components: server, tests >Affects Versions: 3.2.0 >Reporter: Patrick Hunt >Assignee: Patrick Hunt >Priority: Blocker > Fix For: 3.2.1, 3.3.0 > > > there's a regression in 3.2 - electionAlg is no longer defaulting to 3 > (incorrectly defaults to 0) > also - need to have tests to validate this -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
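For anyone applying the release-note workaround above, it amounts to one extra line in each server's zoo.cfg. A minimal sketch is below; the hostnames and paths are placeholders, not values from this thread, and the other settings are just a typical baseline:

```properties
# zoo.cfg -- on 3.2.0, force FastLeaderElection explicitly, since the
# default regressed from 3 to 0 (ZOOKEEPER-499)
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/var/lib/zookeeper
clientPort=2181
electionAlg=3
server.1=zk1.example.com:2888:3888
server.2=zk2.example.com:2888:3888
server.3=zk3.example.com:2888:3888
```

If the parameter takes effect, the server logs should reference FastLeaderElection at startup, as Patrick notes elsewhere in this thread.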
[jira] Updated: (ZOOKEEPER-462) Last hint for open ledger
[ https://issues.apache.org/jira/browse/ZOOKEEPER-462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mahadev konar updated ZOOKEEPER-462: Fix Version/s: 3.3.0 > Last hint for open ledger > - > > Key: ZOOKEEPER-462 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-462 > Project: Zookeeper > Issue Type: New Feature > Components: contrib-bookkeeper >Reporter: Flavio Paiva Junqueira >Assignee: Flavio Paiva Junqueira > Fix For: 3.3.0 > > Attachments: ZOOKEEPER-462.patch > > > In some use cases of BookKeeper, it is useful to be able to read from a > ledger before closing the ledger. To enable such a feature, the writer has to > be able to communicate to a reader how many entries it has been able to write > successfully. The main idea of this jira is to continuously update a znode > with the number of successful writes, and a reader can, for example, watch > the node for changes. > I was thinking of having a configuration parameter to state how often a > writer should update the hint on ZooKeeper (e.g., every 1000 requests, every > 10,000 requests). Clearly updating more often increases the overhead of > writing to ZooKeeper, although the impact on the performance of writes to > BookKeeper should be minimal given that we make an asynchronous call to > update the hint. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
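The "update the hint every N requests" idea in ZOOKEEPER-462 can be sketched as a small gating rule. All names here are hypothetical (not from the attached patch), and the real writer would push the hint to the znode with an asynchronous setData; this only models when an update should fire:

```java
// Sketch (hypothetical names): decide when a BookKeeper writer should
// refresh the "last successfully written entry" hint znode. The actual
// update would be an async ZooKeeper setData; here we model only the
// every-N-entries gating described in the jira.
public class HintGate {
    private final long interval;     // publish a hint every `interval` entries
    private long lastPublished = -1; // entry id carried by the last hint

    public HintGate(long interval) {
        this.interval = interval;
    }

    // Returns true when writing `entryId` should trigger a hint update.
    public boolean shouldPublish(long entryId) {
        if (entryId - lastPublished >= interval) {
            lastPublished = entryId;
            return true;
        }
        return false;
    }
}
```

A reader would then watch the hint znode for changes instead of polling, which keeps the ZooKeeper write overhead proportional to 1/interval of the BookKeeper write rate.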
[jira] Created: (ZOOKEEPER-499) electionAlg should default to FLE (3) - regression
electionAlg should default to FLE (3) - regression -- Key: ZOOKEEPER-499 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-499 Project: Zookeeper Issue Type: Bug Components: server, tests Affects Versions: 3.2.0 Reporter: Patrick Hunt Priority: Blocker Fix For: 3.2.1, 3.3.0 there's a regression in 3.2 - electionAlg is no longer defaulting to 3 (incorrectly defaults to 0) also - need to have tests to validate this -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (ZOOKEEPER-499) electionAlg should default to FLE (3) - regression
[ https://issues.apache.org/jira/browse/ZOOKEEPER-499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt reassigned ZOOKEEPER-499: -- Assignee: Patrick Hunt > electionAlg should default to FLE (3) - regression > -- > > Key: ZOOKEEPER-499 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-499 > Project: Zookeeper > Issue Type: Bug > Components: server, tests >Affects Versions: 3.2.0 >Reporter: Patrick Hunt >Assignee: Patrick Hunt >Priority: Blocker > Fix For: 3.2.1, 3.3.0 > > > there's a regression in 3.2 - electionAlg is no longer defaulting to 3 > (incorrectly defaults to 0) > also - need to have tests to validate this -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
Optimized WAN ZooKeeper Config : Multi-Ensemble configuration
Flavio/Patrick/Mahadev - Thanks for your support to date. As I understand it, the sticky points w/ respect to WAN deployments are: 1. Leader Election: Leader elections in the WAN config (pod zk server weight = 0) are a bit troublesome (ZOOKEEPER-498) 2. Network Connectivity Required: ZooKeeper clients cannot read/write to ZK Servers if the Server does not have network connectivity to the quorum. In short, there is a hard requirement to have network connectivity in order for the clients to access the shared memory graph in ZK. Alternative --- I have seen some discussion in the past re: multi-ensemble solutions. Essentially, put one ensemble in each physical location (POD), and another in your DC, and have a fairly simple process coordinate synchronizing the various ensembles. If the POD writes can be confined to a sub-tree in the master graph, then this should be fairly simple. I'm imagining the following: DC (master) graph: /root/pods/1/data/item1 /root/pods/1/data/item2 /root/pods/1/data/item3 /root/pods/2 /root/pods/3 ...etc /root/shared/allpods/readonly/data/item1 /root/shared/allpods/readonly/data/item2 ...etc This has the advantage of minimizing cross pod traffic, which could be a real perf killer in a WAN. It also provides transacted writes in the PODs, even in the disconnected state. Clearly, another portion of the business logic has to reconcile the DC (master) graph such that each of the pods' data items are processed, etc. Does anyone have any experience with this (pitfalls, suggestions, etc.?) -Todd
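For reference, the weighted topology discussed in this thread (DC servers with weight 1, pod servers with weight 0) is expressed with hierarchical-quorum entries in zoo.cfg. A minimal sketch with placeholder hostnames (not Todd's actual nine-server deploy):

```properties
# Hierarchical quorum: only the DC group (weight 1) counts toward
# elections, so a leader can only emerge in the DC; pod servers
# (weight 0) follow but contribute nothing to the vote.
server.1=dc1.example.com:2888:3888
server.2=dc2.example.com:2888:3888
server.3=dc3.example.com:2888:3888
server.4=pod1.example.com:2888:3888
server.5=pod2.example.com:2888:3888
group.1=1:2:3
group.2=4:5
weight.1=1
weight.2=1
weight.3=1
weight.4=0
weight.5=0
```

Note that per ZOOKEEPER-498 this zero-weight configuration was broken in 3.2.0/3.2.1, so the sketch describes the intended semantics rather than observed behavior on those releases.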
RE: Unending Leader Elections in WAN deploy
IT says yes, there are firewalls, but also that there is full connectivity between each of the zk servers. > -Original Message- > From: Mahadev Konar [mailto:maha...@yahoo-inc.com] > Sent: Tuesday, August 04, 2009 6:01 PM > To: zookeeper-dev@hadoop.apache.org > Subject: Re: Unending Leader Elections in WAN deploy > > Hi todd, > I see a lot of > > java.net.ConnectException: Connection refused > at sun.nio.ch.Net.connect(Native Method) > at > sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:507) > at java.nio.channels.SocketChannel.open(SocketChannel.java:146) > at > org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:324) > at > org.apache.zookeeper.server.quorum.QuorumCnxManager.toSend(QuorumCnxManager.java:304) > at > org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.process(FastLeaderElection.java:317) > at > org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.run(FastLeaderElection.java:290) > at java.lang.Thread.run(Thread.java:619) > > > Is it possible that there is some firewall? Can all the servers 1-9 > connect > to all the others using ports that you specified in zoo.cfg i.e 2888/3888? > > > Thanks > mahadev > > > On 8/4/09 4:56 PM, "Todd Greenwood" wrote: > > > Looks like we're not getting *any* leader elected now Logs attached. > > > >> -Original Message- > >> From: Todd Greenwood [mailto:to...@audiencescience.com] > >> Sent: Tuesday, August 04, 2009 4:07 PM > >> To: zookeeper-dev@hadoop.apache.org > >> Subject: RE: Unending Leader Elections in WAN deploy > >> > >> Patrick, thanks! I'll forward on to IT and I'll report back to you > >> shortly...
> >> > >>> -Original Message- > >>> From: Patrick Hunt [mailto:ph...@apache.org] > >>> Sent: Tuesday, August 04, 2009 3:55 PM > >>> To: zookeeper-dev@hadoop.apache.org > >>> Subject: Re: Unending Leader Elections in WAN deploy > >>> > >>> Todd, Mahadev and I looked at this and it turns out to be a > >> regression. > >>> Ironically a patch I created for 3.2 branch to add quorum tests > >> actually > >>> broke the quorum config -- a default value for a config parameter > > was > >>> lost. I'm going to submit a patch asap to get the default back, but > >> for > >>> the time being you can set: > >>> > >>> electionAlg=3 > >>> > >>> in each of your config files. > >>> > >>> You should see reference to FastLeaderElection in your log files if > >> this > >>> parameter is set correctly. > >>> > >>> Sorry for the trouble, > >>> > >>> Patrick > >>> > >>> Todd Greenwood wrote: > Mahadev, > > I just heard from IT that this build behaves in exactly the same > > way > >> as > previous versions, e.g. we get continuous leader elections that > disconnect the followers and then get re-elected, and > >> disconnect...etc. > > This is from a fresh sync to the 3.2 branch: > > svn co > > > http://svn.apache.org/repos/asf/hadoop/zookeeper/branches/branch-3.2 > ./branch-3.2 > > CHANGES.TXT show the various fixes included: > > > >> > > to...@toddg01lt:~/asi/workspaces/main/Main/RSI/etc/holmes/main/zookeeper > /src/original$ head -n 50 branch-3.2/CHANGES.txt > Release 3.2.1 > > Backward compatibile changes: > > BUGFIXES: > ZOOKEEPER-468. avoid compile warning in send_auth_info(). (chris > >> via > flavio) > > ZOOKEEPER-469. make sure CPPUNIT_CFLAGS isn't overwritten (chris > >> via > mahadev) > > ZOOKEEPER-471. update zkperl for 3.2.x branch. (chris via > > mahadev) > > ZOOKEEPER-470. include unistd.h for sleep() in c tests (chris > > via > mahadev) > > ZOOKEEPER-460. bad testRetry in cppunit tests (hudson failure) > (giri via mahadev) > > ZOOKEEPER-467. 
Change log level in BookieHandle (flavio via > >> mahadev) > > ZOOKEEPER-482. ignore sigpipe in testRetry to avoid silent > >> immediate > failure. (chris via mahadev) > > ZOOKEEPER-487. setdata on root (/) crashes the servers (mahadev > >> via > phunt) > > ZOOKEEPER-457. Make ZookeeperMain public, support for HBase (and > other) > embedded clients (ryan rawson via phunt) > > ZOOKEEPER-481. Add lastMessageSent to QuorumCnxManager. (flavio > >> via > mahadev) > > ZOOKEEPER-479. QuorumHierarchical does not count groups > > correctly > (flavio via mahadev) > > ZOOKEEPER-466. crash on zookeeper_close() when using auth with > >> empty > cert > (Chris Darroch via phunt) > > ZOOKEEPER-480. FLE should perform leader check when node is not > leading and > add vote of follower (flavio via mahadev) > > ZOOKEEPER-491. Prevent zero-weight servers from being elected > >> (fl
[jira] Updated: (ZOOKEEPER-498) Unending Leader Elections : WAN configuration
[ https://issues.apache.org/jira/browse/ZOOKEEPER-498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-498: --- Attachment: zk498-test.tar.gz I attached zk498-test.tar.gz - this is a 5 server config (2 0weight) that fails to achieve quorum. run start.sh/stop.sh and checkout the individual logs for details. > Unending Leader Elections : WAN configuration > - > > Key: ZOOKEEPER-498 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-498 > Project: Zookeeper > Issue Type: Bug > Components: leaderElection >Affects Versions: 3.2.0 > Environment: Each machine: > CentOS 5.2 64-bit > 2GB ram > java version "1.6.0_13" > Java(TM) SE Runtime Environment (build 1.6.0_13-b03) > Java HotSpot(TM) 64-Bit Server VM (build 11.3-b02, mixed > Network Topology: > DC : central data center > POD(N): remote data center > Zookeeper Topology: > Leaders may be elected only in DC (weight = 1) > Only followers are elected in PODS (weight = 0) >Reporter: Todd Greenwood-Geer >Assignee: Patrick Hunt >Priority: Critical > Fix For: 3.2.1, 3.3.0 > > Attachments: dc-zook-logs-01.tar.gz, pod-zook-logs-01.tar.gz, > zk498-test.tar.gz, zoo.cfg > > > In a WAN configuration, ZooKeeper is endlessly electing, terminating, and > re-electing a ZooKeeper leader. The WAN configuration involves two groups, a > central DC group of ZK servers that have a voting weight = 1, and a group of > servers in remote pods with a voting weight of 0. > What we expect to see is leaders elected only in the DC, and the pods to > contain only followers. What we are seeing is a continuous cycling of > leaders. We have seen this consistently with 3.2.0, 3.2.0 + recommended > patches (473, 479, 481, 491), and now release 3.2.1. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (ZOOKEEPER-498) Unending Leader Elections : WAN configuration
[ https://issues.apache.org/jira/browse/ZOOKEEPER-498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12739609#action_12739609 ] Patrick Hunt commented on ZOOKEEPER-498: Looks to me like 0 weight is still busted, fle0weighttest is actually failing on my machine, however it's reported as success: - Standard Error - Exception in thread "Thread-108" junit.framework.AssertionFailedError: Elected zero-weight server at junit.framework.Assert.fail(Assert.java:47) at org.apache.zookeeper.test.FLEZeroWeightTest$LEThread.run(FLEZeroWeightTest.java:138) - --- this is probably because the test is calling assert in a thread other than the main test thread - which junit will not track/know about. One problem I see with these tests (0weight test I looked at) -- it doesn't have a client attempt to connect to the various servers as part of declaring success. Really we should only consider a test successful (ie assert that) if a client can connect to each server in the cluster and change/see changes. As part of fixing this we really need to do a sanity check by testing the various command lines and checking that a client can connect. I'm not even sure FLEnewepochtest/fletest/etc... are passing either. new epoch seems to just thrash... Also I tried 3 & 5 server quorums "by hand from the command line" with 0 weight and they see similar issues to what Todd is seeing. this is happening for me on both the trunk and 3.2 branch source.
> Unending Leader Elections : WAN configuration > - > > Key: ZOOKEEPER-498 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-498 > Project: Zookeeper > Issue Type: Bug > Components: leaderElection >Affects Versions: 3.2.0 > Environment: Each machine: > CentOS 5.2 64-bit > 2GB ram > java version "1.6.0_13" > Java(TM) SE Runtime Environment (build 1.6.0_13-b03) > Java HotSpot(TM) 64-Bit Server VM (build 11.3-b02, mixed > Network Topology: > DC : central data center > POD(N): remote data center > Zookeeper Topology: > Leaders may be elected only in DC (weight = 1) > Only followers are elected in PODS (weight = 0) >Reporter: Todd Greenwood-Geer >Assignee: Patrick Hunt >Priority: Critical > Fix For: 3.2.1, 3.3.0 > > Attachments: dc-zook-logs-01.tar.gz, pod-zook-logs-01.tar.gz, zoo.cfg > > > In a WAN configuration, ZooKeeper is endlessly electing, terminating, and > re-electing a ZooKeeper leader. The WAN configuration involves two groups, a > central DC group of ZK servers that have a voting weight = 1, and a group of > servers in remote pods with a voting weight of 0. > What we expect to see is leaders elected only in the DC, and the pods to > contain only followers. What we are seeing is a continuous cycling of > leaders. We have seen this consistently with 3.2.0, 3.2.0 + recommended > patches (473, 479, 481, 491), and now release 3.2.1. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
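The cross-thread assertion problem Patrick describes (JUnit only observes exceptions thrown on the thread running the test method) has a standard remedy: capture the worker's failure and surface it from the main thread after join(). The following is a self-contained sketch of the general technique, not the actual FLEZeroWeightTest fix:

```java
// Sketch: an AssertionError raised in a spawned thread (as in
// FLEZeroWeightTest's LEThread) is invisible to JUnit. Capture it via an
// uncaught-exception handler so the main test thread can re-throw it.
public class ThreadFailureCapture {
    public static Throwable runAndCapture(Runnable body) {
        final Throwable[] failure = new Throwable[1];
        Thread worker = new Thread(body);
        // Record whatever the worker throws instead of letting it vanish.
        worker.setUncaughtExceptionHandler((t, e) -> failure[0] = e);
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
        }
        return failure[0]; // null means the worker completed cleanly
    }

    public static void main(String[] args) {
        Throwable captured = runAndCapture(() -> {
            throw new AssertionError("Elected zero-weight server");
        });
        // In a real test the main thread would now fail the test:
        // if (captured != null) throw new AssertionError(captured);
        System.out.println("captured: " + captured.getMessage());
    }
}
```

With this pattern the election-checking thread's failure actually fails the test run, instead of being reported as success as in the log above.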
[jira] Commented: (ZOOKEEPER-484) Clients get SESSION MOVED exception when switching from follower to a leader.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12739588#action_12739588 ] Hadoop QA commented on ZOOKEEPER-484: - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12415556/ZOOKEEPER-484.patch against trunk revision 800990. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 6 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-vesta.apache.org/167/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-vesta.apache.org/167/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-vesta.apache.org/167/console This message is automatically generated. > Clients get SESSION MOVED exception when switching from follower to a leader. > - > > Key: ZOOKEEPER-484 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-484 > Project: Zookeeper > Issue Type: Bug > Components: server >Affects Versions: 3.2.0 >Reporter: Mahadev konar >Assignee: Mahadev konar >Priority: Blocker > Fix For: 3.2.1, 3.3.0 > > Attachments: sessionTest.patch, ZOOKEEPER-484.patch > > > When a client is connected to a follower and gets disconnected and connects to a > leader it gets SESSION MOVED exception. This is because of a bug in the new > feature of ZOOKEEPER-417 that we added in 3.2.
All the releases before 3.2 DO > NOT have this problem. The fix is to make sure the ownership of a connection > gets changed when a session moves from follower to the leader. The workaround > to it in 3.2.0 would be to switch off connections from clients to the leader. > take a look at *leaderServers* java property in > http://hadoop.apache.org/zookeeper/docs/r3.2.0/zookeeperAdmin.html. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
Re: hudson patch build back to normal
Thanks Giri! Patrick Giridharan Kesavan wrote: If you have changed the jira status to patch available in the last couple of days please resubmit your patch for hudson to pick your patch for testing. -Giri -Original Message- From: Giridharan Kesavan [mailto:gkesa...@yahoo-inc.com] Sent: Wednesday, August 05, 2009 7:18 PM To: zookeeper-dev@hadoop.apache.org Cc: Nigel Daley Subject: hudson patch build back to normal Sendmail issues on hudson.zones is fixed now and patch build for zookeeper is restarted. Regards, Giri
RE: hudson patch build back to normal
If you have changed the jira status to patch available in the last couple of days please resubmit your patch for hudson to pick your patch for testing. -Giri > -Original Message- > From: Giridharan Kesavan [mailto:gkesa...@yahoo-inc.com] > Sent: Wednesday, August 05, 2009 7:18 PM > To: zookeeper-dev@hadoop.apache.org > Cc: Nigel Daley > Subject: hudson patch build back to normal > > Sendmail issues on hudson.zones is fixed now and patch build for > zookeeper is restarted. > > Regards, > Giri
hudson patch build back to normal
Sendmail issues on hudson.zones are fixed now and the patch build for zookeeper is restarted. Regards, Giri
[jira] Updated: (ZOOKEEPER-484) Clients get SESSION MOVED exception when switching from follower to a leader.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Giridharan Kesavan updated ZOOKEEPER-484: - Status: Patch Available (was: Open) > Clients get SESSION MOVED exception when switching from follower to a leader. > - > > Key: ZOOKEEPER-484 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-484 > Project: Zookeeper > Issue Type: Bug > Components: server >Affects Versions: 3.2.0 >Reporter: Mahadev konar >Assignee: Mahadev konar >Priority: Blocker > Fix For: 3.2.1, 3.3.0 > > Attachments: sessionTest.patch, ZOOKEEPER-484.patch > > > When a client is connected to a follower and gets disconnected and connects to a > leader it gets SESSION MOVED exception. This is because of a bug in the new > feature of ZOOKEEPER-417 that we added in 3.2. All the releases before 3.2 DO > NOT have this problem. The fix is to make sure the ownership of a connection > gets changed when a session moves from follower to the leader. The workaround > to it in 3.2.0 would be to switch off connections from clients to the leader. > take a look at *leaderServers* java property in > http://hadoop.apache.org/zookeeper/docs/r3.2.0/zookeeperAdmin.html. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-484) Clients get SESSION MOVED exception when switching from follower to a leader.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Giridharan Kesavan updated ZOOKEEPER-484: - Status: Open (was: Patch Available) resubmitting the patch to the patch queue. > Clients get SESSION MOVED exception when switching from follower to a leader. > - > > Key: ZOOKEEPER-484 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-484 > Project: Zookeeper > Issue Type: Bug > Components: server >Affects Versions: 3.2.0 >Reporter: Mahadev konar >Assignee: Mahadev konar >Priority: Blocker > Fix For: 3.2.1, 3.3.0 > > Attachments: sessionTest.patch, ZOOKEEPER-484.patch > > > When a client is connected to a follower and gets disconnected and connects to a > leader it gets SESSION MOVED exception. This is because of a bug in the new > feature of ZOOKEEPER-417 that we added in 3.2. All the releases before 3.2 DO > NOT have this problem. The fix is to make sure the ownership of a connection > gets changed when a session moves from follower to the leader. The workaround > to it in 3.2.0 would be to switch off connections from clients to the leader. > take a look at *leaderServers* java property in > http://hadoop.apache.org/zookeeper/docs/r3.2.0/zookeeperAdmin.html. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (ZOOKEEPER-447) zkServer.sh doesn't allow different config files to be specified on the command line
[ https://issues.apache.org/jira/browse/ZOOKEEPER-447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12739429#action_12739429 ] Hudson commented on ZOOKEEPER-447: -- Integrated in ZooKeeper-trunk #405 (See [http://hudson.zones.apache.org/hudson/job/ZooKeeper-trunk/405/]) . zkServer.sh doesn't allow different config files to be specified on the command line > zkServer.sh doesn't allow different config files to be specified on the > command line > > > Key: ZOOKEEPER-447 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-447 > Project: Zookeeper > Issue Type: Improvement >Affects Versions: 3.1.1, 3.2.0 >Reporter: Henry Robinson >Assignee: Henry Robinson >Priority: Minor > Fix For: 3.2.1, 3.3.0 > > Attachments: ZOOKEEPER-447.patch > > > Unless I'm missing something, you can change the directory that the zoo.cfg > file is in by setting ZOOCFGDIR but not the name of the file itself. > I find it convenient myself to specify the config file on the command line, > but we should also let it be specified by environment variable. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (ZOOKEEPER-480) FLE should perform leader check when node is not leading and add vote of follower
[ https://issues.apache.org/jira/browse/ZOOKEEPER-480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12739427#action_12739427 ]

Hudson commented on ZOOKEEPER-480:
----------------------------------

Integrated in ZooKeeper-trunk #405 (See
[http://hudson.zones.apache.org/hudson/job/ZooKeeper-trunk/405/]).
FLE should perform leader check when node is not leading and add vote of
follower (flavio via mahadev)

> FLE should perform leader check when node is not leading and add vote of
> follower
> ------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-480
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-480
>             Project: Zookeeper
>          Issue Type: Bug
>    Affects Versions: 3.2.0
>            Reporter: Flavio Paiva Junqueira
>            Assignee: Flavio Paiva Junqueira
>             Fix For: 3.2.1, 3.3.0
>
>         Attachments: ZOOKEEPER-480-3.2branch.patch,
> ZOOKEEPER-480-3.2branch.patch, ZOOKEEPER-480.patch, ZOOKEEPER-480.patch,
> ZOOKEEPER-480.patch, ZOOKEEPER-480.patch, ZOOKEEPER-480.patch
>
>
> As a server may join leader election while others have already elected a
> leader, a server has to handle some special cases of leader election when
> notifications come from servers that are either LEADING or FOLLOWING. In
> these special cases, we check whether we have received a message from the
> leader before declaring a leader elected. This check does not consider that
> the process performing it might itself be a recently elected leader, so the
> check fails.
> This patch also adds a new case: a vote is added to recvset when the
> notification is from a process that is LEADING or FOLLOWING. This fixes the
> case raised in ZOOKEEPER-475.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
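The two fixes in the description can be sketched as pseudocode. This is a hypothetical Python sketch of the logic as described, not the real FastLeaderElection source; the names (`recvset`, `handle_notification`, the state constants) mirror the description only.

```python
# Sketch of the two FLE fixes described above (hypothetical, not ZK code).
LOOKING, FOLLOWING, LEADING = "looking", "following", "leading"

def handle_notification(my_id, proposed_leader, heard_from_leader,
                        recvset, sender, sender_state, vote):
    # Fix 2 (the ZOOKEEPER-475 case): also record votes from servers
    # that have already left the LOOKING state.
    if sender_state in (LEADING, FOLLOWING):
        recvset[sender] = vote
    # Fix 1: the leader check must pass when *this* process is the
    # recently elected leader -- it will never get a message from itself.
    if proposed_leader == my_id:
        return True
    return heard_from_leader

recvset = {}
# A FOLLOWING server's vote now lands in recvset, and server 1,
# itself the proposed leader, passes the check.
elected = handle_notification(1, 1, False, recvset, 2, FOLLOWING, (1, 100))
```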
[jira] Commented: (ZOOKEEPER-493) patch for command line setquota
[ https://issues.apache.org/jira/browse/ZOOKEEPER-493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12739428#action_12739428 ]

Hudson commented on ZOOKEEPER-493:
----------------------------------

Integrated in ZooKeeper-trunk #405 (See
[http://hudson.zones.apache.org/hudson/job/ZooKeeper-trunk/405/]).
patch for command line setquota

> patch for command line setquota
> -------------------------------
>
>                 Key: ZOOKEEPER-493
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-493
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: java client
>    Affects Versions: 3.2.0
>            Reporter: steve bendiola
>            Assignee: steve bendiola
>            Priority: Minor
>             Fix For: 3.2.1, 3.3.0
>
>         Attachments: quotafix.patch, ZOOKEEPER-493.patch
>
>
> The command-line "setquota" tries to use argument 3 as both a path and a
> value.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (ZOOKEEPER-491) Prevent zero-weight servers from being elected
[ https://issues.apache.org/jira/browse/ZOOKEEPER-491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12739426#action_12739426 ]

Hudson commented on ZOOKEEPER-491:
----------------------------------

Integrated in ZooKeeper-trunk #405 (See
[http://hudson.zones.apache.org/hudson/job/ZooKeeper-trunk/405/]).
Prevent zero-weight servers from being elected. (flavio via mahadev)

> Prevent zero-weight servers from being elected
> ----------------------------------------------
>
>                 Key: ZOOKEEPER-491
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-491
>             Project: Zookeeper
>          Issue Type: New Feature
>          Components: leaderElection
>    Affects Versions: 3.2.0
>            Reporter: Flavio Paiva Junqueira
>            Assignee: Flavio Paiva Junqueira
>             Fix For: 3.2.1, 3.3.0
>
>         Attachments: ZOOKEEPER-491-3.2branch.patch, ZOOKEEPER-491.patch
>
>
> This is a fix to prevent zero-weight servers from being elected leader. In
> wide-area scenarios it allows restricting the set of servers that can lead
> the ensemble.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
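For context, the weights in question come from ZooKeeper's hierarchical-quorum configuration (the `group.N` / `weight.N` options in the admin guide). A hypothetical zoo.cfg fragment, with made-up hostnames and ids, where a remote site carries weight 0 and, after this fix, can never be elected leader:

```
# Sketch only: two-site ensemble, remote server excluded from leadership.
server.1=colo-a-host1:2888:3888
server.2=colo-a-host2:2888:3888
server.3=colo-a-host3:2888:3888
server.4=remote-host1:2888:3888
group.1=1:2:3
group.2=4
weight.1=1
weight.2=1
weight.3=1
weight.4=0
```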