Re: Restarting a single zookeeper Server on the same port within the process
Siddharth Raghavan wrote: I need to restart a single zookeeper server node on the same port within my unit tests. Are you testing the C or Java client? I tried stopping the server, having a delay and restarting it on the same port. But the server doesn't start up. When I restart on a different port, it starts up correctly. You may be running into this: http://hea-www.harvard.edu/~fine/Tech/addrinuse.html Take a look at the zk tests, we do this in some cases and it works for us, so you should be able to do something similar. See RecoveryTest.java or TestClient.cc for Java and C respectively. Ring us back if you still have questions. Patrick
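The addrinuse article linked above boils down to the SO_REUSEADDR socket option: without it, binding a listener to a port whose previous socket is still lingering in TIME_WAIT can fail with "address already in use". A minimal, ZooKeeper-free sketch of the idea (pure stdlib; names and ports are illustrative, not the zk test code itself):

```python
import socket

def bind_server(port, reuse=True):
    # SO_REUSEADDR lets a new listener bind even while the previous
    # socket for the same port lingers in TIME_WAIT -- the usual fix
    # for "restart the server on the same port" in tests.
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    if reuse:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.bind(("127.0.0.1", port))
    s.listen(5)
    return s

first = bind_server(0)            # port 0: let the OS pick a free port
port = first.getsockname()[1]
first.close()
second = bind_server(port)        # "restart" on the same port
assert second.getsockname()[1] == port
second.close()
```

The ZooKeeper test classes mentioned (RecoveryTest.java, TestClient.cc) exercise the real server restart path; this sketch only demonstrates the underlying socket behavior.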
Re: zookeeper standalone can not start
You have a small typo in your client command, it should be: bin/zkCli.sh -server 10.16.50.132:2181 (a : not a . prior to the port) Patrick chengxiong000 wrote: Dear zookeepers: I am a zookeeper user and encountered a problem when starting the server and client. The case is: 1. download zookeeper-3.1.1.tar 2. configure zookeeper as in the tutorial, creating conf/zoo.cfg: tickTime=2000 dataDir=/var/zookeeper clientPort=2181 3. start the server: bin/zkServer.sh start The output is: JMX enabled by default Starting zookeeper ... STARTED ..$ log4j:WARN No appenders could be found for logger (org.apache.log4j.jmx.HierarchyDynamicMBean). log4j:WARN Please initialize the log4j system properly. (and the cursor stops here!) 4. start the client: bin/zkCli.sh -server 10.16.50.132.2181 log4j:WARN No appenders could be found for logger (org.apache.zookeeper.ZooKeeperMain). log4j:WARN Please initialize the log4j system properly. ZooKeeper -server host:port cmd args create path data acl delete path [version] .. delquota [-n|-b] path It seems something is wrong. Has anyone encountered the same problem, and how can it be solved? Regards Charles
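As an aside, the log4j:WARN lines just mean no log4j configuration was found on the classpath; a minimal conf/log4j.properties along these lines (a sketch; the appender choice and log level are illustrative) silences them:

```
log4j.rootLogger=INFO, CONSOLE
log4j.appender.CONSOLE=org.apache.log4j.ConsoleAppender
log4j.appender.CONSOLE.layout=org.apache.log4j.PatternLayout
log4j.appender.CONSOLE.layout.ConversionPattern=%d{ISO8601} [%t] %-5p %c - %m%n
```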
Re: Cluster Configuration Issues
You might try my ZooKeeper configuration generator if you have python handy: http://bit.ly/mBEcF The main issue that I see with your config is that each config file needs to contain a list of all the servers in the ensemble: ... syncLimit=2 server.1=host1... server.2=host2... server.3=host3... server.4=host4... where the myid file in the data dir for each hostX corresponds to its server id (so myid=1 on host1, myid=2 on host2, etc...) Patrick Mark Vigeant wrote: Hey- So I'm trying to run hbase on 4 nodes, and in order to do that I need to run zookeeper in replicated mode (I could have hbase run the quorum for me, but it's suggested that I don't). I have an issue though. For some reason the id I'm assigning each server in the file myid in the assigned data directory is not getting read. I feel like another id is being created and put somewhere else. Does anyone have any tips on starting a zookeeper quorum? Do I create the myid file myself or do I edit one once it is created by zookeeper? This is what my config looks like: ticktime=2000 dataDir=/home/hadoop/zookeeper clientPort=2181 initLimit=5 syncLimit=2 server.1=hadoop1:2888:3888 The name of my machine is hadoop1, with user name hadoop. In /home/hadoop/zookeeper I've created a myid file with the number 1 in it. Mark Vigeant RiskMetrics Group, Inc.
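Concretely, the full setup Patrick describes might look like this (hostnames hadoop2-4 are illustrative placeholders for the other three nodes; the first lines match Mark's config):

```
# conf/zoo.cfg -- identical on every server in the ensemble
tickTime=2000
dataDir=/home/hadoop/zookeeper
clientPort=2181
initLimit=5
syncLimit=2
server.1=hadoop1:2888:3888
server.2=hadoop2:2888:3888
server.3=hadoop3:2888:3888
server.4=hadoop4:2888:3888
```

The myid file is created by the operator, not by ZooKeeper; on hadoop2, for example, it would contain just the line "2" in /home/hadoop/zookeeper/myid.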
FYI: third party PHP binding for ZooKeeper
FWIW I noticed this on twitter last night, a third party PHP binding for ZooKeeper is now available (I haven't tried it myself): http://twitter.com/phunt/status/4906002271 Patrick
Re: C client (via zkpython) returns unknown state
You're right, 0 should be something like INITIALIZING_STATE but it's not in zookeeper.h zookeeper_init(...) docs: * This method creates a new handle and a zookeeper session that corresponds * to that handle. Session establishment is asynchronous, meaning that the * session should not be considered established until (and unless) an * event of state ZOO_CONNECTED_STATE is received. Please enter a JIRA for this and we'll address it in the next release: https://issues.apache.org/jira/browse/ZOOKEEPER Thanks for the report! Patrick Steven Wong wrote: Using zkpython with ZK 3.2.1 release: import zookeeper as z zh = z.init(...) z.state(zh) # returns 3 == z.CONNECTED_STATE # kill standalone ZK server z.state(zh) # returns 0 == ??? The problem is that 0 is not a state defined by zookeeper.[ch]. I'm not sure whether 0 should've been defined or z.state should've returned something else. Steven
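Until the missing constant lands, caller code can map states defensively. A sketch: CONNECTED == 3 is confirmed in the thread above; the other values are what I believe zookeeper.h defines and should be treated as assumptions, and 0 (no named constant in 3.2.x) is reported as unknown rather than crashing caller logic:

```python
# State codes believed to match zookeeper.h (assumption, except 3);
# 0 shows up after the connection is torn down but has no constant.
KNOWN_STATES = {
    -112: "EXPIRED_SESSION",
    -113: "AUTH_FAILED",
    1: "CONNECTING",
    2: "ASSOCIATING",
    3: "CONNECTED",
}

def state_name(code):
    # Fall back to a labeled unknown instead of raising on codes like 0.
    return KNOWN_STATES.get(code, "UNKNOWN(%d)" % code)

print(state_name(3))   # CONNECTED
print(state_name(0))   # UNKNOWN(0)
```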
Re: C client (via zkpython) returns unknown state
There's no requirement currently that they be the same/similar. We try to keep them similar just to ease the learning curve for people that want to use both (also for the devs to stay sane). In a perfect world, probably. I think there's some divergence just due to the fact that Java is OO and C is not (like polluting namespaces and such). Version 4 will come at some point; we had thought to make some non-backward-compatible changes to the APIs in that release (like moving to long for some fields that are currently ints). Perhaps in that release we could address some of the more egregious examples. Patrick Steven Wong wrote: Java's KeeperState.Disconnected is 0, so probably that's what the C client should have. This brings up another question: Is the C client supposed to be in sync with the Java client? I notice that there are multiple differences between C's ZOO_*_STATE and Java's KeeperState. -Original Message- From: Patrick Hunt [mailto:ph...@apache.org] Sent: Tuesday, October 13, 2009 5:03 PM To: zookeeper-user@hadoop.apache.org Subject: Re: C client (via zkpython) returns unknown state You're right, 0 should be something like INITIALIZING_STATE but it's not in zookeeper.h zookeeper_init(...) docs: * This method creates a new handle and a zookeeper session that corresponds * to that handle. Session establishment is asynchronous, meaning that the * session should not be considered established until (and unless) an * event of state ZOO_CONNECTED_STATE is received. Please enter a JIRA for this and we'll address it in the next release: https://issues.apache.org/jira/browse/ZOOKEEPER Thanks for the report! Patrick Steven Wong wrote: Using zkpython with ZK 3.2.1 release: import zookeeper as z zh = z.init(...) z.state(zh) # returns 3 == z.CONNECTED_STATE # kill standalone ZK server z.state(zh) # returns 0 == ??? The problem is that 0 is not a state defined by zookeeper.[ch]. I'm not sure whether 0 should've been defined or z.state should've returned something else. Steven
Re: UnsupportedClassVersionError when building zkpython
I've seen this before. Either you have an old version of ant, or your JAVA_HOME is not set, or it's set incorrectly (to 1.5 and ant is built for 1.6, or vice versa). Patrick Henry Robinson wrote: Hi Steven - I also see that problem if I build on my Mac sometimes. I'm looking into a proper fix, but for now you can do: ant compile sudo python src/python/setup.py install to build and install manually. If you have a moment, can you let me know which ant you are using? (ant -version) Thanks for bringing this up! Henry On Mon, Oct 12, 2009 at 9:06 PM, Steven Wong sw...@netflix.com wrote: Any idea how I can get it to build? ZooKeeper 3.2.1 (tarball release) on Mac OS X 10.5.8. Thanks. sw...@lgmac-swong:~/lib/zookeeper/src/contrib/zkpython 9173 sudo ant install Buildfile: build.xml BUILD FAILED java.lang.UnsupportedClassVersionError: Bad version number in .class file at java.lang.ClassLoader.defineClass1(Native Method) at java.lang.ClassLoader.defineClass(ClassLoader.java:675) at org.apache.tools.ant.AntClassLoader.defineClassFromData(AntClassLoader.java:1146) at org.apache.tools.ant.AntClassLoader.getClassFromStream(AntClassLoader.java:1324) at org.apache.tools.ant.AntClassLoader.findClassInComponents(AntClassLoader.java:1388) at org.apache.tools.ant.AntClassLoader.findClass(AntClassLoader.java:1341) at org.apache.tools.ant.AntClassLoader.loadClass(AntClassLoader.java:1088) at java.lang.ClassLoader.loadClass(ClassLoader.java:251) at org.apache.tools.ant.taskdefs.Available.checkClass(Available.java:446) at org.apache.tools.ant.taskdefs.Available.eval(Available.java:273) at org.apache.tools.ant.taskdefs.Available.execute(Available.java:225) at org.apache.tools.ant.UnknownElement.execute(UnknownElement.java:288) at sun.reflect.GeneratedMethodAccessor1.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:585) at org.apache.tools.ant.dispatch.DispatchUtils.execute(DispatchUtils.java:106) at org.apache.tools.ant.Task.perform(Task.java:348) at org.apache.tools.ant.Target.execute(Target.java:357) at org.apache.tools.ant.helper.ProjectHelper2.parse(ProjectHelper2.java:142) at org.apache.tools.ant.ProjectHelper.configureProject(ProjectHelper.java:93) at org.apache.tools.ant.Main.runBuild(Main.java:743) at org.apache.tools.ant.Main.startAnt(Main.java:217) at org.apache.tools.ant.launch.Launcher.run(Launcher.java:257) at org.apache.tools.ant.launch.Launcher.main(Launcher.java:104) Total time: 0 seconds sw...@lgmac-swong:~/lib/zookeeper/src/contrib/zkpython 9178 sudo javac -version javac 1.6.0_07
Re: Struggling with a simple configuration file.
Take all of the server.# lines out, including server.1 (no other change necessary). For standalone you don't need/want this. Alternately you could use org.apache.zookeeper.server.ZooKeeperServerMain (I don't think you even need to change the config file if you do that). for example: java -cp build/zookeeper-3.3.0.jar:build/lib/log4j-1.2.15.jar:conf org.apache.zookeeper.server.ZooKeeperServerMain zoo_q1.cfg works for me with the scenario you describe. Patrick Leonard Cuff wrote: I've been developing for ZooKeeper for a couple months now, recently running in a test configuration with 3 ZooKeeper servers. I'm running 3.2.1 with no problems. Recently I tried to move to a single server configuration for the development team environment, but couldn't get the configuration to work. I get the error java.lang.RuntimeException: My id 0 not in the peer list This would seem to imply that the myid file is set to zero. But ...it's set to 1. What's puzzling to me is my original configuration of servers was this: server.1=ind104.an.dev.fastclick.net:2182:2183 --- The machine I'm trying to run standalone on. server.2=build101.an.dev.fastclick.net:2182:2183 server.3=cmedia101.an.dev.fastclick.net:2182:2183 I just removed the last two lines, and ran zkServer.sh start. It fails with the described log message. (Full log given below). When I put the server.2 and server.3 lines back in, it works fine, and is following the build101 machine. I decided to try changing the server.1 to server.0, also changed the myid file contents from 1 to zero.
I get a very different error scenario: A continuously-occurring Null Pointer exception: 2009-10-09 04:22:36,284 - WARN [QuorumPeer:/0.0.0.0:2181:quorump...@490] - Unexpected exception java.lang.NullPointerException at org.apache.zookeeper.server.quorum.FastLeaderElection.totalOrderPredicate(FastLeaderElection.java:466) at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:635) at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:488) I'm at a loss to know where I've gone astray. Thanks in advance for any and all help. Leonard --- the first log 2009-10-09 04:08:58,769 - INFO [main:quorumpeercon...@80] - Reading configuration from: /vcm/home/sandbox/ticket_161758-1/vcm/component/zookeeper/conf/zoo.cfg.dev 2009-10-09 04:08:58,795 - INFO [main:quorumpeerm...@118] - Starting quorum peer 2009-10-09 04:08:58,845 - FATAL [main:quorumpeerm...@86] - Unexpected exception, exiting abnormally java.lang.RuntimeException: My id 0 not in the peer list at org.apache.zookeeper.server.quorum.QuorumPeer.startLeaderElection(QuorumPeer.java:333) at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:314) at org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:137) at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:102) at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:75) -- the second log 2009-10-09 04:22:36,284 - WARN [QuorumPeer:/0.0.0.0:2181:quorump...@490] - Unexpected exception java.lang.NullPointerException at org.apache.zookeeper.server.quorum.FastLeaderElection.totalOrderPredicate(FastLeaderElection.java:466) at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:635) at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:488) 2009-10-09 04:22:36,285 - INFO [QuorumPeer:/0.0.0.0:2181:quorump...@487] - LOOKING 2009-10-09 04:22:36,285 - INFO [QuorumPeer:/0.0.0.0:2181:fastleaderelect...@579] - New election: 12 2009-10-09 04:22:36,285 - INFO [QuorumPeer:/0.0.0.0:2181:fastleaderelect...@618] - Notification: 0, 12, 43050, 0, LOOKING, LOOKING, 0 2009-10-09 04:22:36,285 - WARN [QuorumPeer:/0.0.0.0:2181:quorump...@490] - Unexpected exception java.lang.NullPointerException at org.apache.zookeeper.server.quorum.FastLeaderElection.totalOrderPredicate(FastLeaderElection.java:466) at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:635) at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:488) 2009-10-09 04:22:36,286 - INFO [QuorumPeer:/0.0.0.0:2181:quorump...@487] - LOOKING 2009-10-09 04:22:36,286 - INFO [QuorumPeer:/0.0.0.0:2181:fastleaderelect...@579] - New election: 12 2009-10-09 04:22:36,286 - INFO [QuorumPeer:/0.0.0.0:2181:fastleaderelect...@618] - Notification: 0, 12, 43051, 0, LOOKING, LOOKING, 0 2009-10-09 04:22:36,286 - WARN [QuorumPeer:/0.0.0.0:2181:quorump...@490] - Unexpected exception java.lang.NullPointerException at org.apache.zookeeper.server.quorum.FastLeaderElection.totalOrderPredicate(FastLeaderElection.java:466) at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:635) at
Re: feedback zkclient
You might want to add a link to zkclient on this page: http://wiki.apache.org/hadoop/ZooKeeper/UsefulTools Patrick Patrick Hunt wrote: Ted Dunning wrote: Judging by history and the fact that only 40/127 issues are resolved, 3.3 is probably 3-6 months away. Is that a fair assessment? Yes, that's fair. Patrick On Thu, Oct 1, 2009 at 11:13 AM, Patrick Hunt ph...@apache.org wrote: One nice thing about ephemeral is that the Stat contains the owner sessionid. As you say, it's highly implementation dependent. It's also something we recognize is a problem for users, we've slated it for 3.3.0 http://issues.apache.org/jira/browse/ZOOKEEPER-22
Re: feedback zkclient
I started looking a bit more closely at the source, some questions: 1) I tried generating the javadocs (see my fork of the project on github if you want my changes to build.xml for this) but it looks like there's pretty much no javadoc. Some information, particularly on semantics of user-exposed operations, would be useful (esp re my earlier README comment - some high level document describing the benefits, etc... of the library). If I'm your prototypical lazy developer (which I am :-) ), I'm really expecting some helpful docs to get me bootstrapped. 2) what purpose does ZkEventThread serve? 3) there's definitely an issue in the retryUntilConnected logic that you need to address: let's say you call zkclient.create, and the connection to the server is lost while the request is in flight. At this point ConnectionLoss is thrown on the client side, however you (the client) have no information on whether the server has made the change or not. The retry method's while loop will re-run the create (after reconnect), and the result seen by the caller (user code) could be either OK or a NODEEXISTS exception; there's no way to know which. Mahadev is working on ZOOKEEPER-22, which will address this issue, but that's a future version, not today. 4) when I saw that you had separated zkclient and zkconnection I thought ah, this is interesting, however when I saw the implementation I was confused: a) what purpose does this separation serve? b) I thought it was to allow multiple zkclients to share a single connection, however looking at zkclient.close, it closes the underlying connection. 5) there's a lot of wrapping of exceptions, looks like this is done in order to make them unchecked. Is this wise? How much simpler does it really make things? Esp things like InterruptedException? As you mentioned, one of your intents is to simplify things, but perhaps this is too simple?
Some short, clear examples of usage would be helpful here to compare/contrast. I took a very quick look at some of the tests but that didn't help much. Is there a test(s) in particular that I should look at to see how zkclient is used, and the benefits incurred? Regards, Patrick Patrick Hunt wrote: Hi Stefan, two suggestions off the bat: 1) fill in something in the README, doesn't have to be final or polished, but give some insight into the what/why/how/where/goals/etc... to get things moving quickly for reviewers and new users. 2) you should really discuss on the dev list. It's up to you to include user, but apache discourages use of user for development discussion (plus you'll pick up more developer insight there) Patrick Stefan Groschupf wrote: Hi Zookeeper developers, it would be great if you guys could give us some feedback about our project zkclient. http://github.com/joa23/zkclient The main idea is making the life of lazy developers that only want minimal zk functionality much easier. We have functionality like a zkclient mock making testing easy and fast without running a real zkserver, simple callback interfaces for the different event types, reconnect handling in case of timeout, etc. We feel we are getting closer to a release, so it would be great if some experts could have a look and give us some feedback. Thanks, Stefan ~~~ Hadoop training and consulting http://www.scaleunlimited.com http://www.101tec.com
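The retryUntilConnected hazard in point 3 can be sketched without ZooKeeper at all (all names here are hypothetical, not zkclient's actual API): after a ConnectionLoss the original create may or may not have been applied server-side, so a "node exists" error on retry is inherently ambiguous.

```python
# Hypothetical exception types standing in for the ZooKeeper errors.
class ConnectionLoss(Exception): pass
class NodeExists(Exception): pass
class AmbiguousResult(Exception): pass

def retry_until_connected(op):
    saw_loss = False
    while True:
        try:
            return op()
        except ConnectionLoss:
            saw_loss = True   # the in-flight request may have succeeded
        except NodeExists:
            if saw_loss:
                # Could be our own earlier attempt, or another client's
                # node; the caller must disambiguate somehow.
                raise AmbiguousResult("create raced with connection loss")
            raise

# Simulated flaky create: the first call loses the connection after the
# server applied it, so the retry sees the node already there.
calls = {"n": 0}
def flaky_create():
    calls["n"] += 1
    if calls["n"] == 1:
        raise ConnectionLoss()
    raise NodeExists()

try:
    retry_until_connected(flaky_create)
except AmbiguousResult as e:
    print("ambiguous:", e)
```

This is exactly the gap ZOOKEEPER-22 is meant to close; until then a retry wrapper can only surface the ambiguity, not resolve it.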
Re: feedback zkclient
Not to harp on this ;-) but this sounds like something that would be a very helpful addition to the README. Ted Dunning wrote: I think that another way to say this is that zkClient is going a bit for the Spring philosophy that if the caller can't (or won't) be handling the situation, then they shouldn't be forced to declare it. The Spring jdbcTemplate is a grand example of the benefits of this. First implementations of this policy generally are a bit too broad, though, so this should be examined carefully. On Thu, Oct 1, 2009 at 8:05 AM, Peter Voss i...@petervoss.org wrote: 5) there's a lot of wrapping of exceptions, looks like this is done in order to make them unchecked. Is this wise? How much simpler does it really make things? Esp things like interrupted exception? As you mentioned, one of your intents is to simplify things, but perhaps too simple? Some short, clear examples of usage would be helpful here to compare/contrast, I took a very quick look at some of the tests but that didn't help much. Is there a test(s) in particular that I should look at to see how zkclient is used, and the benefits incurred? Checked exceptions are very painful when you are assembling together a larger number of libraries (which is true for most enterprise applications). Either you wind up having a general throws Exception (which I don't really like, because it's too general) at most of your interfaces, or you have to wrap checked exceptions into runtime exceptions. We didn't want a library to introduce yet another checked exception that you MUST catch or rethrow. I know that there are different opinions about that, but that's the idea behind this. Similar situation for the InterruptedException. ZkClient also converts this to a runtime exception and makes sure that the interrupted flag doesn't get cleared. 
There are just too many existing libraries that have a catch (Exception e) somewhere that totally ignores that this would reset the interrupt flag, if e is an InterruptedException. Therefore we better avoid having all of the methods throwing that exception.
Re: feedback zkclient
Ted Dunning wrote: You may be able to tell if the file is yours by examining the content and ownership, but this is pretty implementation dependent. In particular, it makes queues very difficult to implement correctly. If this happens during the creation of an ephemeral file, the only option may be to close the connection (thus deleting all ephemeral files) and start over. One nice thing about ephemeral is that the Stat contains the owner sessionid. As you say, it's highly implementation dependent. It's also something we recognize is a problem for users, we've slated it for 3.3.0 http://issues.apache.org/jira/browse/ZOOKEEPER-22 Patrick On Thu, Oct 1, 2009 at 8:05 AM, Peter Voss i...@petervoss.org wrote: 3) there's definitely an issue in the retryUntilConnected logic that you need to address let's say you call zkclient.create, and the connection to the server is lost while the request is in flight. At this point ConnectionLoss is thrown on the client side, however you (client) have no information on whether the server has made the change or not. The retry method's while loop will re-run the create (after reconnect), and the result seen by the caller (user code) could be either OK or may be NODEEXISTS exception, there's no way to know which. Mahadev is working on ZOOKEEPER-22 which will address this issue, but that's a future version, not today. Good catch. I wasn't aware that nodes could still have been created when receiving a ConnectionLoss. But how would you deal with that? If we create a znode and get a ConnectionLoss exception, then wait until the connection is back and check if the znode is there. There is no way of knowing whether it was us who created the node or somebody else, right?
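For the ephemeral case Patrick mentions, the Stat's ephemeralOwner field does answer Peter's question. A tiny illustration (a plain dict stands in for the Stat object; the session ids are made up):

```python
def created_by_me(stat, my_session_id):
    # ephemeralOwner is the session id of the session that created an
    # ephemeral znode, and 0 for regular (non-ephemeral) znodes.
    owner = stat["ephemeralOwner"]
    return owner != 0 and owner == my_session_id

# After a ConnectionLoss on an ephemeral create, stat the node and
# compare its owner against our own session id:
stat = {"ephemeralOwner": 0x123de5b3b1b}
print(created_by_me(stat, 0x123de5b3b1b))   # True: our earlier create won
print(created_by_me(stat, 0xDEADBEEF))      # False: someone else's node
```

For regular (non-ephemeral) znodes no such owner field exists, which is why the thread falls back on content/ownership heuristics.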
Re: feedback zkclient
Ted Dunning wrote: Judging by history and the fact that only 40/127 issues are resolved, 3.3 is probably 3-6 months away. Is that a fair assessment? Yes, that's fair. Patrick On Thu, Oct 1, 2009 at 11:13 AM, Patrick Hunt ph...@apache.org wrote: One nice thing about ephemeral is that the Stat contains the owner sessionid. As you say, it's highly implementation dependent. It's also something we recognize is a problem for users, we've slated it for 3.3.0 http://issues.apache.org/jira/browse/ZOOKEEPER-22
Re: How do we find the Server the client is connected to?
That detail is purposefully not exposed through the client api, however it is output to the log on connection establishment. Why would your client code need to know which server in the ensemble it is connected to? Patrick Rob Baccus wrote: How do I determine the server the client is connected to? It is not exposed as far as I can see in either the ZooKeeper object or the ClientCnxn object. I did find on line 790 in the ClientCnxn.StartConnect() method the place the actual server connection happens, but that is not exposed. Rob Baccus 425-201-3812
Re: How do we find the Server the client is connected to?
It's possible, but not pretty. Try this: 1) create a subclass of ZooKeeper to be used in your tests 2) in the subclass add something like this:

public String getConnectedServer() {
    return ((SocketChannel) cnxn.sendThread.sockKey.channel())
        .socket().getInetAddress().toString();
}

Feel free to add a JIRA, I think we could make this a protected method on ZooKeeper to make testing easier (and not expose internals). Regards, Patrick Todd Greenwood wrote: Failover testing. -Original Message- From: Patrick Hunt [mailto:ph...@apache.org] Sent: Thursday, October 01, 2009 3:44 PM To: zookeeper-user@hadoop.apache.org; Rob Baccus Subject: Re: How do we find the Server the client is connected to? That detail is purposefully not exposed through the client api, however it is output to the log on connection establishment. Why would your client code need to know which server in the ensemble it is connected to? Patrick Rob Baccus wrote: How do I determine the server the client is connected to? It is not exposed as far as I can see in either the ZooKeeper object or the ClientCnxn object. I did find on line 790 in the ClientCnxn.StartConnect() method the place the actual server connection happens, but that is not exposed. Rob Baccus 425-201-3812
Re: How do we find the Server the client is connected to?
Possible, but very ugly. I do something similar to this in the zk tests: org.apache.zookeeper.server.quorum.QuorumPeerMainTest.testBadPeerAddressInQuorum() if you want to see an example. Patrick Ted Dunning wrote: Grovel the logs. On Thu, Oct 1, 2009 at 3:46 PM, Todd Greenwood to...@audiencescience.com wrote: Failover testing. -Original Message- From: Patrick Hunt [mailto:ph...@apache.org] Sent: Thursday, October 01, 2009 3:44 PM To: zookeeper-user@hadoop.apache.org; Rob Baccus Subject: Re: How do we find the Server the client is connected to? That detail is purposefully not exposed through the client api, however it is output to the log on connection establishment. Why would your client code need to know which server in the ensemble it is connected to? Patrick Rob Baccus wrote: How do I determine the server the client is connected to? It is not exposed as far as I can see in either the ZooKeeper object or the ClientCnxn object. I did find on line 790 in the ClientCnxn.StartConnect() method the place the actual server connection happens, but that is not exposed. Rob Baccus 425-201-3812
Re: problem starting ensemble mode
Hi Hector, looks like a connectivity issue to me: NoRouteToHostException. 3888 is the election port, 2888 is the quorum port. Basically, the ensemble uses the election port for leader election. Once a leader is elected it then uses the quorum port for subsequent communication. Could it be a firewall issue? Your configs/logs look ok to me otherwise. Try using something like telnet to verify connectivity on ports 3888 and 2888 between the two servers. Patrick Hector Yuen wrote: Hi all, I am trying to start zookeeper on two nodes, the configuration file I have is tickTime=2000 initLimit=10 syncLimit=5 dataDir=/var/zookeeper clientPort=2181 server.1=hec-bp1:2888:3888 server.2=hec-bp2:2888:3888 I also have a /var/zookeeper/myid file on each of the machines; the files contain 1 and 2 respectively. When I start, I get the following: Starting zookeeper ... STARTED hec...@hec-bp2:/zookeeper$ 2009-10-01 15:48:15,786 - INFO [main:quorumpeercon...@80] - Reading configuration from: /zookeeper/bin/../conf/zoo.cfg 2009-10-01 15:48:15,882 - INFO [main:quorumpeercon...@232] - Defaulting to majority quorums 2009-10-01 15:48:15,899 - INFO [main:quorumpeerm...@118] - Starting quorum peer 2009-10-01 15:48:15,943 - INFO [Thread-1:quorumcnxmanager$liste...@409] - My election bind port: 3888 2009-10-01 15:48:15,961 - INFO [QuorumPeer:/0:0:0:0:0:0:0:0:2181:quorump...@487] - LOOKING 2009-10-01 15:48:15,963 - INFO [QuorumPeer:/0:0:0:0:0:0:0:0:2181:fastleaderelect...@579] - New election: -1 2009-10-01 15:48:15,978 - WARN [WorkerSender Thread:quorumcnxmana...@336] - Cannot open channel to 1 at election address hec-bp1.admin.nimblestorage.com/10.12.6.192:3888 java.net.NoRouteToHostException: No route to host at sun.nio.ch.Net.connect(Native Method) at sun.nio.ch.SocketChannelImpl.connect(Unknown Source) at java.nio.channels.SocketChannel.open(Unknown Source) at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:323) at
org.apache.zookeeper.server.quorum.QuorumCnxManager.toSend(QuorumCnxManager.java:302) at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.process(FastLeaderElection.java:323) at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.run(FastLeaderElection.java:296) at java.lang.Thread.run(Unknown Source) 2009-10-01 15:48:15,981 - INFO [QuorumPeer:/0:0:0:0:0:0:0:0:2181:fastleaderelect...@618] - Notification: 2, -1, 1, 2, LOOKING, LOOKING, 2 2009-10-01 15:48:15,981 - INFO [QuorumPeer:/0:0:0:0:0:0:0:0:2181:fastleaderelect...@642] - Adding vote 2009-10-01 15:48:16,184 - WARN [QuorumPeer:/0:0:0:0:0:0:0:0:2181:quorumcnxmana...@336] - Cannot open channel to 1 at election address hec-bp1.admin.nimblestorage.com/10.12.6.192:3888 I can expect these kind of messages when the other server hasn't been started, but even after a while keeps sending these messages. I can ping and ssh between the machines. I noticed that just port 3888 is listening when I do netstat -an, why is port 2888 not being used? Any ideas? Thanks -h
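The telnet check suggested above can also be scripted; a small stdlib probe along these lines (hostnames/ports in the comment are the ones from this thread, used purely as an example):

```python
import socket

def can_connect(host, port, timeout=2.0):
    # Rough equivalent of "telnet host port": returns True only if
    # something is listening and reachable. A firewall typically shows
    # up as a timeout or "no route to host", as in the log above.
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# e.g. from hec-bp2, check both ports toward the peer:
#   can_connect("hec-bp1", 3888)   # election port
#   can_connect("hec-bp1", 2888)   # quorum port
```

Note that 2888 only starts listening on the elected leader once election completes, which is why only 3888 appears in netstat while the ensemble is still LOOKING.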
Re: feedback zkclient
Hi Stefan, two suggestions off the bat: 1) fill in something in the README, doesn't have to be final or polished, but give some insight into the what/why/how/where/goals/etc... to get things moving quickly for reviewers and new users. 2) you should really discuss on the dev list. It's up to you to include user, but apache discourages use of user for development discussion (plus you'll pick up more developer insight there) Patrick Stefan Groschupf wrote: Hi Zookeeper developers, it would be great if you guys could give us some feedback about our project zkclient. http://github.com/joa23/zkclient The main idea is making the life of lazy developers that only want minimal zk functionality much easier. We have functionality like a zkclient mock making testing easy and fast without running a real zkserver, simple callback interfaces for the different event types, reconnect handling in case of timeout, etc. We feel we are getting closer to a release, so it would be great if some experts could have a look and give us some feedback. Thanks, Stefan ~~~ Hadoop training and consulting http://www.scaleunlimited.com http://www.101tec.com
Re: The idea behind 'myid'
Not sure if you'll find this interesting, but my zk configuration generator is available on github: http://github.com/phunt/zkconf zkconf.py will generate all of the configuration needed to run a ZooKeeper ensemble. I mainly use this tool for localhost based testing, but it can generate configurations for any list of servers (see the --server option). Patrick Eric Bowman wrote: Benjamin Reed wrote: you and ted are correct. the id gives zookeeper a stable identifier to use even if the ip address changes. if the ip address doesn't change, we could use that, but we didn't want to make that a built-in assumption. if you really do have a rock solid ip address, you could make a wrapper startup script that starts up and creates the myid file based on the ip address. i gotta say though, i've found that such assumptions are often found to be invalid. Yeah, it can be tricky. In more than one cluster, I've seen a set of static configuration files that gets replicated everywhere. If an individual instance needs per-instance configuration, we do that from the command line (using -D). Maybe logic can do it, or maybe a start script has to load a machine local file, whatever. It's a pretty common paradigm, though. It's hardly the end of the world, but it is definitely something my ops people stumbled over.
Re: ACL question w/ Zookeeper 3.1.1
- this = {org.apache.zookeeper.zookee...@1267} watchManager = {org.apache.zookeeper.zookeeper$zkwatchmana...@1379} state = {org.apache.zookeeper.zookeeper$sta...@1453}CLOSED cnxn = {org.apache.zookeeper.clientc...@1381}sessionId: 0x123de5b3b1b\nlastZxid: 1\nxid: 3\nnextAddrToTry: 0\nserverAddrs: /127.0.0.1:2181\n serverAddrs = {java.util.arrayl...@1386} size = 1 authInfo = {java.util.arrayl...@1387} size = 1 [0] = {org.apache.zookeeper.clientcnxn$authd...@1398} scheme = {java.lang.str...@1244}digest data = {byte[...@1399} pendingQueue = {java.util.linkedl...@1388} size = 0 outgoingQueue = {java.util.linkedl...@1389} size = 0 nextAddrToTry = 0 connectTimeout = 4 readTimeout = 2 sessionTimeout = 5 zooKeeper = {org.apache.zookeeper.zookee...@1267} watcher = {org.apache.zookeeper.zookeeper$zkwatchmana...@1379} sessionId = 82153701637816320 sessionPasswd = {byte[...@1390} sendThread = {org.apache.zookeeper.clientcnxn$sendthr...@1259}Thread[main-SendThread ,5,] eventThread = {org.apache.zookeeper.clientcnxn$eventthr...@1266}Thread[main-EventThre ad,5,main] selector = {sun.nio.ch.epollselectori...@1391} closing = false eventOfDeath = {java.lang.obj...@1392} lastZxid = 1 xid = 3 response = {org.apache.zookeeper.proto.createrespo...@1365}\n r = {org.apache.zookeeper.proto.replyhea...@1445}0,0,-112\n request = {org.apache.zookeeper.proto.createrequ...@1360}'/ACLTest,,v{s{31,s{'aut h,'}}},0\n path = {java.lang.str...@1314}/ACLTest data = {byte...@1339} acl = {java.util.arrayl...@1242} size = 1 flags = 0 path = {java.lang.str...@1314}/ACLTest h = {org.apache.zookeeper.proto.requesthea...@1352}2,1\n cnxn = {org.apache.zookeeper.clientc...@1381}sessionId: 0x123de5b3b1b\nlastZxid: 1\nxid: 3\nnextAddrToTry: 0\nserverAddrs: /127.0.0.1:2181\n -- v5 NOTE: If I use Ids.OPEN_ACL_UNSAFE, then everything works fine. Here's an example of the debug state after a create()... 
-- this = {org.apache.zookeeper.zookee...@1266} watchManager = {org.apache.zookeeper.zookeeper$zkwatchmana...@1397} state = {org.apache.zookeeper.zookeeper$sta...@1398}CONNECTED cnxn = {org.apache.zookeeper.clientc...@1374}sessionId: 0x123de6ba8de\nlastZxid: 2\nxid: 3\nnextAddrToTry: 0\nserverAddrs: /127.0.0.1:2181\n serverAddrs = {java.util.arrayl...@1403} size = 1 authInfo = {java.util.arrayl...@1404} size = 1 [0] = {org.apache.zookeeper.clientcnxn$authd...@1415} scheme = {java.lang.str...@1244}digest data = {byte[...@1416} pendingQueue = {java.util.linkedl...@1405} size = 0 outgoingQueue = {java.util.linkedl...@1406} size = 0 nextAddrToTry = 0 connectTimeout = 4 readTimeout = 2 sessionTimeout = 5 zooKeeper = {org.apache.zookeeper.zookee...@1266} watcher = {org.apache.zookeeper.zookeeper$zkwatchmana...@1397} sessionId = 82153772198789120 sessionPasswd = {byte[...@1407} sendThread = {org.apache.zookeeper.clientcnxn$sendthr...@1259}Thread[main-SendThread ,5,main] eventThread = {org.apache.zookeeper.clientcnxn$eventthr...@1265}Thread[main-EventThre ad,5,main] selector = {sun.nio.ch.epollselectori...@1408} closing = false eventOfDeath = {java.lang.obj...@1409} lastZxid = 2 xid = 3 response = {org.apache.zookeeper.proto.createrespo...@1360}'/ACLTest\n r = {org.apache.zookeeper.proto.replyhea...@1389}2,2,0\n xid = 2 zxid = 2 err = 0 request = {org.apache.zookeeper.proto.createrequ...@1355}'/ACLTest,,v{s{15,s{'wor ld,'anyone}}},0\n path = {java.lang.str...@1314}/ACLTest h = {org.apache.zookeeper.proto.requesthea...@1347}2,1\n cnxn = {org.apache.zookeeper.clientc...@1374}sessionId: 0x123de6ba8de\nlastZxid: 2\nxid: 3\nnextAddrToTry: 0\nserverAddrs: /127.0.0.1:2181\n -Original Message- From: Todd Greenwood [mailto:to...@audiencescience.com] Sent: Friday, September 18, 2009 11:27 AM To: Patrick Hunt; zookeeper-...@hadoop.apache.org; zookeeper- u...@hadoop.apache.org Subject: RE: ACL question w/ Zookeeper 3.1.1 Patrick / Mahadev, Thanks for the heads-up! 
Apparently I *am* receiving email from zookeeper-user but it is being filtered out as spam. This just started happening, but I'll rectify on my end. I'm working thru Mahadev's response and will respond shortly (and search for other postings, as well). Apologies for the cross post. -Todd -Original Message- From: Patrick Hunt [mailto:ph...@apache.org] Sent: Friday, September 18, 2009 11:19 AM To: zookeeper-...@hadoop.apache.org; zookeeper-user@hadoop.apache.org Cc: Todd Greenwood Subject: Re: ACL question w/ Zookeeper 3.1.1 Todd, there were other responses as well. Are you seeing other traffic from the lists? (perhaps a spam filtering issue?) Patrick Mahadev Konar wrote: HI todd, We did respond on zookeeper-user. Here is my response in case you didn't see it... HI todd, From what I understand, you are saying that a creator_all_acl does not work with auth? I tried the following with CREATOR_ALL_ACL and it seemed to work for me... import org.apache.zookeeper.CreateMode; import
Re: ACL question w/ Zookeeper 3.1.1
Todd Greenwood wrote: Patrick, Thanks, I'll spend some more time trying to create a more concise repro, and log a bug once I do. The only reason I posted this mash was to see if the replyHeader error, 0,0,-112, made sense of the ACL exception. The rest is just context...and clearly too much of that :o). I don't see a difference between v3 and v4...The only differences that I can see are between v4 and v5 (v4 fails and v5 succeeds): I did see this diff btw 3/4, 3 has this: request = {org.apache.zookeeper.proto.createrequ...@1360}'/ACLTest,,v{},0\n you don't have any acl specified for the node create, or is this supposed to be a working example w/o auth? (like I said, I'm confused...) v4: response = {org.apache.zookeeper.proto.createrespo...@1365}\n r = {org.apache.zookeeper.proto.replyhea...@1445}0,0,-112\n -112 return code is session expired, not auth failure. according to this your client's session expired, but w/o more info (code/log or idea of what your test is doing) I can't really speculate why you are getting this (old client session that was not shutdown correctly and finally expired while running a different/new test?) Patrick v5: response = {org.apache.zookeeper.proto.createrespo...@1360}'/ACLTest\n r = {org.apache.zookeeper.proto.replyhea...@1389}2,2,0\n -Todd -Original Message- From: Patrick Hunt [mailto:ph...@apache.org] Sent: Monday, September 21, 2009 4:14 PM To: zookeeper-user@hadoop.apache.org; Todd Greenwood Subject: Re: ACL question w/ Zookeeper 3.1.1 Todd, I spent some time looking at your output and honestly I'm having trouble making sense of what you are saying. What's the diff btw v3 v4? I'm afraid there are too many variables, can you help nail things down? 1) create a jira for this https://issues.apache.org/jira/browse/ZOOKEEPER 2) if at all possible attach the code you are running that has problems, seems like you've boiled it down to a case where it is deterministic, this would be the best for us to debug.
If you can't attach the code then include snippets - in particular the addAuthInfo call (w/parameter details) for your clients, and the individual create calls, including the acl specifics - and describe what your client(s) are doing in detail so that we can attempt to reproduce. 3) attach a trace level log from both the server and client during your test run, point out the time index when you see the auth failure. btw, you might try doing a getACL(path...) just before the operation that's failing - it will give you some insight into what the acl is set to for that node. Patrick Todd Greenwood wrote: Patrick / Mahadev, I've spent the last couple of days attempting to isolate this issue, and this is what I've come up with... Mahadev's simple use case works fine, as posted. However, my more involved use cases are consistently failing w/ InvalidACL exceptions when I use digest authentication with Ids.CREATOR_ALL_ACL: java.lang.Exception: com.audiencescience.util.zookeeper.wrapper.ZooWrapperException: org.apache.zookeeper.KeeperException$InvalidACLException: KeeperErrorCode = InvalidACL for /ACLTest Prior to throwing this exception, the response is (Zookeeper.java:create()): r = {org.apache.zookeeper.proto.replyhea...@1445}0,0,-112\n mailto:{org.apache.zookeeper.proto.replyhea...@1445} . More debug data below. So, while I can get Mahadev's simple example to work, I cannot get a more involved use case to work correctly. However, if I change my code to use Ids.OPEN_ACL_UNSAFE, then everything works fine. Example debug output below at v5. Could someone point me at non-trivial test cases for ACLs, and perhaps give me some insight into how to debug this issue further? 
-Todd --- Code Snippet ZooKeeper.java --- public String create(String path, byte data[], List<ACL> acl, CreateMode createMode) throws KeeperException, InterruptedException { validatePath(path); RequestHeader h = new RequestHeader(); h.setType(ZooDefs.OpCode.create); CreateRequest request = new CreateRequest(); CreateResponse response = new CreateResponse(); request.setData(data); request.setFlags(createMode.toFlag()); request.setPath(path); if (acl != null && acl.size() == 0) { throw new KeeperException.InvalidACLException(); } request.setAcl(acl); ReplyHeader r = cnxn.submitRequest(h, request, response, null); v3 v5 if (r.getErr() != 0) { v4 throw KeeperException.create(KeeperException.Code.get(r.getErr()), path); } return response.getPath(); } - v3 - this = {org.apache.zookeeper.zookee...@1267} watchManager = {org.apache.zookeeper.zookeeper$zkwatchmana...@1379} state
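For readers following the digest-ACL discussion in this thread, the key detail is what identity the server stores in a node created with CREATOR_ALL_ACL under the digest scheme: `user:base64(SHA-1(user:password))`. The sketch below computes that string in isolation; it mirrors what ZooKeeper's DigestAuthenticationProvider does but is an illustration, not the server code itself.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

// Sketch of the "digest" scheme identity: user:base64(sha1(user:password)).
// A znode created with CREATOR_ALL_ACL stores this string, so a later
// session must call addAuthInfo("digest", "user:password".getBytes())
// with the *same* credentials to pass the ACL check.
public class DigestId {
    static String generateDigest(String idPassword) throws NoSuchAlgorithmException {
        String[] parts = idPassword.split(":", 2);
        byte[] sha1 = MessageDigest.getInstance("SHA1")
                .digest(idPassword.getBytes(StandardCharsets.UTF_8));
        return parts[0] + ":" + Base64.getEncoder().encodeToString(sha1);
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        // Hypothetical credentials, for illustration only.
        System.out.println(generateDigest("todd:secret"));
    }
}
```

A mismatch between the credentials used at create time and those supplied by a later session is one common cause of auth-related failures on CREATOR_ALL_ACL nodes, though note Patrick's point above that -112 specifically means session expiration, not an auth failure.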
Re: zookeeper on ec2
What is your client timeout? It may be too low. also see this section on handling recoverable errors: http://wiki.apache.org/hadoop/ZooKeeper/ErrorHandling connection loss in particular needs special care since: When a ZooKeeper client loses a connection to the ZooKeeper server there may be some requests in flight; we don't know where they were in their flight at the time of the connection loss. Patrick Satish Bhatti wrote: I have recently started running on EC2 and am seeing quite a few ConnectionLoss exceptions. Should I just catch these and retry? Since I assume that eventually, if the shit truly hits the fan, I will get a SessionExpired? Satish On Mon, Jul 6, 2009 at 11:35 AM, Ted Dunning ted.dunn...@gmail.com wrote: We have used EC2 quite a bit for ZK. The basic lessons that I have learned include: a) EC2's biggest advantage after scaling and elasticity was conformity of configuration. Since you are bringing machines up and down all the time, they begin to act more like programs and you wind up with boot scripts that give you a very predictable environment. Nice. b) EC2 interconnect has a lot more going on than in a dedicated VLAN. That can make the ZK servers appear a bit less connected. You have to plan for ConnectionLoss events. c) for highest reliability, I switched to large instances. On reflection, I think that was helpful, but less important than I thought at the time. d) increasing and decreasing cluster size is nearly painless and is easily scriptable. To decrease, do a rolling update on the survivors to update their configuration. Then take down the instance you want to lose. To increase, do a rolling update starting with the new instances to update the configuration to include all of the machines. The rolling update should bounce each ZK with several seconds between each bounce. 
Rescaling the cluster takes less than a minute which makes it comparable to EC2 instance boot time (about 30 seconds for the Alestic ubuntu instance that we used plus about 20 seconds for additional configuration). On Mon, Jul 6, 2009 at 4:45 AM, David Graf david.g...@28msec.com wrote: Hello I wanna set up a zookeeper ensemble on amazon's ec2 service. In my system, zookeeper is used to run a locking service and to generate unique ids. Currently, for testing purposes, I am only running one instance. Now, I need to set up an ensemble to protect my system against crashes. The ec2 service has some differences to a normal server farm. E.g. the data saved on the file system of an ec2 instance is lost if the instance crashes. In the documentation of zookeeper, I have read that zookeeper saves snapshots of the in-memory data in the file system. Is that needed for recovery? Logically, it would be much easier for me if this is not the case. Additionally, ec2 brings the advantage that servers can be switched on and off dynamically dependent on the load, traffic, etc. Can this advantage be utilized for a zookeeper ensemble? Is it possible to add a zookeeper server dynamically to an ensemble? E.g. dependent on the in-memory load? David
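Ted's "plan for ConnectionLoss events" advice usually translates into a retry wrapper around each operation: retry on connection loss (the in-flight request may or may not have been applied), but treat session expiration as fatal for the current handle. A minimal self-contained sketch follows; the `ZkOperation` interface and the two exception classes are stand-ins for the real client's KeeperException subtypes, not the actual API.

```java
// Sketch of retry-on-ConnectionLoss logic. The exception types here are
// stand-ins for KeeperException.ConnectionLossException and
// KeeperException.SessionExpiredException in the real Java client.
public class RetrySketch {
    static class ConnectionLossException extends Exception {}
    static class SessionExpiredException extends Exception {}

    interface ZkOperation<T> {
        T run() throws ConnectionLossException, SessionExpiredException;
    }

    // Retry transient connection loss with a simple linear backoff;
    // session expiration propagates to the caller, who must build a
    // new handle and re-establish ephemeral state.
    static <T> T withRetries(ZkOperation<T> op, int maxRetries)
            throws ConnectionLossException, SessionExpiredException, InterruptedException {
        for (int attempt = 0; ; attempt++) {
            try {
                return op.run();
            } catch (ConnectionLossException e) {
                if (attempt >= maxRetries) throw e;
                Thread.sleep(100L * (attempt + 1)); // back off before retrying
            }
        }
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated operation: fails twice with connection loss, then succeeds.
        String result = withRetries(() -> {
            if (calls[0]++ < 2) throw new ConnectionLossException();
            return "ok";
        }, 5);
        System.out.println(result + " after " + calls[0] + " calls"); // ok after 3 calls
    }
}
```

One caveat worth keeping in a comment in real code: because the original request may already have been applied before the connection dropped, retried creates must tolerate ZNODEEXISTS, which connects this thread to the ephemeral-node discussion below.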
Re: zookeeper on ec2
I'm not very familiar with ec2 environment, are you doing any monitoring? In particular network connectivity btw nodes? Sounds like networking issues btw nodes (I'm assuming you've also looked at stuff like this http://wiki.apache.org/hadoop/ZooKeeper/Troubleshooting and verified that you are not swapping (see gc pressure), etc...) Patrick Satish Bhatti wrote: Session timeout is 30 seconds. On Tue, Sep 1, 2009 at 4:26 PM, Patrick Hunt ph...@apache.org wrote: What is your client timeout? It may be too low. also see this section on handling recoverable errors: http://wiki.apache.org/hadoop/ZooKeeper/ErrorHandling connection loss in particular needs special care since: When a ZooKeeper client loses a connection to the ZooKeeper server there may be some requests in flight; we don't know where they were in their flight at the time of the connection loss. Patrick Satish Bhatti wrote: I have recently started running on EC2 and am seeing quite a few ConnectionLoss exceptions. Should I just catch these and retry? Since I assume that eventually, if the shit truly hits the fan, I will get a SessionExpired? Satish On Mon, Jul 6, 2009 at 11:35 AM, Ted Dunning ted.dunn...@gmail.com wrote: We have used EC2 quite a bit for ZK. The basic lessons that I have learned include: a) EC2's biggest advantage after scaling and elasticity was conformity of configuration. Since you are bringing machines up and down all the time, they begin to act more like programs and you wind up with boot scripts that give you a very predictable environment. Nice. b) EC2 interconnect has a lot more going on than in a dedicated VLAN. That can make the ZK servers appear a bit less connected. You have to plan for ConnectionLoss events. c) for highest reliability, I switched to large instances. On reflection, I think that was helpful, but less important than I thought at the time. d) increasing and decreasing cluster size is nearly painless and is easily scriptable. 
To decrease, do a rolling update on the survivors to update their configuration. Then take down the instance you want to lose. To increase, do a rolling update starting with the new instances to update the configuration to include all of the machines. The rolling update should bounce each ZK with several seconds between each bounce. Rescaling the cluster takes less than a minute which makes it comparable to EC2 instance boot time (about 30 seconds for the Alestic ubuntu instance that we used plus about 20 seconds for additional configuration).
Re: zookeeper on ec2
Depends on what your tests are. Are they pretty simple/light? then probably network issue. Heavy load testing? then might be the server/client, might be the network. easiest thing is to run a ping test while running your zk test and see if pings are getting through (and latency). You should also review your client/server logs for any information during the CLoss. Ted Dunning would be a good resource - he runs ZK inside ec2 and has a lot of experience with it. Patrick Satish Bhatti wrote: For my initial testing I am running with a single ZooKeeper server, i.e. the ensemble only has one server. Not sure if this is exacerbating the problem? I will check out the troubleshooting link you sent me. On Tue, Sep 1, 2009 at 5:01 PM, Patrick Hunt ph...@apache.org wrote: I'm not very familiar with ec2 environment, are you doing any monitoring? In particular network connectivity btw nodes? Sounds like networking issues btw nodes (I'm assuming you've also looked at stuff like this http://wiki.apache.org/hadoop/ZooKeeper/Troubleshooting and verified that you are not swapping (see gc pressure), etc...) Patrick Satish Bhatti wrote: Session timeout is 30 seconds. On Tue, Sep 1, 2009 at 4:26 PM, Patrick Hunt ph...@apache.org wrote: What is your client timeout? It may be too low. also see this section on handling recoverable errors: http://wiki.apache.org/hadoop/ZooKeeper/ErrorHandling connection loss in particular needs special care since: When a ZooKeeper client loses a connection to the ZooKeeper server there may be some requests in flight; we don't know where they were in their flight at the time of the connection loss. Patrick Satish Bhatti wrote: I have recently started running on EC2 and am seeing quite a few ConnectionLoss exceptions. Should I just catch these and retry? Since I assume that eventually, if the shit truly hits the fan, I will get a SessionExpired?
Satish On Mon, Jul 6, 2009 at 11:35 AM, Ted Dunning ted.dunn...@gmail.com wrote: We have used EC2 quite a bit for ZK. The basic lessons that I have learned include: a) EC2's biggest advantage after scaling and elasticity was conformity of configuration. Since you are bringing machines up and down all the time, they begin to act more like programs and you wind up with boot scripts that give you a very predictable environment. Nice. b) EC2 interconnect has a lot more going on than in a dedicated VLAN. That can make the ZK servers appear a bit less connected. You have to plan for ConnectionLoss events. c) for highest reliability, I switched to large instances. On reflection, I think that was helpful, but less important than I thought at the time. d) increasing and decreasing cluster size is nearly painless and is easily scriptable. To decrease, do a rolling update on the survivors to update their configuration. Then take down the instance you want to lose. To increase, do a rolling update starting with the new instances to update the configuration to include all of the machines. The rolling update should bounce each ZK with several seconds between each bounce. Rescaling the cluster takes less than a minute which makes it comparable to EC2 instance boot time (about 30 seconds for the Alestic ubuntu instance that we used plus about 20 seconds for additional configuration).
Re: Creating ephemeral nodes: First time returns ZNODEEXISTS
Hi Leonard, Between 00:43:23,035 and 00:43:23,157 I see client session 0x123730dbe6e0001 get 15 node exists exceptions in a row. Are you expecting this? (i.e. are you attempting to create this node 15 times in a row or is this unexpected? I can't tell from the client snippet you included) Are you using the multithreaded version of the client or the single threaded version? It would be useful if you could create a JIRA and attach the following: https://issues.apache.org/jira/browse/ZOOKEEPER 1) create the jira and attach the client and server logs 2) attach zoo.cfg file 3) if you can attach a copy of your client code that would be interesting as well. In your email we see the create, however it would be interesting to see the session creation call you are doing and how the client is initialized. I would also suggest that you examine the cli_mt or cli_st client that's included with the release (build cli.c in src/c/src). Use this client to connect to your server and examine the node that your create_ephemeral_znode is operating on. For example try a get (create, etc...) call and check the stat information (if ephemeralOwner is zero then it's not an ephemeral node for example). You can also take a look at the code for cli.c (or the tests) for some examples of how to initialize/run a mt/st client process and compare it to your own. Btw, we suggest that users use the multi-threaded client whenever possible, unless ST is specifically required. Patrick Leonard Cuff wrote: I've just upgraded to ZooKeeper 3.2.0, testing the fix for ephemeral nodes disappearing correctly when the client process dies. I'm seeing some unexpected behavior: The first time I call zoo_create() to create an ephemeral node, it returns ZNODEEXISTS, even if the node didn't previously exist. It successfully creates the znode, and the znode is then available to write to. If I delete the node and call zoo_create() a second time, it correctly returns ZOK. I'm wondering if anyone else is seeing the same thing.
I've spent quite a bit of time checking my client code, and I'm pretty convinced this is happening as described. I see lots of exceptions in the log file. I'll append my client code, and the log file. Thanks for any and all input. Leonard create_ephemeral_znode ( char *znode ) { int result; result = zoo_create ( zh, znode, NULL, 0, &ZOO_OPEN_ACL_UNSAFE, ZOO_EPHEMERAL, NULL, 0 ); if ( result == ZOK ) { vcm_log_info ( logger, "ephemeral status node '%s' created successfully", znode ); return SUCCESS; } if ( result == ZNODEEXISTS ) { vcm_log_debug ( logger, "ephemeral status node '%s' already exists. Will delete and recreate", znode ); result = zoo_delete ( zh, znode, ANY_VERSION ); vcm_log_info ( logger, "delete ephemeral node '%s' returned %d (%s)", znode, result, zerror ( result ) ); if ( result != ZOK && result != ZNONODE ) { vcm_log_error ( logger, "Couldn't delete ephemeral node '%s' because %d (%s), %s at %d", znode, result, zerror ( result ), __FILE__, __LINE__ ); } result = zoo_create ( zh, znode, NULL, 0, &ZOO_OPEN_ACL_UNSAFE, ZOO_EPHEMERAL, NULL, 0 ); if ( result == ZOK ) { vcm_log_info ( logger, "ephemeral status node '%s' created successfully", znode ); return SUCCESS; } vcm_log_warn ( logger, "Can't create '%s' (err: %d : %s), %s at %d", znode, result, zerror ( result ), __FILE__, __LINE__ ); return ZOO_PATH_CREATION_FAILED; } vcm_log_warn ( logger, "Can't create '%s' (err: %d : %s), %s at %d", znode, result, zerror ( result ), __FILE__, __LINE__ ); return ZOO_PATH_CREATION_FAILED; } -- JMX enabled by default Starting zookeeper ...
zoopid file is /vcm/home/zoo/zookeeper-3.2.0/snap/zookeeper_server.pid STARTED 2009-09-01 00:42:43,535 - INFO [main:quorumpeercon...@80] - Reading configuration from: /vcm/home/sandbox/ticket_161758-1/vcm/apps/zookeeper-3.2.0/bin/../conf/zoo.c fg 2009-09-01 00:42:43,539 - WARN [main:quorumpeerm...@104] - Either no config or no quorum defined in config, running in standalone mode 2009-09-01 00:42:43,560 - INFO [main:quorumpeercon...@80] - Reading configuration from: /vcm/home/sandbox/ticket_161758-1/vcm/apps/zookeeper-3.2.0/bin/../conf/zoo.c fg 2009-09-01 00:42:43,562 - INFO [main:zookeeperserverm...@94] - Starting server 2009-09-01 00:42:43,573 - INFO [main:environm...@97] - Server environment:zookeeper.version=3.2.0-790261, built on 07/01/2009 16:49 GMT 2009-09-01 00:42:43,574 - INFO [main:environm...@97] - Server environment:host.name=ind104.an.dev.fastclick.net 2009-09-01 00:42:43,574 - INFO [main:environm...@97] - Server environment:java.version=1.6.0_14 2009-09-01 00:42:43,575 - INFO [main:environm...@97] - Server environment:java.vendor=Sun Microsystems Inc. 2009-09-01 00:42:43,576 - INFO
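Patrick's cli suggestion above hinges on a single field of the node's Stat. A tiny self-contained sketch of the check (the class here is a stand-in; in real code the value comes from org.apache.zookeeper.data.Stat.getEphemeralOwner()):

```java
// Sketch: a znode's Stat carries ephemeralOwner -- zero for a regular
// node, otherwise the session id of the owning client. Checking it is
// how you confirm that create() really produced an ephemeral node.
public class EphemeralCheck {
    static boolean isEphemeral(long ephemeralOwner) {
        return ephemeralOwner != 0L;
    }

    public static void main(String[] args) {
        System.out.println(isEphemeral(0L));             // regular node -> false
        System.out.println(isEphemeral(0x123de5b3b1bL)); // owned by that session -> true
    }
}
```

If the node Leonard sees at startup has a nonzero ephemeralOwner from a previous session, that stale owner would explain the ZNODEEXISTS on the first create.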
Re: configuring Zookeeper in HBase with IP addresses only
Nice! Jean-Daniel Cryans wrote: Added here http://wiki.apache.org/hadoop/Hbase/Troubleshooting#12 J-D On Mon, Aug 24, 2009 at 5:20 PM, Patrick Huntph...@apache.org wrote: No worries. The details are actually interesting/useful, you might consider adding to your docs in case another user runs into this. Patrick Jean-Daniel Cryans wrote: Patrick, Basically, yes. Sorry for the lengthy answer ;) J-D On Mon, Aug 24, 2009 at 5:09 PM, Patrick Huntph...@apache.org wrote: I see, so an inconsistency then wrt name lookup. Thanks! Patrick Jean-Daniel Cryans wrote: Well the situation is that HBase now generates the myid files and to find the id we look in the hbase.zookeeper.quorum configuration that itself generates a temporary zoo.cfg file. To do that we have to somehow match the machine's own knowledge of its address with what's in that list. To find our address we use org.apache.hadoop.net.DNS with the method getDefaultHost and then we go through the list of machines defined in the HBase configuration. What comes out of DNS relies on how the OS is configured or it asks a specified dns server (if provided). So, in David's situation, he specified an IP address and DNS returns a hostname so we don't get a match. The resolution in that case is to fix the configuration by passing hostnames, to change the OS configuration, to setup a DNS server or to configure/start zookeeper by hand. From what I've seen, that stuff is never easy but eh, we still get you a quorum running in the end :P J-D On Mon, Aug 24, 2009 at 4:37 PM, Patrick Huntph...@apache.org wrote: Hi Jean-Daniel, not sure I get your response fully. Are you saying that the configured ip addr was resolved to a hostname, but that hostname didn't match the list of ip addresses used when defining the zk quorum machines? Is there a workaround you could suggest for ppl who don't have DNS available? Should an Hbase JIRA be created for this -- ie is it something you consider should be fixed/improved?
Patrick Jean-Daniel Cryans wrote: Oh ok well HBase relies on the DNS class shipped with Hadoop to determine your address. It will try to use a hostname if possible but what comes out of there really depends on your OS configuration. In your case, that means that it resolved a hostname instead of an IP (which is rare) so you should use it instead. Also this is HBase-specific, ZK isn't really involved. J-D On Mon, Aug 24, 2009 at 3:47 PM, Pythonnerpython...@gmail.com wrote: I forgot to post that line: <property> <name>hbase.zookeeper.quorum</name> <value>192.168.1.xx</value> </property> ok, I'll check the guide shipped with HBase. On Mon, Aug 24, 2009 at 3:43 PM, Jean-Daniel Cryans jdcry...@apache.orgwrote: David, hbase.master is deprecated in HBase 0.20, instead you have to specify hbase.zookeeper.quorum if you want to use HBase in a distributed mode with a ZK quorum. Please see the Getting Started documentation shipped with HBase. J-D On Mon, Aug 24, 2009 at 3:39 PM, Pythonnerpython...@gmail.com wrote: Hello, this is a follow-up of discussion started on twitter with http://twitter.com/phunt. I installed HBase 0.20.0 RC2 on Ubuntu server boxes. If I'm using machines IP in config files (see below), I get the following error message: 'Could not find my address: xyz in list of ZooKeeper quorum servers' message (where 'xyz' is a hostname) my config is: hbase-env.sh: export HBASE_MANAGES_ZK=true hbase-site.xml: <configuration> <property> <name>hbase.rootdir</name> <value>hdfs://192.168.1.xx:9200/hbase</value> </property> <property> <name>hbase.master</name> <value>192.168.1.xx:6</value> </property> <property> <name>hbase.cluster.distributed</name> <value>true</value> </property> </configuration> from vanilla Ubuntu server install, I removed the 127.0.1.1 line from /etc/hosts Is it supposed to work well with IP addresses only? David -- Balie - Baseline Information Extraction http://balie.sourceforge.net [Open Source ~ 100% Java ~ Using Weka ~ Multilingual]
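The failure J-D describes comes down to a plain string comparison: DNS reports a hostname, the config lists raw IPs, and no entry matches. A minimal sketch of that matching step (the identifiers here are illustrative, not HBase's actual code):

```java
import java.util.Arrays;
import java.util.List;

// Sketch: find this machine's entry in the configured quorum list by
// exact string match, the way a myid-generating script might. If DNS
// reports a hostname but the config lists raw IPs, nothing matches
// and the "Could not find my address" error results.
public class QuorumMatch {
    static int indexInQuorum(List<String> quorum, String selfAddress) {
        return quorum.indexOf(selfAddress); // -1 means no match found
    }

    public static void main(String[] args) {
        List<String> quorum = Arrays.asList("192.168.1.10", "192.168.1.11");
        System.out.println(indexInQuorum(quorum, "node1.example.com")); // -1: hostname vs IP
        System.out.println(indexInQuorum(quorum, "192.168.1.11"));      // 1: exact match
    }
}
```

This is why the workarounds all amount to making the two strings agree: list hostnames in the config, change what the OS resolves, or write the myid file by hand.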
Re: A question about Connection timed out and operation timeout
Hi Qian, it would be good if you could create a jira for this: https://issues.apache.org/jira/browse/ZOOKEEPER include both the client logs and the server logs (for overlapping client/server time period where you see the problem). also the server config if you're using a quorum vs standalone. If you could also include some/all of the client side code you have that would be useful for us to review. are you doing anything in particular during this time period where you see a problem? perhaps load testing of the server? 2 sec timeout is pretty low. the client needs to send a heartbeat every 0.6 seconds (1/3 timeout), and if it doesn't hear back from the server in another 0.6 seconds it will close the connection and attempt to re-connect to a(nother) server. also if the server doesn't hear from the client every 2 seconds it will expire the session. in general this is ok, but if you have any network issues, or periodic heavy load on the server this could cause a problem. typically we suggest timeouts in the 20-30 second range, but we do have some ppl that run in 5-10 second range regularly (but they have solid lans with low latency connections) re the ephemeral node - did you try the suggestions I had in my previous email? (attached) (please include that on the jira as well) Patrick Qian Ye wrote: Hi Ben: I used multi-thread library, and the session timeout is set to 2000 when the zookeeper handler was initialized. On Thu, Aug 20, 2009 at 9:52 PM, Benjamin Reed br...@yahoo-inc.com wrote: are you using the single threaded or multithreaded C library? the exceeded deadline message means that our thread was supposed to get control after a certain period, but we got control that many milliseconds late. what is your session timeout?
ben From: Qian Ye [yeqian@gmail.com] Sent: Thursday, August 20, 2009 3:17 AM To: zookeeper-user Subject: A question about Connection timed out and operation timeout Hi guys: I met the problem again: an ephemeral node disappeared, and I found it because my application got an operation timeout My application which created an ephemeral node at the zookeeper server, printed the following log *WARNING: 08-20 03:09:20: auto * 182894118176 [logid:][reqip:][auto_exchanger_zk_basic.cpp:605]get children fail.[/forum/elect_nodes][-7][operation timeout]* and the Zookeeper client printed the following log (the log level is INFO) 2009-08-19 21:36:18,067:3813(0x9556c520):zoo_i...@log_env@545: Client environment:zookeeper.version=zookeeper C client 3.2.0 606 2009-08-19 21:36:18,067:3813(0x9556c520):zoo_i...@log_env@549: Client environment:host.name=jx-ziyuan-test00.jx.baidu.com 607 2009-08-19 21:36:18,068:3813(0x9556c520):zoo_i...@log_env@557: Client environment:os.name=Linux 608 2009-08-19 21:36:18,068:3813(0x9556c520):zoo_i...@log_env@558: Client environment:os.arch=2.6.9-52bs 609 2009-08-19 21:36:18,068:3813(0x9556c520):zoo_i...@log_env@559: Client environment:os.version=#2 SMP Fri Jan 26 13:34:38 CST 2007 610 2009-08-19 21:36:18,068:3813(0x9556c520):zoo_i...@log_env@567: Client environment:user.name=club 611 2009-08-19 21:36:18,068:3813(0x9556c520):zoo_i...@log_env@577: Client environment:user.home=/home/club 612 2009-08-19 21:36:18,068:3813(0x9556c520):zoo_i...@log_env@589: Client environment:user.dir=/home/club/user/luhongbo/auto-exchanger 613 2009-08-19 21:36:18,068:3813(0x9556c520):zoo_i...@zookeeper_init @613: Initiating client connection, host=127.0.0.1:2181,127.0.0.1:2182 sessionTimeout=2000 watcher=0x408c56 sessionId=0x0 sessionPasswd=null context=(nil) flags=0 614 2009-08-19 21:36:18,069:3813(0x41401960):zoo_i...@check_events @1439: initiated connection to server [127.0.0.1:2181] 615 2009-08-19 21:36:18,070:3813(0x41401960):zoo_i...@check_events @1484: connected to server
[127.0.0.1:2181] with session id=1232c1688a20093 616 2009-08-20 02:48:01,780:3813(0x41401960):zoo_w...@zookeeper_interest@1335: Exceeded deadline by 520ms 617 2009-08-20 03:08:52,332:3813(0x41401960):zoo_w...@zookeeper_interest@1335: Exceeded deadline by 14ms 618 2009-08-20 03:09:04,666:3813(0x41401960):zoo_w...@zookeeper_interest@1335: Exceeded deadline by 48ms 619 2009-08-20 03:09:09,733:3813(0x41401960):zoo_w...@zookeeper_interest@1335: Exceeded deadline by 24ms 620 *2009-08-20 03:09:20,289:3813(0x41401960):zoo_w...@zookeeper_interest@1335: Exceeded deadline by 264ms* 621 *2009-08-20 03:09:20,295:3813(0x41401960):zoo_er...@handle_socket_error_msg@1388: Socket [127.0.0.1:2181] zk retcode=-7, errno=110(Connection timed out): conn ection timed out (exceeded timeout by 264ms)* 622 *2009-08-20 03:09:20,309:3813(0x41401960):zoo_w...@zookeeper_interest@1335: Exceeded deadline by 284ms* 623 *2009-08-20 03:09:20,309:3813(0x41401960):zoo_er...@handle_socket_error_msg@1433: Socket [127.0.0.1:2182] zk retcode=-4, errno=111(Connection refused): server refused to accept the client* 624 *2009-08-20
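The 1/3-timeout heartbeat rule Patrick describes makes Qian's 2-second setting concrete. A quick calculation, with illustrative method names mirroring the behavior described above:

```java
// Sketch of client-side timing derived from the session timeout:
// ping roughly every timeout/3, and give up on the current server if
// no response arrives within another timeout/3 (i.e. 2/3 of the
// timeout of total silence). Integer arithmetic, milliseconds.
public class SessionTiming {
    static long pingIntervalMs(long sessionTimeoutMs) {
        return sessionTimeoutMs / 3;
    }

    static long reconnectDeadlineMs(long sessionTimeoutMs) {
        return 2 * sessionTimeoutMs / 3;
    }

    public static void main(String[] args) {
        long timeout = 2000; // Qian's setting
        System.out.println("ping every " + pingIntervalMs(timeout) + " ms");
        System.out.println("reconnect if quiet for " + reconnectDeadlineMs(timeout) + " ms");
        // With the suggested 20-30s range the margins are an order of
        // magnitude larger, e.g. a 30s timeout pings every 10s.
        System.out.println("ping every " + pingIntervalMs(30000) + " ms");
    }
}
```

With only ~666 ms between required heartbeats, even a sub-second GC pause or scheduling delay (like the "Exceeded deadline" messages in the log above) can push the client past its window, which is why the 20-30 second recommendation exists.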
Re: Errors when run zookeeper in windows ?
David Bosschaert wrote: FWIW, I uploaded some Windows versions of the zookeeper scripts to https://issues.apache.org/jira/browse/ZOOKEEPER-426 a while ago. They run from the ordinary Windows shell, so there is no need for Cygwin or anything like that. I'm using ZooKeeper from Windows all the time and they work fine for me. I did notice that the scripts didn't get included in the latest 3.2.0 release. It might be worth putting some Windows scripts in the next release, as nothing in ZooKeeper is Unix specific (except for the scripts ;)

Looks like it slipped through as it wasn't assigned to a particular release. I've updated the jira to list this for the upcoming 3.3. Thanks, Patrick

Best regards, David

2009/8/19 zhang jianfeng zjf...@gmail.com: Yes, I am using Cygwin and JDK 1.6, and the command to start the server is the same as in the getting started guide: bin/zkServer.sh start. The following is the whole message:

zjf...@zjf ~/zookeeper-3.1.1
$ bin/zkServer.sh start
JMX enabled by default
Starting zookeeper ... STARTED

zjf...@zjf ~/zookeeper-3.1.1
$ java.lang.NoClassDefFoundError: Files\Java\jre6\lib\ext\QTJava/zip;D:\Java\lib\hadoop-0/18/0\build\tools:/home/zjffdu/zookeeper-3/1/1/binzookeeper-3/1/1/jar:/home/zjffdu/zookeeper-3/1/1/binlib/junit-4/4/jar:/home/zjffdu/zookeeper-3/1/1/binlib/log4j-1/2/15/jar:/home/zjffdu/zookeeper-3/1/1/binsrc/java/lib/junit-4/4/jar:/home/zjffdu/zookeeper-3/1/1/binsrc/java/lib/log4j-1/2/15/jar
Caused by: java.lang.ClassNotFoundException: Files\Java\jre6\lib\ext\QTJava.zip;D:\Java\lib\hadoop-0.18.0\build\tools:.home.zjffdu.zookeeper-3.1.1.binzookeeper-3.1.1.jar:.home.zjffdu.zookeeper-3.1.1.binlib.junit-4.4.jar:.home.zjffdu.zookeeper-3.1.1.binlib.log4j-1.2.15.jar:.home.zjffdu.zookeeper-3.1.1.binsrc.java.lib.junit-4.4.jar:.home.zjffdu.zookeeper-3.1.1.binsrc.java.lib.log4j-1.2.15.jar
at java.net.URLClassLoader$1.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
at java.lang.ClassLoader.loadClassInternal(Unknown Source)
Could not find the main class: Files\Java\jre6\lib\ext\QTJava.zip;D:\Java\lib\hadoop-0.18.0\build\tools:/home/zjffdu/zookeeper-3.1.1/bin/../zookeeper-3.1.1.jar:/home/zjffdu/zookeeper-3.1.1/bin/../lib/junit-4.4.jar:/home/zjffdu/zookeeper-3.1.1/bin/../lib/log4j-1.2.15.jar:/home/zjffdu/zookeeper-3.1.1/bin/../src/java/lib/junit-4.4.jar:/home/zjffdu/zookeeper-3.1.1/bin/../src/java/lib/log4j-1.2.15.jar. Program will exit.

Thank you Jeff zhang

On Tue, Aug 18, 2009 at 12:53 PM, Patrick Hunt ph...@apache.org wrote: you are using java 1.6 right? more detail on the class not found would be useful (is that missing or just not included in your email?) Also the command line you're using to start the app would be interesting. Patrick

Mahadev Konar wrote: Hi Zhang, Are you using cygwin? mahadev

On 8/17/09 11:23 PM, zhang jianfeng zjf...@gmail.com wrote: Hi all, I tried to run zookeeper in windows, but the following errors appears: [...]
Re: Errors when run zookeeper in windows ?
I suspect it has to do with the classpath - specifically having spaces in the directory name. Notice that one of the lines you included starts with Files\Java\ - that probably should be ...\Program Files\Java\... and the space is causing problems. Try using David's DOS-specific files, or edit the start scripts (bin/zk*.sh) to put quotes around the classpath. In zkServer.sh, change

-cp $CLASSPATH $JVMFLAGS $ZOOMAIN $ZOOCFG

to

-cp "$CLASSPATH" $JVMFLAGS $ZOOMAIN $ZOOCFG

and see if that helps (you might have to play with it a bit, but I suspect this will work). In zkEnv.sh you may need to add quotes as well:

CLASSPATH="$ZOOCFGDIR:$CLASSPATH"

Patrick

zhang jianfeng wrote: Yes, I am using Cygwin and JDK 1.6, and the command to start the server is the same as in the getting started guide: bin/zkServer.sh start [...]
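The mangled class name in the trace (a path beginning Files\Java\...) is consistent with shell word splitting: the classpath contains a directory with a space (likely C:\Program Files\...), and unquoted expansion hands everything after the space to java as the main-class argument. A minimal sketch of the failure and of the quoting fix, using a hypothetical Cygwin-style path:

```shell
# Hypothetical Cygwin-style classpath entry containing a space.
CLASSPATH="/cygdrive/c/Program Files/Java/jre6/lib/ext/QTJava.zip"

# Unquoted: the shell splits at the space, so java would receive
# "-cp /cygdrive/c/Program" plus a stray "Files/..." main-class argument.
set -- -cp $CLASSPATH
echo "unquoted: $# arguments"

# Quoted: the classpath survives as a single argument.
set -- -cp "$CLASSPATH"
echo "quoted: $# arguments"
```

The unquoted form produces three arguments and the quoted form two, which is exactly why java reports "Could not find the main class: Files\..." above.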
Re: Errors when run zookeeper in windows ?
One more thing: please enter a jira on this so that we can track/fix it. https://issues.apache.org/jira/browse/ZOOKEEPER Thanks, Patrick

Patrick Hunt wrote: I suspect it has to do with the classpath - specifically having spaces in the directory name. [...]

zhang jianfeng wrote: Yes, I am using Cygwin and JDK 1.6 [...]
Re: Errors when run zookeeper in windows ?
you are using java 1.6 right? More detail on the class not found would be useful (is that missing or just not included in your email?) Also the command line you're using to start the app would be interesting. Patrick

Mahadev Konar wrote: Hi Zhang, Are you using cygwin? mahadev

On 8/17/09 11:23 PM, zhang jianfeng zjf...@gmail.com wrote: Hi all, I tried to run zookeeper in windows, but the following errors appears: [...]

Could not find the main class: Files\Java\jre6\lib\ext\QTJava.zip;D:\Java\lib\hadoop-0.18.0\build\tools:/home/zjffdu/zookeeper-3.1.1/bin/../zookeeper-3.1.1.jar:/home/zjffdu/zookeeper-3.1.1/bin/../lib/junit-4.4.jar:/home/zjffdu/zookeeper-3.1.1/bin/../lib/log4j-1.2.15.jar:/home/zjffdu/zookeeper-3.1.1/bin/../src/java/lib/junit-4.4.jar:/home/zjffdu/zookeeper-3.1.1/bin/../src/java/lib/log4j-1.2.15.jar. Program will exit.

It looks like my JAVA_HOME is not set correctly. Anyone have any ideas? Thank you Jeff zhang
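Jeff's JAVA_HOME suspicion is easy to check from the Cygwin prompt. A space in the value (the default Windows install location contains one) is what breaks unquoted classpath expansion in the zk*.sh scripts. A small diagnostic sketch - the fallback path here is only a hypothetical typical JRE location:

```shell
# Print JAVA_HOME and flag embedded spaces, which break unquoted
# classpath expansion in the shell start scripts.
# The fallback value is a hypothetical typical Windows JRE path.
JAVA_HOME="${JAVA_HOME:-/cygdrive/c/Program Files/Java/jre6}"
echo "JAVA_HOME=$JAVA_HOME"
case "$JAVA_HOME" in
  *" "*) echo "warning: JAVA_HOME contains a space; quote it in the scripts" ;;
  *)     echo "no space in JAVA_HOME" ;;
esac
```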
Re: c client error message with chroot
Please do enter a JIRA. Looking at the source it seems that we log an error, but the calling code continues. I think this is happening because the chroot C lib code is not handling znode watches separately from state change notifications. The calling code just continues after logging an (invalid, I think) error - can you try it out and see if it works even though the error message is being displayed (with chroot, I mean). Thanks. Patrick

Mahadev Konar wrote: This looks like a bug. Does this happen without doing any reads/writes using the zookeeper handle? Please do open a jira for this. Thanks mahadev

On 8/2/09 10:53 PM, Michi Mutsuzaki mi...@cs.stanford.edu wrote: Hello, I'm doing something like this (using zookeeper-3.2.0):

zhandle_t* zh = zookeeper_init("localhost:2818/servers", watcher, 1000, 0, 0, 0);

and getting this error:

2009-08-03 05:48:30,693:3380(0x40a04950):zoo_i...@check_events@1439: initiated connection to server [127.0.0.1:2181]
2009-08-03 05:48:30,705:3380(0x40a04950):zoo_i...@check_events@1484: connected to server [127.0.0.1:2181] with session id=122ddb9be64016d
2009-08-03 05:48:30,705:3380(0x40c05950):zoo_er...@sub_string@730: server path does not include chroot path /servers

The error log doesn't appear if I use localhost:2818 without chroot. Is this actually an error? Thanks! --Michi
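For context, chroot support works by stripping the chroot prefix from paths the server sends back before handing them to the application; the sub_string error fires when a path does not start with the chroot. A rough shell sketch of that prefix check - an illustration of the idea, not the C library's actual code:

```shell
# Illustration of chroot prefix handling: paths from the server should
# start with the chroot, and the client strips it before passing the
# path to the application. Michi's error comes from the mismatch case.
chroot="/servers"
server_path="/servers/node1"
case "$server_path" in
  "$chroot"/*|"$chroot")
      client_path="${server_path#$chroot}"
      echo "client sees: ${client_path:-/}" ;;
  *)
      echo "server path does not include chroot path $chroot" >&2 ;;
esac
```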
Re: test failures in branch-3.2
Hi Todd, Sorry for the clutter/confusion. Usually things aren't this cumbersome ;-) In particular:

1 committer is on vacation.
Mahadev's been out sick for multiple days.
I'm sick but trying to hang in there, though definitely not 100%.
Hudson (CI) has been offline for effectively the past 3 weeks (it gates all our commits) and is just now back, but flaky.
3.2 had some bugs that we are trying to address, but the aforementioned issues are slowing us down. Otherwise we'd have all this straightened out by now.

At this point you should move this discussion to the dev list - Apache doesn't really like us to discuss code changes/futures here (the user list). On that list you'll also see the plan for upcoming releases - I mention all this because we are actively working toward 3.2.1, which will include the JIRAs slated for that release (I'm sure you've seen them). If you can wait a bit you might be able to avoid some pain by using the upcoming 3.2.1 release. Once the patches land in that branch your issues will be resolved without you needing to manually apply patches, etc...

I did look at the files you attached - they look fine, so I'm not sure what the issue is. The form of this test makes it harder - we are verifying that the log contains sufficient information when a particular error occurs. We fiddle with log4j in order to do this, which means that the log you are including doesn't specify the problem. Try instrumenting this test with a try/catch around the content of the test method (all the code in the failing method inside a big try/catch is what I mean). Then print the error to stdout as part of the catch. That should shed some light. If you could debug it a bit that would help - because we aren't seeing this in our environment. Again, sort of a moot point if you can wait a week or so...

Regards, Patrick

Todd Greenwood wrote: Inline.
-Original Message- From: Patrick Hunt [mailto:ph...@apache.org] Sent: Thursday, July 30, 2009 10:57 PM To: zookeeper-user@hadoop.apache.org Subject: Re: test failures in branch-3.2

Todd Greenwood wrote: Starting w/ branch-3.2 (no changes) I applied patches in this order:
1. Apply ZOOKEEPER-479.patch. Builds, but HierarchicalQuorumTest fails.
2. Apply ZOOKEEPER-481.patch. Fails to build, b/c of missing file - PortAssignment.java. PortAssignment.java was added by Patrick as part of ZOOKEEPER-473.patch, which is a pretty hefty patch ( 2k lines) and touches a large number of files.

Hrm, those patches were probably created against the trunk. We'll have to have separate patches for trunk and the 3.2 branch on 481. If you could update the jira with this detail (481 needs two patches, one for each branch) that would be great!

Done.

3. Apply ZOOKEEPER-473.patch. Builds, but QuorumPeerMainTest fails (jvm crashes).

473 is special (unique) in the sense that it changes log4j while the VM is running. In general though it's a pretty boring test and shouldn't be failing. Are you sure you have the right patch file? There are 2 patch files on the JIRA for 473; make sure that you have the one from 7/16, NOT the one from 7/15. Check the patch file - the correct one should NOT contain changes to build.xml or the conf/log4j* files. If this still happens send me your build.xml, conf/log4j*, and QuorumPeerMainTest.java files in email for review. I'll take a look.

I've annotated the files w/ their date while downloading:
112700 2009-07-31 11:02 ZOOKEEPER-473-7-15.patch
110607 2009-07-31 11:01 ZOOKEEPER-473-7-16.patch
It appears I applied the 7-16 patch, as that is the matching file size of the patch file I applied. If there are to be multiple patch files for multiple branches (3.2, trunk, etc.) would it make sense to label the patch files accordingly? Requested files in attached tar.

-Todd

Patrick

[junit] Running org.apache.zookeeper.server.quorum.QuorumPeerMainTest
[junit] Running org.apache.zookeeper.server.quorum.QuorumPeerMainTest
[junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec
[junit] Test org.apache.zookeeper.server.quorum.QuorumPeerMainTest FAILED (crashed)

Test Log
Testsuite: org.apache.zookeeper.server.quorum.QuorumPeerMainTest
Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec
Testcase: testBadPeerAddressInQuorum took 0.004 sec
Caused an ERROR
Forked Java VM exited abnormally. Please note the time in the report does not reflect the time until the VM exit.
junit.framework.AssertionFailedError: Forked Java VM exited abnormally. Please note the time in the report does not reflect the time until the VM exit.

-Todd

-Original Message- From: Patrick Hunt [mailto:ph...@apache.org] Sent: Thursday, July 30, 2009 10:13 PM To: zookeeper-user@hadoop.apache.org Subject: Re: test failures in branch-3.2

Todd Greenwood wrote: [Todd] Yes, I believe address in use was the problem w/ FLETest. I assumed it was a timing issue w/ respect to test A not fully
Re: test failures in branch-3.2
Todd Greenwood wrote: On a plus note, I'm finding that this morning, @work rather than @home, the tests continue to completion. However, there are other issues that I'll bring up on the dev list, such as a requirement to have autoconf installed, and problems in the create-cppunit-configure task that can't exec libtoolize, fun stuff like that.

Great, good to hear. At some point figuring out what's up with your @home setup would be interesting to us. :-) Yes, there are some basic requirements such as autotools, cppunit, etc., but please do raise all this on the dev list.

I need to proceed with the manual patches to branch-3.2, as I am under some time constraints to get our infrastructure deployed such that QA can start playing with it. However, I'll switch to 3.2.1 as soon as I can.

Understood. Patrick

-Original Message- From: Patrick Hunt [mailto:ph...@apache.org] Sent: Friday, July 31, 2009 11:38 AM To: zookeeper-user@hadoop.apache.org; Todd Greenwood Subject: Re: test failures in branch-3.2 [...]
Re: test failures in branch-3.2
btw QuorumPeerMainTest uses the CONSOLE appender, which is set up in conf/log4j.properties; now that I think of it, perhaps not such a good idea :-) If you edited conf/log4j.properties it may be causing the test to fail - did you do this? (If you run the test by itself using -Dtestcase does it always fail?) I've entered a jira to address this: https://issues.apache.org/jira/browse/ZOOKEEPER-492 Patrick

Patrick Hunt wrote: Todd Greenwood wrote: The build succeeds, but not all of the tests. In previous test runs, I noticed an error in org.apache.zookeeper.test.FLETest. It was not able to bind to a port or something. Now, after a machine reboot, I'm getting different failures.

Address in use? That's a problem in the test framework pre-3.3. In 3.3 (current svn trunk) I fixed it, but it's not in 3.2.x. This is a problem with the test framework though and not a real problem; it shows up occasionally (depends on timing).

branch-3.2 $ ant test
[junit] Test org.apache.zookeeper.server.quorum.QuorumPeerMainTest FAILED (crashed)
[junit] Test org.apache.zookeeper.test.HierarchicalQuorumTest FAILED
Test logs for these two tests attached.

This is unusual though - looking at the log it seems that the JVM itself crashed for the QPMainTest! For HQT we are seeing: junit.framework.AssertionFailedError: Threads didn't join, which Flavio mentioned to me once can happen but is not a real problem (he can elaborate). What version of java are you using? OS, other environment that might be interesting? (vm? etc...) You might try looking at the JVM crash dump file (I think it's in /tmp). If you run each of these two tests individually do they run? Example:

ant -Dtestcase=FLENewEpochTest test-core-java

My goal here is to get to a known state (all tests succeeding, or with workarounds for the failures). Following that, I plan to apply the patches Flavio recommended for a WAN deploy (479 and 481). After I verify that the tests continue to run, I'll package this up and deploy it to our WAN for testing.

Sounds like a good plan.

So, are these known issues? Do the tests normally run en masse, or do some of the tests hold on to resources and prevent other tests from passing?

Typically they do run to completion, but occasionally on my machine (java 1.6, 32-bit linux, 1.6GHz single-core cpu, 1GB mem) I'll get some random failure due to address in use, or the same "didn't join" that you saw. Usually I see this if I'm multitasking (vs just letting the tests run w/o using the box). As I said, this is addressed in 3.3 (address reuse at the very least, and I haven't seen the other issues). Patrick
Re: test failures in branch-3.2
Well, try running these two tests individually and see if they always fail or just occasionally. That will be a good start (along with the env details). Patrick

Todd Greenwood wrote: No edits to conf/log4j.properties.

-Original Message- From: Patrick Hunt [mailto:ph...@apache.org] Sent: Thursday, July 30, 2009 9:25 PM To: Patrick Hunt Cc: zookeeper-user@hadoop.apache.org Subject: Re: test failures in branch-3.2 [...]
Re: test failures in branch-3.2
Todd Greenwood wrote: Starting w/ branch-3.2 (no changes) I applied patches in this order: 1. Apply ZOOKEEPER-479.patch. Builds, but HierarchicalQuorumTest fails. 2. Apply ZOOKEEPER-481.patch. Fails to build, b/c of missing file - PortAssignment.java. PortAssignment.java was added by Patrick as part of ZOOKEEPER-473.patch, which is a pretty hefty patch (> 2k lines) and touches a large number of files. Hrm, those patches were probably created against the trunk. We'll have to have separate patches for trunk and 3.2 branch on 481. If you could update the jira with this detail (481 needs two patches, one for each branch) that would be great! 3. Apply ZOOKEEPER-473.patch. Builds, but QuorumPeerMainTest fails (jvm crashes). 473 is special (unique) in the sense that it changes log4j while the vm is running. In general though it's a pretty boring test and shouldn't be failing. Are you sure you have the right patch file? there are 2 patch files on the JIRA for 473, make sure that you have the one from 7/16, NOT the one from 7/15. Check the patch file; the correct one should NOT contain changes to build.xml or conf/log4j* files. If this still happens send me your build.xml, conf/log4j* and QuorumPeerMainTest.java files in email for review. I'll take a look. Patrick [junit] Running org.apache.zookeeper.server.quorum.QuorumPeerMainTest [junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec [junit] Test org.apache.zookeeper.server.quorum.QuorumPeerMainTest FAILED (crashed) Test Log Testsuite: org.apache.zookeeper.server.quorum.QuorumPeerMainTest Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec Testcase: testBadPeerAddressInQuorum took 0.004 sec Caused an ERROR Forked Java VM exited abnormally. Please note the time in the report does not reflect the time until the VM exit. junit.framework.AssertionFailedError: Forked Java VM exited abnormally. 
Please note the time in the report does not reflect the time until the VM exit. -Todd -Original Message- From: Patrick Hunt [mailto:ph...@apache.org] Sent: Thursday, July 30, 2009 10:13 PM To: zookeeper-user@hadoop.apache.org Subject: Re: test failures in branch-3.2 Todd Greenwood wrote: [Todd] Yes, I believe address in use was the problem w/ FLETest. I assumed it was a timing issue w/ respect to test A not fully releasing resources before test B started. Might be, but actually I think it's related to this: http://hea-www.harvard.edu/~fine/Tech/addrinuse.html Patrick
Re: Zookeeper WAN Configuration
Flavio, please enter a doc jira for this if there are no docs, it should be in forrest, not twiki btw. It would be good if you could review the current quorum docs (any type) and create a jira/patch that addresses any/all shortfall. Patrick Flavio Junqueira wrote: Todd, Some more answers. Please check out carefully the information at the bottom of this message. On Jul 27, 2009, at 4:02 PM, Todd Greenwood wrote: I'm assuming that you're setting the weight of ZooKeeper servers in PODs to zero, which means that their votes when ordering updates do not count. [Todd] Correct. If my assumption is correct, then you should see a significant improvement in read performance. I would say that write performance wouldn't be very different from clients in PODs opening a direct connection to DC. [Todd] So the Leader, knowing that machine(s) have a voting weight of zero, doesn't have to wait for their responses in order to form a quorum vote? Does the leader even send voting requests to the weight zero followers? In the current implementation, it does. When we have observers implemented, the leader won't do it. 3. ZK Servers within the POD would be resilient to network connectivity failure between the POD and the DC. Once connectivity re-established, the ZK Servers in the POD would sync with the ZK servers in the DC, and, from the perspective of a client within the POD, everything just worked, and there was no network failure. We want to have servers switching to read-only mode upon network partitions, but this is a feature under development. We don't have plans for implementing any model of eventual consistency that would allow updates even when not being able to form a quorum, and I personally believe that it would be a major change, with major implications not only to the code base, but also to the semantics of our API. [Todd] What is the current (3.2) behaviour in the case of a network failure that prevents connectivity between ZK Servers in a pod? 
Assuming the pod is composed of weight=0 followers...are the clients connected to these zookeeper servers still able to read? do they get exceptions on write? do the clients hang if it's a synchronous call? The clients won't be able to read because we don't have this feature of going read-only upon partitions. 4. A WAN topology of co-located ZK servers in both the DC and (n) PODs would not significantly degrade the performance of the ensemble, provided large blobs of traffic were not being sent across the network. If the zk servers in the PODs are assigned weight zero, then I don't see a reason for having lower performance in the scenario you describe. If weights are greater than zero for zk servers in PODs, then your performance might be affected, but there are ways of assigning weights that do not require receiving votes from all co-locations for progress. [Todd] Great, we'll proceed with hierarchical configuration w/ ZK Servers in pods having a voting weight of zero. Could you provide a pointer to a configuration that shows this? The docs are a bit lean in this regard... We should have a twiki page on this. For now, you can find an example in the header of QuorumHierarchical.java. Also, I found a couple of bugs recently that may or may not affect your setup, so I suggest that you apply the patches in ZOOKEEPER-481 and ZOOKEEPER-479. We would like to have these patches in for the next release (3.2.1), which should be out in two or three weeks, if there is no further complication. Another issue that I realized that won't work in your case, but the fix would be relatively easy, is the guarantee that no zero-weight follower will be elected. Currently, we don't check the weight during leader election. I'll open a jira and put up a patch soon. -Flavio
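To make the weighted/hierarchical setup above concrete, here is a sketch of the relevant zoo.cfg lines, following the group/weight syntax described in the header of QuorumHierarchical.java (hostnames and server ids are placeholders, not from the thread):

```
# three voting servers in the DC, two weight-zero servers in a POD
server.1=dc-host1:2888:3888
server.2=dc-host2:2888:3888
server.3=dc-host3:2888:3888
server.4=pod-host1:2888:3888
server.5=pod-host2:2888:3888
# every server id must appear in exactly one group
group.1=1:2:3
group.2=4:5
# DC servers vote with weight 1; POD servers do not count toward quorum
weight.1=1
weight.2=1
weight.3=1
weight.4=0
weight.5=0
```

As with any ensemble config, the same file goes to every server, and each server's myid file must match its server.N id.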
Re: Queue code
Thanks for the report, looks like something we need to address, would you mind going the extra step and adding a JIRA on this? https://issues.apache.org/jira/browse/ZOOKEEPER Thanks, Patrick kishore g wrote: Hi All, Zookeeper recipe queue code has a bug. byte[] b = zk.getData(root + "/element" + min, false, stat); zk.delete(root + "/element" + min, 0); It throws an error saying the node element0 does not exist. The node actually created by the producer was element00. So along with min, the minimum node's name must be stored. Here is the consume method that works. public int consume() throws KeeperException, InterruptedException { int retvalue = -1; Stat stat = null; // Get the first element available while (true) { synchronized (mutex) { List<String> list = zk.getChildren(root, true); if (list.size() == 0) { System.out.println("Going to wait"); mutex.wait(); } else { Integer min = new Integer(list.get(0).substring(7)); String name = list.get(0); for (String s : list) { Integer tempValue = new Integer(s.substring(7)); //System.out.println("Temporary value: " + s); if (tempValue < min) { min = tempValue; name = s; } } String zNode = root + "/" + name; System.out.println("Temporary value: " + zNode); byte[] b = zk.getData(zNode, false, stat); zk.delete(zNode, 0); ByteBuffer buffer = ByteBuffer.wrap(b); retvalue = buffer.getInt(); return retvalue; } } } } Also, are there any performance numbers for zookeeper-based queues? How does it compare with JMS? thanks Kishore G
Re: Question about the sequential flag on create.
Nodes are maintained un-ordered on the server. A node can store any subnodes, not exclusively sequential nodes. If we added an ordering guarantee then the server would have to store the children sorted for every parent node. This is a problem for a few reasons: 1) in many cases you don't care about order, so all users would pay for ordering even if they didn't want/use it, 2) the ordering would be done by the central server, which would result in lower performance for everyone, not just the client(s)/recipe(s) that needed ordering, 3) there is no guarantee that the ordering you need (path) is the same order needed by all recipes. Patrick Erik Holstad wrote: Hi Mahadev! Yeah kinda, what I was looking for was some kind of explanation of why this is, since they are stored in a list and it seems like new children would just be appended to the list. So I guess my question should have been more along the lines of something like: What is it internally that causes new nodes not to be inserted in order? What causes the lag from getting the sequence number till putting it into the list? Or is this not at all how this works? Regards Erik
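Since the server returns children unordered, a client that needs sequence order can sort them itself using the zero-padded 10-digit counter that ZooKeeper appends to sequential znode names. A minimal sketch (class and method names are mine, not from the thread):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class SequenceSort {
    // Sort znode names like "element0000000003" by the numeric sequence
    // suffix (the zero-padded 10-digit counter the server appends to
    // SEQUENTIAL nodes). Input list order (as returned by getChildren)
    // is irrelevant; the suffix alone determines creation order.
    static List<String> sortBySequence(List<String> children) {
        List<String> sorted = new ArrayList<>(children);
        sorted.sort(Comparator.comparingLong(
                name -> Long.parseLong(name.substring(name.length() - 10))));
        return sorted;
    }
}
```

This is exactly the client-side cost Patrick describes: only recipes that need ordering pay for it.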
Re: Read-your-writes consistency?
Yes, this is a strong guarantee: http://hadoop.apache.org/zookeeper/docs/current/zookeeperProgrammers.html#ch_zkGuarantees Sync is only necessary if client A makes a change, then client B wishes to read that change with a guarantee that it will see the successfully applied change previously made by A (this is typically only an issue if A & B communicate through some non-zk channel, such as a direct socket connection btw A & B). Patrick Marc Frei wrote: Dear ZooKeeper users, a short question about ZooKeeper's consistency model: Does ZooKeeper support read-your-writes consistency? E.g., after having created a node, will the same client always find the newly created node via getChildren? If yes, is this specified behavior the client can rely on? Or is it better to always call sync in order to enforce consistency? Thanks in advance and kind regards, Marc
Re: Instantiating HashSet for DataNode?
Erik, if you'd like enter a JIRA and take a whack at it go ahead. Perhaps a subclass of DataNode specific for ephemerals? That way it can handle any particulars - and should also minimize the number of if(children==null) type checks that would be needed. (don't neg. impact performance or b/w compat though). You would probably learn more about zk internals, testing, etc... in the process. http://wiki.apache.org/hadoop/ZooKeeper/HowToContribute Patrick Mahadev Konar wrote: Hi Erik, I am not sure if that would be a considerable optimization but even if you wanted to do it, it would be much more than just adding a check in the constructor (the serialization/deserialization would need to have specialized code). Right now all the datanodes are treated equally for ser/deser and other purposes. mahadev On 7/14/09 1:42 PM, Erik Holstad erikhols...@gmail.com wrote: I'm not sure if I've misread the code for the DataNode, but to me it looks like every node gets a set of children even though it might be an ephemeral node which cannot have children, so we are wasting 240 B for every one of those. Not sure if it makes a big difference, but just thinking that since everything sits in memory and there is no reason to instantiate it, maybe it would be possible just to add a check in the constructor? Regards Erik
[ANNOUNCE] Apache ZooKeeper 3.2.0
The Apache ZooKeeper team is proud to announce Apache ZooKeeper version 3.2.0. ZooKeeper is a high-performance coordination service for distributed applications. It exposes common services - such as naming, configuration management, synchronization, and group services - in a simple interface so you don't have to write them from scratch. You can use it off-the-shelf to implement consensus, group management, leader election, and presence protocols. And you can build on it for your own, specific needs. Key features of the 3.2.0 release: * client side bindings for: Perl, Python, REST * flexible quorum support * re-usable recipe code libraries * chroot support in connect string * many fixes, improvements, improved documentation, etc... A number of optimizations have gone into this release and our benchmarks show that read/write performance of version 3.2.0 is approximately twice that of the previous 3.1.0 version! For ZooKeeper release details and downloads, visit: http://hadoop.apache.org/zookeeper/releases.html ZooKeeper 3.2.0 Release Notes are at: http://hadoop.apache.org/zookeeper/docs/r3.2.0/releasenotes.html Regards, The ZooKeeper Team
Re: zookeeper on ec2
Henry Robinson wrote: Effectively, EC2 does not introduce any new failure modes but potentially exacerbates some existing ones. If a majority of EC2 nodes fail (in the sense that their hard drive images cannot be recovered), there is no way to restart the cluster, and persistence is lost. As you say, this is highly unlikely. If, for some reason, the quorums are set such that only a single node failure could bring down the quorum (bad design, but plausible), this failure is more likely. This is not strictly true. The cluster cannot recover _automatically_ if failures > n, where ensemble size is 2n+1. However you can recover manually as long as at least 1 snap and trailing logs can be recovered. We can even recover if the latest snapshots are corrupted, as long as we can recover a snap from some previous time t and all logs subsequent to t. EC2 just ups the stakes - crash failures are now potentially more dangerous (bugs, packet corruption, rack local hardware failures etc all could cause crash failures). It is common to assume that, notwithstanding a significant physical event that wipes a number of hard drives, writes that are written stay written. This assumption is sometimes false given certain choices of filesystem. EC2 just gives us a few more ways for that not to be true. I think it's more possible than one might expect to have a lagging minority left behind - say they are partitioned from the majority by a malfunctioning switch. They might all be lagging already as a result. Care must be taken not to bring up another follower on the minority side to make it a majority, else there are split-brain issues as well as the possibility of lost transactions. Again, not *too* likely to happen in the wild, but these permanently running services have a nasty habit of exploring the edge cases... To be explicit, you can cause any ZK cluster to back-track in time by doing the following: ... 
f) add new members of the cluster Which is why care needs to be taken that the ensemble can't be expanded with a current quorum. Dynamic membership doesn't save us when a majority fails - the existence of a quorum is a liveness condition for ZK. To help with the liveness issue we can sacrifice a little safety (see, e.g. vector clock ordered timestamps in Dynamo), but I think that ZK is aimed at safety first, liveness second. Not that you were advocating changing that, I'm just articulating why correctness is extremely important from my perspective. Henry At this point, you will have lost the transactions from (b), but I really, really am not going to worry about this happening either by plan or by accident. Without steps (e) and (f), the cluster will tell you that it knows something is wrong and that it cannot elect a leader. If you don't have *exact* coincidence of the survivor set and the set of laggards, then you won't have any data loss at all. You have to decide if this is too much risk for you. My feeling is that it is OK level of correctness for conventional weapon fire control, but not for nuclear weapons safeguards. Since my apps are considerably less sensitive than either of those, I am not much worried. On Mon, Jul 6, 2009 at 12:40 PM, Henry Robinson he...@cloudera.com wrote: It seems like there is a correctness issue: if a majority of servers fail, with the remaining minority lagging the leader for some reason, won't the ensemble's current state be forever lost?
Re: ZK quota
Do we have a JIRA for this? If not we should add one for 3.3. Patrick Mahadev Konar wrote: Hi Raghu, We do have plans to enforce quota in future. Enforcing requires some more work than just reporting. Reporting is a good enough tool for operations to manage a zookeeper cluster but we would certainly like to enforce it in the near future. Thanks mahadev On 6/18/09 7:01 PM, rag...@yahoo.com rag...@yahoo.com wrote: Is there a reason why node count/byte quota is not actually enforced but rather ZK just warns? Are there any plans to enforce the quota in a future release? Thanks Raghu
Show your ZooKeeper pride!
The Hadoop summit is Wednesday. If you're attending please feel free to say hi -- Mahadev is presenting @4, Ben and I will be attending as well. Also, regardless of whether you're attending or not we'd appreciate any updates to the powered by page, if you're too busy to update it yourself send us a snippet and we'll update it for you ;-) http://wiki.apache.org/hadoop/ZooKeeper/PoweredBy Regards, Patrick
Re: ConnectionLoss (node too big?)
Agree, created a new JIRA for this: https://issues.apache.org/jira/browse/ZOOKEEPER-430 See the following JIRA for one example why not to do this: https://issues.apache.org/jira/browse/ZOOKEEPER-327 In general you don't want to create large node sizes since all of the data/nodes are stored in memory by all of the servers. The latency issue is also a factor. However if you are storing a handful of nodes in the cluster then obv these aren't much of a problem (could bite you at some point in the future though if you start using ZK more...) In general we advise ppl to store tokens in ZK, so perhaps you might store the 7mb of data in a data store (filesystem?), and use ZK to coordinate access to that data (this is similar for example to how AWS does things with S3 and SQS, SQS has a limit of 8k iirc, so you store the task in SQS which includes a pointer (url) to the data to be acted upon in S3...) Patrick Eric Bowman wrote: Ted Dunning wrote: Isn't the max file size a megabyte? On Wed, Jun 3, 2009 at 9:01 AM, Eric Bowman ebow...@boboco.ie wrote: On the client, I see this when trying to write a node with 7,641,662 bytes: Ok, indeed, from http://hadoop.apache.org/zookeeper/docs/r3.0.1/zookeeperAdmin.html#sc_configuration I see: jute.maxbuffer: (Java system property:* jute.maxbuffer*) This option can only be set as a Java system property. There is no zookeeper prefix on it. It specifies the maximum size of the data that can be stored in a znode. The default is 0xfffff, or just under 1M. If this option is changed, the system property must be set on all servers and clients otherwise problems will arise. This is really a sanity check. ZooKeeper is designed to store data on the order of kilobytes in size. A more helpful exception would be nice :) Anybody have any experience popping this up a bit bigger? What kind of bad things happen? Thanks, Eric
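One way to apply the advice above is a pre-flight check against the default jute.maxbuffer limit before writing, and to fall back to storing only a token/pointer in ZK when the payload is too big. A sketch (class and method names are mine; the constant reflects the documented default):

```java
public class ZnodeSizeCheck {
    // Documented default for jute.maxbuffer: 0xfffff bytes, just under
    // 1 MB. Larger data is rejected unless the system property is raised
    // on every server AND client.
    static final int DEFAULT_JUTE_MAXBUFFER = 0xfffff;

    // Pre-flight check before zk.create()/zk.setData(). When it returns
    // false, store the blob elsewhere (filesystem, S3, ...) and keep only
    // a pointer (e.g. a URL string) in the znode.
    static boolean fitsInZnode(byte[] data) {
        return data != null && data.length <= DEFAULT_JUTE_MAXBUFFER;
    }
}
```

Eric's 7,641,662-byte node fails this check, which matches the ConnectionLoss he observed.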
Re: NodeChildrenChanged WatchedEvent
Javier, also note that the subsequent getChildren you mention in your original email is usually not entirely superfluous given that you generally want to watch the parent node for further changes, and a getChildren is required to set that watch. Patrick Benjamin Reed wrote: i'm adding a faq on this right now. it's a rather common request. we could put in the name of the node that is changing. indeed, we did in the first cut of zookeeper, but then we found that every instance of programs that used this resulted in bugs, so we removed it. here is the problem: you do a getChildren(), an event comes in that foo is deleted, and right afterwards goo gets deleted, but you aren't going to get that event since the previous delete fired and you haven't done another getChildren(). this almost always results in an error, so much so that we don't even give people the rope. ben Javier Vegas wrote: Hi, I am starting to implement Zookeeper as an arbiter for a high performance client-server service, it is working really well but I have a question. When my Watcher receives an event of NodeChildrenChanged event, is there any way of getting from the event the path for the child that changed? The WatchedEvent javadoc says that it includes exactly what happened but all I am able to extract is a vague NodeChildrenChanged type. What I am doing now to figure out the path of the new child is to do a new getChildren and compare the new children list with the old children list, but that seems a waste of time and bandwidth if my node has lots of children and is watched by a lot of zookeepers (which will be in prod). If I can somehow get the path of the added/deleted child from the WatchedEvent, it will make my life easier and my Zookeeper-powered system much more simple, robust and scalable. Any suggestions? Thanks, Javier Vegas
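If you do need to know which children changed, the usual pattern is exactly what Javier describes: on each NodeChildrenChanged event, call getChildren again (which also re-registers the watch) and diff the new list against the previous snapshot. A sketch of the diff step (class and method names are mine, not from the thread):

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ChildDiff {
    // Children present in the new snapshot but not the old: created nodes.
    static Set<String> added(List<String> before, List<String> after) {
        Set<String> diff = new HashSet<>(after);
        diff.removeAll(before);
        return diff;
    }

    // Children present in the old snapshot but not the new: deleted nodes.
    static Set<String> removed(List<String> before, List<String> after) {
        return added(after, before);
    }
}
```

Note this only tells you the net difference between the two snapshots; as Ben explains, events that fire between your getChildren calls are coalesced, so individual intermediate changes are not recoverable.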
Re: Some one send me some demo of programming with C client API for Zookeeper
What specific questions re the docs did you have? It will give us some insight wrt prioritization. Please enter JIRAs if you have specific areas of interest. Patrick Qian Ye wrote: Thanks, btw, there are a lot of '[tbd]' in the zookeeperProgrammers, could you please fill in the blanks some time? It would really help me much, 3x~ On Fri, Apr 17, 2009 at 1:20 AM, Patrick Hunt ph...@apache.org wrote: You can generate the doxygen C API docs using make doxygen-doc (see the README). Mahadev Konar wrote: Please take a look at src/c/src/cli.c for some examples on zookeeper c client usage. Also you can see the test cases. Also http://hadoop.apache.org/zookeeper/docs/r3.1.1/zookeeperProgrammers.html Will give you some example code for c clients. mahadev On 4/16/09 2:30 AM, Qian Ye yeqian@gmail.com wrote: Hi all: I'm a fresh man to Zookeeper. Finding that the documents at zookeeper.hadoop.apache.org are mostly about Java client API. However, I want some c client code to get started. Anyone could help me?
Re: problems on EC2?
Take a look at this section to start: http://hadoop.apache.org/zookeeper/docs/current/zookeeperAdmin.html#sc_commonProblems What type of monitoring are you doing on your cluster? You could monitor at both the host and at the java (jmx) level. That will give you some insight on where to look; cpu, memory, disk, network, etc... Also the ZooKeeper JMX will give you information about latencies and such (you can even use the four letter words for that if you want to hack up some scripts instead of using jmx). JMX will also give you insight into the JVM workings - so for example you could confirm/rule out the scenario outlined by Nitay (gc causing the jvm java threads to hang for 30sec at a time, including the ZK heartbeat). I've seen similar to what you describe a few times now, in each case it was something different. In one case for example there was a cluster of 5k clients attaching to a ZK cluster, ~20% of the clients had mis-configured nics, that was causing high tcp packet loss (and therefore high network latency), which caused a similar situation to what you are seeing, but only under fairly high network load (which made it hard to track down!). I've also seen situations where ppl run the entire zk cluster on a set of VMWare vms, all on the same host system. Latency on this configuration was > 10sec in some cases due to resource issues (in particular io - see the link I provided above, dedicated log devices are critical to low latency operation of the ZK cluster). In your scenario I think a 5 sec timeout is too low, probably much too low. Why? You are running in virtualized environments on non-dedicated hardware outside your control/inspection. There is typically no way to tell (unless you are running on the 8 core ec2 systems) if the ec2 host you are running on is over/under subscribed (other vms). There is no way to control disk latency either. You could be seeing large latencies due to resource contention on the ec2 host alone. 
In addition to that I've heard that network latencies in ec2 are high relative to what you would see if you were running on your own dedicated environment. It's hard to tell the latency btw the servers and client-server w/in the ec2 environment you are seeing w/out measuring it. Keep in mind that the timeout period is used by both the client and the server. If the ZK leader doesn't hear from the client w/in the timeout (say it's 5 sec) it will expire the session. The client is sending a ping after 1/3 of the timeout period. It expects to hear a response before another 1/3 of the timeout elapses, after which it will attempt to re-sync to another server in the cluster. In the 5 sec timeout case you are allowing about 1.7 seconds (1/3 of 5) for the request to go to the server, the server to respond back to the client, and the client to process the response. Check the latencies in ZK's JMX as I suggested to the hbase team in order to get insight into this (i.e. if the server latency is high, say because of io issues, or jvm swapping, vm latency, etc... that will cause the client/sessions to timeout) Hope this helps. Patrick Mahadev Konar wrote: Hi Ted, These problems seem to manifest around getting lots of anomalous disconnects and session expirations even though we have the timeout values set to 2 seconds on the server side and 5 seconds on the client side. Your scenario might be a little different from what Nitay (Hbase) is seeing. In their scenario the zookeeper client was not able to send out pings to the server due to gc stalling threads in their zookeeper application process. The latencies in zookeeper clients are directly related to Zookeeper server machines. It is very much dependent on the disk io latencies that you would get on the zookeeper servers and network latencies with your cluster. I am not sure how much sensitive you want your zookeeper application to be -- but increasing the timeout should help. Also, we recommend using dedicated disk for zookeeper log transactions. 
http://hadoop.apache.org/zookeeper/docs/r3.1.1/zookeeperAdmin.html#sc_strengthsAndLimitations Also, we have seen NTP having problems and clocks going back on one of our vm setups. This would lead to sessions getting timed out earlier than the set session timeout. I hope this helps. mahadev On 4/14/09 5:48 PM, Ted Dunning ted.dunn...@gmail.com wrote: We have been using EC2 as a substrate for our search cluster with zookeeper as our coordination layer and have been seeing some strange problems. These problems seem to manifest around getting lots of anomalous disconnects and session expirations even though we have the timeout values set to 2 seconds on the server side and 5 seconds on the client side. Has anybody else been seeing this? Is this related to clock jumps in a virtualized setting? On a related note, what is best practice for handling session expiration? Just deal with it as if it is a new start?
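The one-third arithmetic Patrick describes can be sketched as follows: the client pings after 1/3 of the session timeout, then allows another 1/3 for the round trip before hunting for a different server. So a 5 second timeout leaves only about 1.7 seconds of round-trip budget (class and method names are mine, not ZooKeeper client internals):

```java
public class SessionTiming {
    // Interval after which the client sends a heartbeat ping:
    // one third of the negotiated session timeout.
    static int pingIntervalMs(int sessionTimeoutMs) {
        return sessionTimeoutMs / 3;
    }

    // Budget for the ping round trip; if no response arrives within
    // another third, the client tries to re-sync to a different server
    // before the leader-side expiry fires.
    static int roundTripBudgetMs(int sessionTimeoutMs) {
        return sessionTimeoutMs / 3;
    }
}
```

With a 5000 ms timeout this gives a 1666 ms ping interval and a 1666 ms response budget, which is why a 5 second timeout is tight on a high-latency virtualized network.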
Re: problems on EC2?
Well that's good - 300ms max latency means that the server can round trip any requests pretty quickly. It would lead me to look at the client VMs or (intermittent) network problems... Keep in mind though that's one of your servers (unless you are saying you checked all X of the servers in the cluster and that was the overall max?). You may discover one server that has issues while the other servers are fine. In which case only clients connected to the bad server(s) will experience problems. (and since clients can jump between servers that might be contributing to the randomness in observed occurrence) Good luck and keep us posted. EC2 is very interesting, I'd like to learn more about the operating environment and in particular the issues involved with running ZK there. Patrick Ted Dunning wrote: Patrick, Thanks enormously. This hasn't helped yet, but that is just because it was a very large bite of the apple. Once I digest it, I can tell that it will be very helpful. I did have a chance to look at the stat output and maximum latency was 300ms. How that connects with what you are saying isn't clear yet, but I can see how that might not be diagnostic of whether the server side timeout is sufficiently long. Thanks again. On Thu, Apr 16, 2009 at 10:57 AM, Patrick Hunt ph...@apache.org wrote: lots of stuff about monitoring ... jmx ... packet loss ... vm latencies ... timeout details. ... Hope this helps. Patrick
Re: starting replicated ZK server
Jun Rao wrote: From the ZK web site, it's not clear how to set up a multi-node ZK service. It seems that one has to add the server entries in the conf file and create myid files on each node. Then, how should I start the ZK nodes? I tried issuing zkServer start from each node and that didn't seem to work. Is that the right way to start the service? Also, once the service is up, is there a way to check the list of nodes used by ZK and which node is the leader? Thanks, See this for a step-by-step guide: http://hadoop.apache.org/zookeeper/docs/current/zookeeperAdmin.html#sc_zkMulitServerSetup In particular steps 4 & 5. your config should look something like (for 3 server ensemble): tickTime=2000 dataDir=/path/to/data/dir clientPort=2181 initLimit=5 syncLimit=2 server.1=host1.foo.com:2888:2889 server.2=host2.foo.com:2888:2889 server.3=host3.foo.com:2888:2889 this file needs to exist (available to) each of the servers in the ensemble. Try the command line listed in step 5 to start the server (conf in the classpath is there to pickup the log4j configuration, you could use that to debug). The myid file needs to exist in each server's dataDir (not shared of course). Either JMX, the logs, or the stat command documented here: http://hadoop.apache.org/zookeeper/docs/current/zookeeperAdmin.html#sc_zkCommands will give insight on things like which server is the leader. The stat command is the simplest (nc to the clientPort of the server and issue the stat command). Patrick
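The myid step above is easy to script. A sketch of writing each server's id file into its (non-shared) dataDir, where the id must match that server's server.N line in the shared zoo.cfg (class and method names are mine):

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class MyidWriter {
    // Each server's dataDir must contain a file named "myid" holding
    // only that server's numeric id. The dataDir itself is per-server;
    // only zoo.cfg is identical across the ensemble.
    static void writeMyid(Path dataDir, int serverId) throws IOException {
        Files.createDirectories(dataDir);
        Files.write(dataDir.resolve("myid"),
                Integer.toString(serverId).getBytes(StandardCharsets.UTF_8));
    }
}
```

So for the 3-server example above you would run this once per host: id 1 on host1.foo.com, id 2 on host2.foo.com, id 3 on host3.foo.com.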
Re: ZooKeeper Perl module
Hey Chris this is really great! Thanks for making it available to the community, very cool. Patrick Chris Darroch wrote: Hi -- The http://wiki.apache.org/hadoop/ZooKeeper page includes the comment that someday we hope to get Python, Perl, and REST interfaces. I hope I can help with one item from that list now, at least. I recently put together a Perl module named Net::ZooKeeper which is now available on CPAN: http://cpan.org/modules/by-category/05_Networking_Devices_IPC/Net/Net-ZooKeeper-0.32.tar.gz http://search.cpan.org/~cdarroch/Net-ZooKeeper-0.32/ZooKeeper.pm Modelled on the DBI module, it provides an interface to ZooKeeper through the synchronous C API functions, e.g.: my $zkh = Net::ZooKeeper->new('localhost:7000'); my $ret = $zkh->set('/foo', 'baz'); Net::ZooKeeper currently requires ZooKeeper 3.1.1 (or at least that version of the C API code) and Perl 5.8.8 or up, including 5.10.x. The test suite is reasonably complete, I think, and covers a fair bit of ground. I've found it useful for testing the ZooKeeper C API as well as learning more than I wanted to know about XS programming. I've licensed the module under the Apache license 2.0 so it should be compatible with ZooKeeper itself if there's interest in including it under src/contribs. For those who ask why Perl 5 and not Rakudo/Ruby/Lua/Python/ [insert cool new dynamic language here], the answer is just that I needed an old-style Perl module first. (As a thought experiment, though, I wonder if one could write a Parrot extension that communicated directly with ZooKeeper, handled the ping requests internally via a Parrot scheduler/thread/whatever, and didn't need the C API at all. You could support any language running on Parrot with that. Well, maybe in a few years, anyway. :-) In the meantime, please report any suggestions or bugs to me -- thanks! Chris.
[ANNOUNCE] Apache ZooKeeper 3.1.1
The Apache ZooKeeper team is proud to announce Apache ZooKeeper version 3.1.1. ZooKeeper is a high-performance coordination service for distributed applications. It exposes common services - such as naming, configuration management, synchronization, and group services - in a simple interface so you don't have to write them from scratch. You can use it off-the-shelf to implement consensus, group management, leader election, and presence protocols. And you can build on it for your own, specific needs. If you are upgrading from version 2.2.1 on SourceForge be sure to review the 3.0.1 release notes for migration instructions. For ZooKeeper release details and downloads, visit: http://hadoop.apache.org/zookeeper/releases.html ZooKeeper 3.1.1 Release Notes are at: http://hadoop.apache.org/zookeeper/docs/r3.1.1/releasenotes.html Regards, The ZooKeeper Team
Re: Semantics of ConnectionLoss exception
Mahadev Konar wrote: Hi Nitay, - Does this event happening mean my ephemeral nodes will go away? No. The client will try connecting to other servers; if it's not able to reconnect within the remaining session timeout, the session will expire and you will get a session expired event. Note that the client won't get the session expired event until it's able to connect to a server. So what might happen is that the client loses connection to the server, the server eventually expires the client and deletes ephemerals (notifying all watchers), but the client won't see the session expiration until it is able to reconnect to one of the servers. I.e. the client doesn't know it's been expired until it's able to reconnect to the cluster, at which point it's notified that it's been expired. http://hadoop.apache.org/zookeeper/docs/r3.0.1/zookeeperProgrammers.html has this information scattered around, but we should put it in the FAQ specifically. 3.0.1 is a bit old, try this for the latest docs: http://hadoop.apache.org/zookeeper/docs/current/zookeeperProgrammers.html - Is the ZooKeeper handle I'm using dead after this event? Again no. Your handle is valid until you get a session expiry event or you do a zoo_close on your handle. Thanks mahadev On 3/25/09 5:42 PM, Nitay nit...@gmail.com wrote: I'm a little unclear about the ConnectionLoss exception as it's described in the FAQ and would like some clarification. From the state diagram, http://wiki.apache.org/hadoop/ZooKeeper/FAQ#1, there are three events that cause a ConnectionLoss: 1) In Connecting state, call close(). 2) In Connected state, call close(). 3) In Connected state, get disconnected. It's the third one I'm unclear about. - Does this event happening mean my ephemeral nodes will go away? - Is the ZooKeeper handle I'm using dead after this event? 
Meaning that, similar to the SessionExpired case, I need to construct a new connection handle to ZooKeeper and take care of the restarting myself. It seems from the diagram that this should not be the case. Rather, seeing as the disconnected event sends the user back to the Connecting state, my handle should be fine and the library will keep trying to reconnect to ZooKeeper internally? I understand my current operation may have failed, what I'm asking about is future operations. Thanks, -n
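To summarize the thread: a Disconnected event leaves the handle usable (the library keeps reconnecting internally), while Expired means the handle is dead and a new one must be built. A minimal sketch of that decision, using stand-in enums rather than the real org.apache.zookeeper classes (names here are illustrative only):

```java
// Sketch only: KeeperState/Action are hypothetical stand-ins, not the
// real ZooKeeper client API.
class SessionEventPolicy {
    enum KeeperState { SYNC_CONNECTED, DISCONNECTED, EXPIRED }
    enum Action { NONE, RETRY_PENDING_OPS, REBUILD_HANDLE }

    // Decide what client code must do when a session event arrives.
    static Action onEvent(KeeperState state) {
        switch (state) {
            case DISCONNECTED:
                // Handle still valid; the library reconnects on its own.
                // In-flight ops may have failed with ConnectionLoss and
                // can be retried once reconnected.
                return Action.RETRY_PENDING_OPS;
            case EXPIRED:
                // Ephemerals are gone; a new session (new handle) is needed.
                return Action.REBUILD_HANDLE;
            default:
                return Action.NONE;
        }
    }
}
```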
Re: Contrib section (nee Re: A modest proposal for simplifying zookeeper :)
Hi Anthony. We have a contrib in the current release, it's under src. I'm not sure I understand, what is contrib section referring to? Or do you mean client recipe implementations? (like ZOOKEEPER-78, which is being worked on for 3.2) Patrick Anthony Urso wrote: So does this mean no contrib section? On Thu, Feb 26, 2009 at 10:00 PM, Patrick Hunt ph...@apache.org wrote: So far we've stayed with the process used by core as this minimizes the amount of work we need to do re process/build/release, etc... we just copy the process/build/release etc... used in core, we get all that for free. I'm hesitant to diverge as this will increase the amount of work we need to do. Core has moved to Ivy, we may move to that at some point, but currently we're focused on adding functionality, fixing bugs -- not changing build. Patrick Anthony Urso wrote: Speaking of the contrib section, what is the status of ZOOKEEPER-103? Is it ready to be reevaluated now that 3.0 is out? Cheers, Anthony On Fri, Jan 9, 2009 at 11:58 AM, Mahadev Konar maha...@yahoo-inc.com wrote: Hi Kevin, It would be great to have such high level interfaces. It could be something that you could contribute :) . We haven't had the bandwidth to provide such interfaces for zookeeper. It would be great to have all such recipes as a part of contrib package of zookeeper. mahadev On 1/9/09 11:44 AM, Kevin Burton bur...@spinn3r.com wrote: OK so it sounds from the group that there are still reasons to provide rope in ZK to enable algorithms like leader election. Couldn't ZK ship higher level interfaces for leader election, mutexes, semaphores, queues, barriers, etc instead of pushing this on developers? Then the remaining APIs, configuration, event notification, and discovery, can be used on a simpler, rope free API. The rope is what's killing me now :) Kevin
Re: Contrib section (nee Re: A modest proposal for simplifying zookeeper :)
Ben, you might want to look at buildr, it recently graduated from the apache incubator: http://buildr.apache.org/ Buildr is a build system for Java applications. We wanted something that’s simple and intuitive to use, so we only need to tell it what to do, and it takes care of the rest. But also something we can easily extend for those one-off tasks, with a language that’s a joy to use. And of course, we wanted it to be fast, reliable and have outstanding dependency management. Also Ivy just released version 2.0. If you have a specific idea and would like to start working on this please create a JIRA to discuss/track/vote/etc... Be aware that the contribution process, release process and other documentation would have to be updated as part of this. For example if we want to push jars to an artifact repo the artifacts/pom/etc... would have to be voted on as part of the release process. Patrick Benjamin Reed wrote: i'm ready to reevaluate it. i did the contrib for fatjar and it was extremely painful! (and that was an extremely simple contrib!) we really want to ramp up the contribs and get a bunch of recipe implementations in, so we need something that makes it really easy. i'm not a fan of maven (they seem to have chosen a convention that is convenient for the build tool rather than the developer), but it is widely used and we need something better, so i'm certainly considering it. ben Anthony Urso wrote: Speaking of the contrib section, what is the status of ZOOKEEPER-103? Is it ready to be reevaluated now that 3.0 is out? Cheers, Anthony On Fri, Jan 9, 2009 at 11:58 AM, Mahadev Konar maha...@yahoo-inc.com wrote: Hi Kevin, It would be great to have such high level interfaces. It could be something that you could contribute :) . We haven't had the bandwidth to provide such interfaces for zookeeper. It would be great to have all such recipes as a part of contrib package of zookeeper. 
mahadev On 1/9/09 11:44 AM, Kevin Burton bur...@spinn3r.com wrote: OK so it sounds from the group that there are still reasons to provide rope in ZK to enable algorithms like leader election. Couldn't ZK ship higher level interfaces for leader election, mutexes, semaphores, queues, barriers, etc instead of pushing this on developers? Then the remaining APIs, configuration, event notification, and discovery, can be used on a simpler, rope free API. The rope is what's killing me now :) Kevin
Re: Adding a server to a running ensemble?
We do have an open issue to do this more on the fly without having to do the bounce, but it is behind other priorities in the work queue. This is the JIRA: https://issues.apache.org/jira/browse/ZOOKEEPER-107 in case someone would like to work on this. http://wiki.apache.org/hadoop/ZooKeeper/HowToContribute This is a significant feature which would require design work before implementation starts. Please move discussion to the dev list (actually documenting/discussing on the JIRA itself would be great). Patrick Benjamin Reed wrote: ben Chad Harrington wrote: We are investigating Ensemble and a key question came up: How does one add a server to a running ensemble of Zookeeper servers in a 24/7 environment? If I have a 3-server ensemble and traffic grows to the point where I need another 2 servers, how do I add them without shutting everything down and restarting? Thanks for your help, Chad Harrington CEO DataScaler, Inc. charring...@datascaler.com 201A Ravendale Dr. Mountain View, CA 94043 Phone: 650-515-3437 Fax: 650-887-1544
Re: Recommended session timeout
Those are very interesting results, good job sleuthing. You might try the concurrent collector? http://java.sun.com/javase/technologies/hotspot/gc/gc_tuning_6.html#available_collectors.selecting specifically item 4 -XX:+UseConcMarkSweepGC I've never used this before myself but it's supposed to reduce the gc pauses to less than a second. Might require some tuning though... Patrick Joey Echeverria wrote: I've answered the questions you asked previously below, but I thought I would open with the actual culprit now that we found it. When I said loading data before, what I was talking about was sending data via Thrift to the machine that was getting disconnected from zookeeper. This turned out to be the problem. Too much data was being sent in a short span of time and this caused memory pressure on the heap. This increased the fraction of the time that the GC had to run to keep up. During a 143 second test, the GC was running for 33 seconds. We found this by running tcpdump on both the machine running the ensemble server and the machine connecting to zookeeper as a client. We deduced it wasn't a network (lost packet) issue, as we never saw unmatched packets in our tests. What we did see were long 2-7 second pauses with no packets being sent. We first attempted to up the priority of the zookeeper threads to see if that would help. When it didn't, we started monitoring the GC time. We don't have a workaround yet, other than sending data in smaller batches and using a longer sessionTimeout. Thanks for all your help! -Joey As an experiment try increasing the timeout to say 30 seconds and re-run your tests. Any change? 30 seconds and higher works fine. loading data - could you explain a bit more about what you mean by this? 
If you are able to provide enough information for us to replicate we could try it out (also provide info on your ensemble configuration as Mahadev suggested) The ensemble config file looks as follows: tickTime=2000 dataDir=/data/zk clientPort=2181 initLimit=5 syncLimit=2 skipACL=true server.1=server1:2888:3888 ... server.7=server7:2888:3888 You are referring to startConnect in SendThread? We randomly sleep up to 1 second to ensure that the clients don't all storm the server(s) after a bounce. That makes some sense, but it might be worth tweaking that parameter based on sessionTimeout since 1 second can easily be 10-20% of sessionTimeout. 1) configure your test client to connect to 1 server in the ensemble 2) run the srst command on that server 3) run your client test 4) run the stat command on that server 5) if the test takes some time, run the stat a few times during the test to get more data points The problem doesn't appear to be on the server end as max latency never went above 5ms. Also, no messages are shown as queued.
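Joey's suggestion above - scaling the random reconnect delay to the session timeout instead of a fixed 1-second cap - could be sketched like this. This is purely illustrative policy code, not the shipped client (which hardcodes the roughly 1-second cap in SendThread):

```java
// Hypothetical jitter policy: cap the random pre-reconnect sleep at 10%
// of the session timeout, never above the current 1-second ceiling.
class ReconnectJitter {
    static int maxJitterMs(int sessionTimeoutMs) {
        return Math.min(1000, Math.max(1, sessionTimeoutMs / 10));
    }

    // Draw the actual sleep before attempting to reconnect.
    static long nextJitter(int sessionTimeoutMs) {
        return (long) (Math.random() * maxJitterMs(sessionTimeoutMs));
    }
}
```

With a 5000ms session timeout this caps the jitter at 500ms instead of 1000ms, so the reconnect delay never eats 20% of the timeout.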
Re: Contrib section (nee Re: A modest proposal for simplifying zookeeper :)
So far we've stayed with the process used by core as this minimizes the amount of work we need to do re process/build/release, etc... we just copy the process/build/release etc... used in core, we get all that for free. I'm hesitant to diverge as this will increase the amount of work we need to do. Core has moved to Ivy, we may move to that at some point, but currently we're focused on adding functionality, fixing bugs -- not changing build. Patrick Anthony Urso wrote: Speaking of the contrib section, what is the status of ZOOKEEPER-103? Is it ready to be reevaluated now that 3.0 is out? Cheers, Anthony On Fri, Jan 9, 2009 at 11:58 AM, Mahadev Konar maha...@yahoo-inc.com wrote: Hi Kevin, It would be great to have such high level interfaces. It could be something that you could contribute :) . We haven't had the bandwidth to provide such interfaces for zookeeper. It would be great to have all such recipes as a part of contrib package of zookeeper. mahadev On 1/9/09 11:44 AM, Kevin Burton bur...@spinn3r.com wrote: OK so it sounds from the group that there are still reasons to provide rope in ZK to enable algorithms like leader election. Couldn't ZK ship higher level interfaces for leader election, mutexes, semaphores, queues, barriers, etc instead of pushing this on developers? Then the remaining APIs, configuration, event notification, and discovery, can be used on a simpler, rope free API. The rope is what's killing me now :) Kevin
Re: Recommended session timeout
The latest docs (3.1.0 has some updates to that section) can be found here: http://hadoop.apache.org/zookeeper/docs/r3.1.0/zookeeperProgrammers.html#ch_zkSessions Patrick Mahadev Konar wrote: Hi Joey, here is a link to information on session timeouts. http://hadoop.apache.org/zookeeper/docs/r3.0.1/zookeeperProgrammers.html#ch_zkSessions The session timeout depends on how sensitive you want your application to be. A very low session timeout (1-2 seconds) might lead to your application being very sensitive to events like minor network problems etc., while a higher value of say 30 seconds might lead to slow detection of client failures -- for example, if a zookeeper client which has an ephemeral node goes down, the ephemeral node will only go away after the session timeout. I have seen some users using 10-15 seconds of session timeout, but you should choose as per your application requirements. Hope this helps. mahadev On 2/22/09 3:09 AM, Joey Echeverria joe...@gmail.com wrote: Is there a recommended session timeout? Does it change based on the ensemble size? Thanks, -Joey
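One detail worth keeping in mind when picking a value: the server negotiates the session timeout, constraining it (per the docs for these releases) to between 2 and 20 times the tickTime. A quick sketch of that clamping rule, so you can see what a requested timeout actually becomes with the common tickTime=2000:

```java
// Sketch of the documented negotiation rule: the server clamps the
// requested session timeout to [2 * tickTime, 20 * tickTime].
class SessionTimeoutBounds {
    static int negotiate(int requestedMs, int tickTimeMs) {
        int min = 2 * tickTimeMs;
        int max = 20 * tickTimeMs;
        return Math.max(min, Math.min(max, requestedMs));
    }
}
```

So with tickTime=2000, a requested 1-second timeout is silently raised to 4 seconds, and anything above 40 seconds is lowered to 40.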
What are you using ZooKeeper for?
If you are using ZK and can publicly share this information please update the wiki PoweredBy page: http://wiki.apache.org/hadoop/ZooKeeper/PoweredBy Patrick
Re: Watcher guarantees
Tom White wrote: If a client sets a watcher on a znode by doing a getData operation, is it guaranteed to get the next change after the value it read, or can a change be missed? In other words, if the value it read had zxid z1 and the next update of the znode has zxid z2, will the watcher always get the event for the change z2? Yes, that is a strong guarantee. See the following: http://hadoop.apache.org/zookeeper/docs/r3.0.1/zookeeperProgrammers.html#sc_WatchGuarantees Patrick
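The guarantee covers the next change after the read, but remember that watches are one-time triggers: after the event for z2 fires you must re-register (e.g. with another getData) to see z3. A toy model of that one-shot behavior (not the real client, just the semantics):

```java
// Toy model of ZooKeeper's one-shot watch semantics; not the real client.
class OneShotWatch {
    private boolean watchSet = false;
    private int eventsDelivered = 0;
    private long lastEventZxid = -1;

    // Simulate getData(..., watch=true): read and arm a one-time watch.
    void getDataWithWatch() { watchSet = true; }

    // Simulate an update committed with zxid z: an armed watch fires once.
    void update(long zxid) {
        if (watchSet) {
            eventsDelivered++;      // the event for this change is delivered...
            lastEventZxid = zxid;
            watchSet = false;       // ...and the watch is consumed
        }
    }

    int eventsDelivered() { return eventsDelivered; }
    long lastEventZxid() { return lastEventZxid; }
}
```

A watch armed before z2 sees z2, but the later z3 produces no event until the client re-registers.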
[ANNOUNCE] Apache ZooKeeper 3.1.0
The Apache ZooKeeper team is proud to announce Apache ZooKeeper version 3.1.0. ZooKeeper is a high-performance coordination service for distributed applications. It exposes common services - such as naming, configuration management, synchronization, and group services - in a simple interface so you don't have to write them from scratch. You can use it off-the-shelf to implement consensus, group management, leader election, and presence protocols. And you can build on it for your own, specific needs. Key features of the 3.1.0 release: * Quota support * BookKeeper - a system to reliably log streams of records * JMX for server management * many fixes, improvements, improved documentation, etc... A bit about BookKeeper: a system to reliably log streams of records. In BookKeeper, servers are bookies, log streams are ledgers, and each unit of a log (aka record) is a ledger entry. BookKeeper is designed to be reliable; bookies, the servers that store ledgers, can be byzantine, which means that some subset of the bookies can fail, corrupt data, or discard data, but as long as there are enough correctly behaving servers the service as a whole behaves correctly; the metadata for BookKeeper is stored in ZooKeeper. For ZooKeeper release details and downloads, visit: http://hadoop.apache.org/zookeeper/releases.html ZooKeeper 3.1.0 Release Notes are at: http://hadoop.apache.org/zookeeper/docs/r3.1.0/releasenotes.html Regards, The ZooKeeper Team
Re: Dealing with session expired
Ephemerals and watches are maintained across disconnect/reconnect between the client and server; however, session expiration (or closing the session explicitly) will trigger deletion of ephemeral nodes associated with the session. Right - once the session is expired the id is invalid. You need to create a new session (new id). Btw, the timeout value you provide when constructing the zookeeper client session directly affects the session expiration - the server uses this timeout as the session expiration time. Patrick Tom Nichols wrote: So if a session expires, my ephemeral nodes and watches have already disappeared? I suppose creating a new ZK instance with the old session ID would not do me any good in that case. Correct? Thanks. -Tom On Thu, Feb 12, 2009 at 2:12 PM, Mahadev Konar maha...@yahoo-inc.com wrote: Hi Tom, We prefer to discard the zookeeper instance if a session expires. Maintaining a one to one relationship between a client handle and a session makes it much simpler for users to understand the existence and disappearance of ephemeral nodes and watches created by a zookeeper client. thanks mahadev On 2/12/09 10:58 AM, Tom Nichols tmnich...@gmail.com wrote: I've come across the situation where a ZK instance will have an expired connection and therefore all operations fail. Now AFAIK the only way to recover is to create a new ZK instance with the old session ID, correct? Now, my problem is, the ZK instance may be shared -- not between threads -- but maybe two classes in the same thread synchronize on different nodes by using different watchers. So it makes sense that one ZK client instance can handle this. Except that even if I detect the session expiration by catching the KeeperException, if I want to resume the session, I have to create a new ZK instance and pass it to any classes who were previously sharing the same instance. Does this make sense so far? 
Anyway, bottom line is, it would be nice if a ZK instance could itself recover a session rather than discarding that instance and creating a new one. Thoughts? Thanks in advance, -Tom
Re: Dealing with session expired
Regardless of frequency Tom's code still has to handle this situation. I would suggest that the two classes Tom is referring to in his mail, the ones that use the ZK client object, should either be able to reinitialize with a new zk session, or they themselves should be discarded and new instances created using the new session (not sure what makes more sense for his archi...) Regardless of whether we reuse the session object or create a new one I believe the code using the session needs to reinitialize in some way -- there's been a dramatic break from the cluster. As I mentioned, you can decrease the likelihood of expiration by increasing the timeout - but the downside is that you are less sensitive to clients dying (because their ephemeral nodes don't get deleted till close/expire and if you are doing something like leader election among your clients it will take longer for the followers to be notified). Patrick Mahadev Konar wrote: Hi Tom, The session expired event means that the server expired the client, and that means the watches and ephemerals will go away for that node. How are you running your zookeeper quorum? A session expiry event should be a really rare event. If you have a quorum of servers it should rarely happen. mahadev On 2/12/09 11:17 AM, Tom Nichols tmnich...@gmail.com wrote: So if a session expires, my ephemeral nodes and watches have already disappeared? I suppose creating a new ZK instance with the old session ID would not do me any good in that case. Correct? Thanks. -Tom On Thu, Feb 12, 2009 at 2:12 PM, Mahadev Konar maha...@yahoo-inc.com wrote: Hi Tom, We prefer to discard the zookeeper instance if a session expires. Maintaining a one to one relationship between a client handle and a session makes it much simpler for users to understand the existence and disappearance of ephemeral nodes and watches created by a zookeeper client. 
thanks mahadev On 2/12/09 10:58 AM, Tom Nichols tmnich...@gmail.com wrote: I've come across the situation where a ZK instance will have an expired connection and therefore all operations fail. Now AFAIK the only way to recover is to create a new ZK instance with the old session ID, correct? Now, my problem is, the ZK instance may be shared -- not between threads -- but maybe two classes in the same thread synchronize on different nodes by using different watchers. So it makes sense that one ZK client instance can handle this. Except that even if I detect the session expiration by catching the KeeperException, if I want to resume the session, I have to create a new ZK instance and pass it to any classes who were previously sharing the same instance. Does this make sense so far? Anyway, bottom line is, it would be nice if a ZK instance could itself recover a session rather than discarding that instance and creating a new one. Thoughts? Thanks in advance, -Tom
Re: Dealing with session expired
Tom, you might try changing the log4j default log level to DEBUG for the rootlogger and appender if you have not already done so (servers and clients both). You'll get more information to aid debugging if it does occur again. http://hadoop.apache.org/zookeeper/docs/r3.0.1/zookeeperAdmin.html#sc_logging Also, are you seeing timeouts on the client, or just session expiration on the server? The stat command, detailed here, may also be of use to you: http://hadoop.apache.org/zookeeper/docs/r3.0.1/zookeeperAdmin.html#sc_zkCommands Knowing more about your env, OS and Java version in particular, would also help us help you narrow things down. :-) Patrick Tom Nichols wrote: On Thu, Feb 12, 2009 at 4:11 PM, Benjamin Reed br...@yahoo-inc.com wrote: idleness is not a problem. the client library sends heartbeats to keep the session alive. the client library will also handle reconnects automatically if a server dies. That's odd then that I'm seeing this problem. I have a local, 3-node zookeeper quorum, and I have 3 instances of the client also running on the same box. The session expiry doesn't seem to be in response to any severe load on the machine or anything like that. I'll keep an eye on it and see if I can't reproduce the behavior in a distributed environment. I've realized a relatively easy way to deal with this problem -- I can let my thread throw a fatal unchecked exception and then use a ThreadGroup implementation that catches the exception. This in turn spawns a new client thread and adds it back to the same threadGroup. Thanks again guys. -Tom since session expiration really is a rare catastrophic event. (or at least it should be.) it is probably easiest to deal with it by starting with a fresh instance if your session expires. ben From: Tom Nichols [tmnich...@gmail.com] Sent: Thursday, February 12, 2009 11:53 AM To: zookeeper-user@hadoop.apache.org Subject: Re: Dealing with session expired I'm using a timeout of 5000ms. 
Now let me ask this: Suppose all of my clients are waiting on some external event -- not ZooKeeper -- so they are all idle and are not touching ZK nodes, nor are they calling exists, getChildren, etc etc. Can that idleness cause session expiry? I'm running a local quorum of 3 nodes. That is, I have an Ant script that kicks off 3 java tasks in parallel to run ConsumerPeerMain, each with its own config file. Regarding handling of the failure, I suspect I will just have to reinitialize by creating a new instance of my client(s) that themselves will have a new ZK instance. I'm using Spring to wire everything together, which is why it's particularly difficult to simply re-create a new ZK instance and pass it to the classes using it (those classes have no knowledge of each other). But I _can_ just pull a freshly-created (prototype) instance from the Spring application context, which is where a new ZK client will be wired in. The only ramification there is I have to throw the KeeperException as a fatal exception rather than letting that client try to re-elect. Or maybe add in some logic to say if I can't re-elect, _then_ throw an exception and consider it fatal. Thanks guys. -Tom On Thu, Feb 12, 2009 at 2:39 PM, Patrick Hunt ph...@apache.org wrote: Regardless of frequency Tom's code still has to handle this situation. I would suggest that the two classes Tom is referring to in his mail, the ones that use ZK client object, should either be able to reinitialize with a new zk session, or they themselves should be discarded and new instances created using the new session (not sure what makes more sense for his archi...) Regardless of whether we reuse the session object or create a new one I believe the code using the session needs to reinitialize in some way -- there's been a dramatic break from the cluster. 
As I mentioned, you can decrease the likelihood of expiration by increasing the timeout - but the downside is that you are less sensitive to clients dying (because their ephemeral nodes don't get deleted till close/expire and if you are doing something like leader election among your clients it will take longer for the followers to be notified). Patrick Mahadev Konar wrote: Hi Tom, The session expired event means that the server expired the client, and that means the watches and ephemerals will go away for that node. How are you running your zookeeper quorum? A session expiry event should be a really rare event. If you have a quorum of servers it should rarely happen. mahadev On 2/12/09 11:17 AM, Tom Nichols tmnich...@gmail.com wrote: So if a session expires, my ephemeral nodes and watches have already disappeared? I suppose creating a new ZK instance with the old session ID would not do me any good in that case. Correct? Thanks. -Tom On Thu, Feb 12, 2009 at 2:12 PM, Mahadev Konar maha...@yahoo-inc.com wrote: Hi Tom, We prefer to discard the zookeeper instance if a session expires. Maintaining
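For reference, a minimal log4j.properties along the lines Patrick suggests above - DEBUG on the root logger with a console appender. The appender name and pattern here are illustrative; adjust them to match whatever is already in your conf/log4j.properties:

```
log4j.rootLogger=DEBUG, CONSOLE
log4j.appender.CONSOLE=org.apache.log4j.ConsoleAppender
log4j.appender.CONSOLE.layout=org.apache.log4j.PatternLayout
log4j.appender.CONSOLE.layout.ConversionPattern=%d{ISO8601} - %-5p [%t:%C{1}@%L] - %m%n
```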
Re: ZooKeeper 3.1 and C API/ABI
Chris, that's unfortunate re the version number (config.h), but I think I see why that is -- config.h should only really be visible in the implementation, not exposed through the includes. I've created a JIRA for this: https://issues.apache.org/jira/browse/ZOOKEEPER-293 We'll hold 3.1 for this JIRA, I'll create a new release candidate when the patch is ready. (hopefully today) Ben, Mahadev please be available to review/fasttrack this JIRA. Patrick Chris Darroch wrote: Hi -- Btw, the version is in the config.h file, generated by autotools, as VERSION. We don't break that out as individual parameters but we can if there is interest. That's useful, I'd missed that. Thanks; that should work for me for now. Sorry ... on second glance, I'll have to retract that. The problem here is that config.h doesn't get installed by make install, it's just used by the autoconf stuff. So there's no simple way that I'm aware of to check the C API version at compile time. For 3.1.0, I'd suggest either reverting ZOOKEEPER-255 until 4.0.0, or making sure there's at least a way of determining the API version using C macros. For example, I'd want to be able to do something like: #if ZOO_MAJOR_VERSION >= 3 && ZOO_MINOR_VERSION >= 1 zoo_set(..., stat); #else zoo_set(...); #endif Ideally, as I mentioned, until 4.0.0 the zoo_set() functionality would be moved to a zoo_stat_set() or zoo_set_exec() function, and zoo_set() would keep its existing definition but become just a wrapper that invoked the new function with a NULL stat argument. That would be the APR way, I think, of handling this situation. With the next major version the new function with the extra argument could be renamed back to zoo_set(). It's slightly ugly, I know, if you're thinking of this as a bug which needs to be fixed urgently. 
If you're not concerned about backward API compatibility, at a minimum I'd request externally visible macros in zookeeper.h for 3.1.0: #define ZOO_MAJOR_VERSION 3 #define ZOO_MINOR_VERSION 1 #define ZOO_PATCH_VERSION 0 Thanks, Chris.
Re: ZooKeeper 3.1 and C API/ABI
Chris, please take a look at the patch on 293 asap and let us know if you have any issues -- I will be spinning a new release once mahadev/ben review and commit the change. Patrick ps. I noticed you had some additional suggestions for the c code in JIRA, thanks. FYI we do accept contributions from anyone. ;-) Patrick Hunt wrote: Chris, that's unfortunate re the version number (config.h), but I think I see why that is -- config.h should only really be visible in the implementation, not exposed through the includes. I've created a JIRA for this: https://issues.apache.org/jira/browse/ZOOKEEPER-293 We'll hold 3.1 for this JIRA, I'll create a new release candidate when the patch is ready. (hopefully today) Ben, Mahadev please be available to review/fasttrack this JIRA. Patrick Chris Darroch wrote: Hi -- Btw, the version is in the config.h file, generated by autotools, as VERSION. We don't break that out as individual parameters but we can if there is interest. That's useful, I'd missed that. Thanks; that should work for me for now. Sorry ... on second glance, I'll have to retract that. The problem here is that config.h doesn't get installed by make install, it's just used by the autoconf stuff. So there's no simple way that I'm aware of to check the C API version at compile time. For 3.1.0, I'd suggest either reverting ZOOKEEPER-255 until 4.0.0, or making sure there's at least a way of determining the API version using C macros. For example, I'd want to be able to do something like: #if ZOO_MAJOR_VERSION >= 3 && ZOO_MINOR_VERSION >= 1 zoo_set(..., stat); #else zoo_set(...); #endif Ideally, as I mentioned, until 4.0.0 the zoo_set() functionality would be moved to a zoo_stat_set() or zoo_set_exec() function, and zoo_set() would keep its existing definition but become just a wrapper that invoked the new function with a NULL stat argument. That would be the APR way, I think, of handling this situation. 
With the next major version the new function with the extra argument could be renamed back to zoo_set(). It's slightly ugly, I know, if you're thinking of this as a bug which needs to be fixed urgently. If you're not concerned about backward API compatibility, at a minimum I'd request externally visible macros in zookeeper.h for 3.1.0: #define ZOO_MAJOR_VERSION 3 #define ZOO_MINOR_VERSION 1 #define ZOO_PATCH_VERSION 0 Thanks, Chris.
ZooKeeper 3.1 release process starting today.
All 3.1 issues have been resolved, I'll be starting the release process today, detailed here: http://wiki.apache.org/hadoop/ZooKeeper/HowToRelease If voting completes successfully, an official release should be available early/mid next week. You can follow more closely on the zookeeper-dev list. The full list of 67 JIRAs addressed in this release is available: https://issues.apache.org/jira/browse/ZOOKEEPER?report=com.atlassian.jira.plugin.system.project:roadmap-panel Patrick
Re: Delaying 3.1 release by 2 to 3 weeks?
Mahadev, can you complete quotas in 2 weeks? This includes completing the code itself, documentation, tests, and incorporating review feedback? Patrick Benjamin Reed wrote: we should delay. it would be good to try out quotas for a bit before we do the release. quotas are also a key part of the release. 3 weeks seems a little long though. ben From: Mahadev Konar [maha...@yahoo-inc.com] Sent: Thursday, January 15, 2009 4:32 PM To: zookeeper-...@hadoop.apache.org Cc: zookeeper-user@hadoop.apache.org Subject: Re: Delaying 3.1 release by 2 to 3 weeks? That was release 3.1 and not 3.2 :) mahadev On 1/15/09 4:26 PM, Mahadev Konar maha...@yahoo-inc.com wrote: Hi all, I needed to get quotas in zookeeper 3.2.0 and wanted to see if delaying the release by 2-3 weeks is ok with everyone? Here is the jira for it - http://issues.apache.org/jira/browse/ZOOKEEPER-231 Please respond if you have any issues with the delay. thanks mahadev
Re: Reconnecting to another host on failure but before session expires...
There's also been interest in having a chroot type capability as part of the connect string: host:port/app/abc,... where the client's session would be rooted at /app/abc rather than / This is very useful in multi-tenant situations (more than 1 app sharing a zk cluster). Patrick Benjamin Reed wrote: Using a string gives us some flexibility. There is an outstanding issue to be able to pass in a URL: http://aeoueu/oeueue, the idea being that we pull down the content to get the list of servers and ports. ben From: thomas.john...@sun.com [thomas.john...@sun.com] Sent: Wednesday, January 07, 2009 8:35 AM To: zookeeper-user@hadoop.apache.org Subject: Re: Reconnecting to another host on failure but before session expires... Kevin Burton wrote: Crazy, I don't know how I missed that... Wouldn't it be cleaner to specify this as a List<String> of host:port names? If API cleanup was being considered, my inclination would have been List<InetSocketAddress>.
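A rough sketch of how such a chrooted connect string could be split into the server list and the session root. This is a hypothetical helper for illustration, not the actual client parser:

```java
// Hypothetical parser for a chrooted connect string like
// "host1:2181,host2:2181/app/abc"; not the real ZooKeeper client code.
class ConnectString {
    final String hosts;   // "host1:port1,host2:port2"
    final String chroot;  // "/app/abc", or "" if the session is rooted at /

    ConnectString(String hosts, String chroot) {
        this.hosts = hosts;
        this.chroot = chroot;
    }

    // Everything from the first '/' onward is the chroot path; the rest
    // is the comma-separated host list.
    static ConnectString parse(String s) {
        int slash = s.indexOf('/');
        if (slash < 0) return new ConnectString(s, "");
        return new ConnectString(s.substring(0, slash), s.substring(slash));
    }
}
```

Every path the client then uses would be resolved relative to the chroot, which is what makes the multi-tenant case convenient: each app sees its own "/".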
Re: Simpler ZooKeeper event interface....
Kevin Burton wrote: 3) it's possible for your code to get notified of a change, but never process the change. This might happen if: a) a node changed watch fires b) your client code runs an async getData c) you are disconnected from the server Also, this seems very confusing... If I run an async request, the client should replay these if I'm reconnected to another host. (Ben/Flavio/Mahadev can correct me if I'm wrong here or missed some detail) Async operations are tricky as the server makes the change when it gets the request, not when the client processes the response. So you could request an async operation, which the server could process and respond to, immediately after which the client is disconnected from the server (before it can process the response). Client replay would not work in this case, and given that async is typically used for high-throughput situations there could be a number of operations affected. Patrick
Re: Simpler ZooKeeper event interface....
To say that it will never return is not correct. The client will be notified of connectionloss in the callback, however the client will not know if the operation was successful (from the point of view of the server) or not. Patrick Kevin Burton wrote: On Wed, Jan 7, 2009 at 11:12 AM, Mahadev Konar maha...@yahoo-inc.com wrote: You are right Pat. Replaying an async operation would involve a lot of state management for clients across servers and would involve a lot more work in determining which operation succeeded and which needs to be re-run, and the semantics of zookeeper client calls would be much harder to guarantee. OK. so maybe a sane middle ground would be to put a warning in the code that an async operation might never return. I think a generalization about ZK right now (at least based on my current perspective) is that it makes it too easy to run with scissors. ZK probably will work fine with all servers in an ensemble connected, but if one goes away you need to be VERY careful about how you code your app. Kevin
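The practical upshot for client code is that an async callback must treat connection loss as "outcome unknown" rather than "failed". A self-contained sketch of that classification follows; the int values mirror the ZooKeeper result codes as I understand them, but the helper and its names are my own, not library API:

```java
public class AsyncResult {
    static final int OK = 0;              // operation applied by the server
    static final int CONNECTIONLOSS = -4; // value of KeeperException.Code.CONNECTIONLOSS

    // On connection loss the server may or may not have applied the operation
    // before the client was cut off, so the only safe answer is "unknown" -
    // the caller must verify state (and retry only if the op is idempotent).
    static String outcome(int rc) {
        if (rc == OK) return "applied";
        if (rc == CONNECTIONLOSS) return "unknown";
        return "failed";
    }

    public static void main(String[] args) {
        System.out.println(outcome(CONNECTIONLOSS)); // unknown
    }
}
```

This is why a blanket "replay on reconnect" cannot be correct: replaying an operation that already succeeded server-side would apply it twice.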
Re: Does session expiration only happen during total ensemble failure or network split?
Mahadev Konar wrote: Why would you want the session to expire if all the servers are down (which should not happen unless you kill all the nodes or the datacenter is down)? A more likely case is that the client port on the switch dies and the client is partitioned from the servers... Patrick mahadev On 1/7/09 12:39 PM, Kevin Burton bur...@spinn3r.com wrote: The ZK ensemble leader expires the client session if it doesn't hear from the client w/in the timeout specified by the client when the session was established. A client will disconnect from a server in the ensemble and attempt to reconnect to another server in the ensemble if it doesn't hear from the server w/in 2/3 of the specified session timeout. OK... I got that part. The issue I'm running into now though is that my sessions aren't actually timing out when I shut down all servers in an ensemble. One solution/hack would be to record how long you've been disconnected and assume that your session has been expired. Kevin
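The "record how long you've been disconnected" hack mentioned above can be sketched as a small tracker driven from the client's connection events. All names here are illustrative, not ZooKeeper API:

```java
public class DisconnectTracker {
    private final long sessionTimeoutMs;
    private long disconnectedAt = -1; // -1 means currently connected

    DisconnectTracker(long sessionTimeoutMs) {
        this.sessionTimeoutMs = sessionTimeoutMs;
    }

    void onDisconnected(long nowMs) {
        if (disconnectedAt < 0) disconnectedAt = nowMs; // keep earliest disconnect time
    }

    void onConnected() {
        disconnectedAt = -1;
    }

    // Conservatively assume the session is gone once we have been disconnected
    // for the full session timeout - if the ensemble were reachable, the
    // leader would have expired the session by then.
    boolean assumeExpired(long nowMs) {
        return disconnectedAt >= 0 && nowMs - disconnectedAt >= sessionTimeoutMs;
    }

    public static void main(String[] args) {
        DisconnectTracker t = new DisconnectTracker(30000);
        t.onDisconnected(1000);
        System.out.println(t.assumeExpired(10000)); // false: only 9s elapsed
        System.out.println(t.assumeExpired(31000)); // true: full timeout elapsed
    }
}
```

The caveat is that this is a local guess: the real session may still be alive on the servers, so the application must be prepared for either outcome after reconnecting.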
Re: ouch, zookeeper infinite loop
Whatever we do, the changes should allow more than one marshaling format/version to co-exist, both for b/w compatibility and for enabling different serialization mechanisms (jute or pbuffer or thrift or etch, etc...) Patrick Mahadev Konar wrote: The version of Jute we use is really an ancient version of the recordio ser/deser library in hadoop. We do want to move to some better (versioned/fast/well-accepted) ser/deser library. mahadev On 1/7/09 12:08 PM, Kevin Burton bur...@spinn3r.com wrote: Ah... you think it was because it was empty? Interesting. I will have to play with Jute a bit. Kevin On Wed, Jan 7, 2009 at 10:07 AM, Patrick Hunt ph...@apache.org wrote: Thanks for the report, entered as: https://issues.apache.org/jira/browse/ZOOKEEPER-268 For the time being you can work around this by setting the threshold to INFO for that class (in log4j.properties). Either that or just set the data to a non-empty value for the znode. Patrick Kevin Burton wrote: Creating this node with this ACL: Created /foo setAcl /foo world:anyone:w Causes the exception included below. It's an infinite loop so it's just called over and over again, filling my console. I'm just doing an exists( path, true ); ... setting a watch still causes the problem.
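One minimal way to let multiple marshaling formats co-exist, in the spirit of the suggestion above, is to tag each serialized record with a format/version byte so readers can dispatch to the right decoder. This is purely illustrative - it is not ZooKeeper's wire format, and all names are mine:

```java
import java.util.Arrays;

public class VersionedRecord {
    // Prepend a one-byte format tag so old and new encodings can co-exist.
    static byte[] encode(byte formatId, byte[] payload) {
        byte[] out = new byte[payload.length + 1];
        out[0] = formatId;
        System.arraycopy(payload, 0, out, 1, payload.length);
        return out;
    }

    // Readers inspect the tag first, then pick a decoder (or fail cleanly
    // on an unknown format rather than misparsing bytes).
    static byte formatOf(byte[] record) {
        return record[0];
    }

    static byte[] payloadOf(byte[] record) {
        return Arrays.copyOfRange(record, 1, record.length);
    }
}
```

A reader that sees an unrecognized tag can reject the record or fall back to an older decoder, which is what makes side-by-side formats (jute alongside a newer library) workable.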
java.lang.NullPointerException
    at org.apache.jute.Utils.toCSVBuffer(Utils.java:234)
    at org.apache.jute.CsvOutputArchive.writeBuffer(CsvOutputArchive.java:101)
    at org.apache.zookeeper.proto.GetDataResponse.toString(GetDataResponse.java:48)
    at java.lang.String.valueOf(String.java:2827)
    at java.lang.StringBuilder.append(StringBuilder.java:115)
    at org.apache.zookeeper.ClientCnxn$Packet.toString(ClientCnxn.java:230)
    at java.lang.String.valueOf(String.java:2827)
    at java.lang.StringBuilder.append(StringBuilder.java:115)
    at org.apache.zookeeper.ClientCnxn$SendThread.readResponse(ClientCnxn.java:586)
    at org.apache.zookeeper.ClientCnxn$SendThread.doIO(ClientCnxn.java:626)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:852)
Re: Simpler ZooKeeper event interface....
Kevin Burton wrote: Here's a good reason for each client to know its session status (connected/disconnected/expired). Depending on the application, if L does not have a connected session to the ensemble it may need to be careful how it acts. connected/disconnected events are given out in the current API but when I shut down the full ensemble I don't receive a session expired. At the risk of being overly clear... ;-) When you shut down the full ensemble all clients of the ensemble will be disconnected from it. Regardless of each client's participation in the overall application architecture, they should probably consider being disconnected a bad thing and act appropriately (granted this is highly dependent on the particular use cases...). For example followers won't know who the leader is... leaders won't know if they are still leaders... etc... I'm considering implementing my own session expiration by tracking how long I've been disconnected. In the extreme you could just close the session each time you got disconnected. Patrick
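The "being disconnected is a bad thing" advice boils down to a conservative reaction per session state. The enum below is my own stand-in for the states a client watcher observes (the real ones live in ZooKeeper's Watcher event types); the reactions are the conservative choices described above, not prescribed library behavior:

```java
public class SessionStateHandler {
    // Stand-in for the connection states a ZooKeeper watcher sees.
    enum State { SyncConnected, Disconnected, Expired }

    static String react(State s) {
        switch (s) {
            case SyncConnected:
                return "resume"; // safe to act again
            case Disconnected:
                // the session may still be alive, but we can't know if we are
                // still leader/follower - suspend actions (or, in the extreme
                // approach above, close the session outright)
                return "suspend";
            case Expired:
                return "recreate"; // session is gone; build a new handle
            default:
                return "ignore";
        }
    }

    public static void main(String[] args) {
        System.out.println(react(State.Disconnected)); // suspend
    }
}
```

Treating Disconnected and Expired identically (the "close on disconnect" extreme) trades availability for safety, which is often the right trade for leader-election use cases.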
Re: ActiveMQ is now using ZooKeeper
That's great, very cool! Can you create ZOOKEEPER JIRAs for these items that you've identified? At first look it seems like we should be able to include these in 3.1.0, perhaps even 3.0.2. Regards, Patrick Hiram Chirino wrote: FYI: ActiveMQ has now started using ZooKeeper to do master election of its HA clusters. So zookeeper is now going to get included in every new ActiveMQ distro we cut. This in turn should mean that it will get included in every ServiceMix and Geronimo distro, since they repackage ActiveMQ. More details at: http://cwiki.apache.org/confluence/display/ACTIVEMQ/KahaDB+Master+Slave So this might start bringing more folks/requirements to your project. For example it would be nice if: - You had a slim client-only jar which we package with ActiveMQ to reduce our footprint, since we only use the client aspect of ZooKeeper - Make the server more embeddable (for example eliminate using static vars to initialize the server) so that it can be wrapped up in an OSGi bundle and deployed in ServiceMix.
Re: RPM?
I'm not aware of any. Patrick Garth Patil wrote: Hi, Has anyone created an RPM or a SPEC file for Zookeeper? I thought I'd ask before I embarked on creating one. Thanks, Garth
Re: Exists Watch Triggered by Delete
Stu Hood wrote: The comment you referenced in your original email is true - that code should never execute as the existsWatches list only contains watches for NONODE watch registrations (which obv couldn't be deleted since it doesn't exist). So I am experiencing a bug then? Wow, this is a dumb mistake. The log message you pointed to is being output _every time_ this code is called. There needs to be a conditional on the log message to check the result of the remove call. I have entered a JIRA and have already submitted a patch: https://issues.apache.org/jira/browse/ZOOKEEPER-221 So unfortunately we can't tell if this is the source of the issue you are seeing. Can you run with the patch attached to ZOOKEEPER-221? It may be better not to let the programmer know about those two lists, and to see if the abstraction can be improved instead. Sometimes I feel that I have to know too much about the internal workings of ZooKeeper to use its API. That's a good point. We're working to improve the docs and code, I've made a note of it. Perhaps we should move that to the internals doc and rework this section of the guide... https://issues.apache.org/jira/browse/ZOOKEEPER-220 Patrick -Original Message- From: Patrick Hunt [EMAIL PROTECTED] Sent: Wednesday, November 12, 2008 2:11pm To: zookeeper-user@hadoop.apache.org Subject: Re: Exists Watch Triggered by Delete Hi Stu, The zk server maintains 2 lists of watches, data and child watches: http://hadoop.apache.org/zookeeper/docs/r3.0.0/zookeeperProgrammers.html#ch_zkWatches (after reviewing this doc I've entered a jira to clarify which of the 2 lists maintained by the server is being referenced). From the server's perspective, if you register a watch on a node by calling getData or exists, only a single watch, a data watch, is stored by the server. The client maintains lists of watches as well. This is essentially to enable the auto watch reset and multi-watcher features added in v3.
Take a look at class ExistsWatchRegistration, it will register client side dataWatches for exists calls -- unless the result code is not 0 (ie NONODE), in which case it will register using existsWatches (again, client side). The comment you referenced in your original email is true - that code should never execute as the existsWatches list only contains watches for NONODE watch registrations (which obv couldn't be deleted since it doesn't exist). Hope this helps, Patrick Stu Hood wrote: I'm running the 3.0.0 release, and I'm receiving a warning thrown by this block of code:

case NodeDeleted:
    synchronized (dataWatches) {
        addTo(dataWatches.remove(path), result);
    }
    // XXX This shouldn't be needed, but just in case
    synchronized (existWatches) {
        addTo(existWatches.remove(path), result);
        LOG.warn("We are triggering an exists watch for delete! Shouldn't happen!");
    }
    synchronized (childWatches) {
        addTo(childWatches.remove(path), result);
    }
    break;

-- Why shouldn't an exists watch be triggered by a node being deleted? That is a really common use case in my code, so I want to rule it out as the cause of a bug I'm hunting for. Thanks, Stu Hood Architecture Software Developer Mailtrust, a Division of Rackspace
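The fix Patrick describes (only warn when a watch was actually removed for the deleted path) can be simulated stand-alone. The structures and names below are simplifications of the client code for illustration, not the actual ZOOKEEPER-221 patch:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ExistsWatchFix {
    // path -> registered watcher names (stand-in for the real watch lists)
    final Map<String, List<String>> existWatches = new HashMap<>();

    // Returns true only when the delete actually triggered exists watches,
    // which is the only case where the "shouldn't happen" warning belongs.
    boolean onNodeDeleted(String path, List<String> triggered) {
        List<String> watchers = existWatches.remove(path);
        if (watchers == null) {
            return false; // nothing registered: no warning, unlike the 3.0.0 code
        }
        triggered.addAll(watchers);
        return true; // the real code would LOG.warn(...) here
    }
}
```

The 3.0.0 bug was precisely that the warning was unconditional: `remove` returning null (the common case) still logged, which is why Stu saw it on every delete.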
Re: ZooKeeper Roadmap - 3.1.0 and beyond.
Fernando Padilla wrote: So it sounds like we're in agreement (at least the few in this discussion). But have we heard from the actual developers? What are their preferences or plans? Or would they like patches? As I stated earlier in this thread, we're planning to stay with ant for many reasons, but in particular: 1) the current build process works, 2) the current build is based on how hadoop projects in general (core for example) currently do builds. By using the same process/toolset we gain many benefits - in particular being able to essentially clone the core release process, saving us much time/effort. Patrick Jake Thompson wrote: Hi Hiram, I actually am just a user of zookeeper, I am not a member as of yet. I am also a user of maven and ant and have been using both for many years. So while I would say it is never a bad decision to move to maven, it isn't always a needed decision. A standard build structure makes sense if you were building zookeeper yourself, but I don't believe you would be doing that. So that leaves the creation and building of your own projects like an ear, war, JBI, etc. The problem with zookeeper is that there is no required project structure. There is no zar, that is to say. I personally have a mavenized war project that I am using zookeeper in, and I also have a hand-rolled CL java program that uses it and is built with ant. For both of these I just needed to copy one jar into my lib. As far as dependency management, since zookeeper is so simple the only requirement is log4j, so there's no real need for complex dependency tools there. As far as modularity, again I see zookeeper being part of larger modules, so I don't know if we can draw a common modular zookeeper application structure. Maven is a great tool and can help a lot, but I personally don't see it as synonymous with modern java development. -Jake On Wed, Nov 5, 2008 at 9:28 PM, Hiram Chirino [EMAIL PROTECTED] wrote: It would help new developers work with your project.
Maven provides a broad set of tools that lots of java developers have come to expect out of a build system. Incorporating those tools manually into an Ant-based build would be very time consuming and make the build complex to maintain. For example, in addition to the standard build and package aspects of a build, folks expect the build system to: - support generating the IDE integration files (Idea, eclipse, etc.) - Run static analysis tools like FindBugs - Run test coverage reports - Deployment to central servers - License checking - Artifact signing And most importantly, they want a standard way of doing all that. Maven also encourages modularity in the architecture by making it easy to build multiple modules/jar files and easily describe the dependencies between them. And once you go modular, you will see how folks start contributing alternative implementations of existing modules. Copying a module and its build setup is easy to do with maven... a bit harder with something like ant since it's kinda monolithic. Ant was a great tool, so if you guys want to stick to your guns that's cool. But in this day and age, using an ant-based open source project is kinda like it was when we used make several years back to build java projects. Works fine, but dated. On Wed, Nov 5, 2008 at 1:11 PM, Jake Thompson [EMAIL PROTECTED] wrote: It is quiet around here. I am new - could you please explain why you feel a Maven build structure is needed? Thanks, Jake On Wed, Nov 5, 2008 at 1:05 PM, Hiram Chirino [EMAIL PROTECTED] wrote: Anyone out there? On Mon, Nov 3, 2008 at 9:23 AM, Hiram Chirino [EMAIL PROTECTED] wrote: Congrats on the release. Now that it has been completed, I'd like to see if you guys are willing to revisit the issue of a maven-based build. If yes, I'd be happy to assist making that happen. Regards, Hiram On Mon, Oct 27, 2008 at 10:35 PM, Patrick Hunt [EMAIL PROTECTED] wrote: Our first official Apache release has shipped and I'm already looking forward to 3.1.0.
;-) In particular I believe we should look at the following for 3.1.0: 1) there are a number of issues that were targeted to 3.1.0 during the 3.0.0 cycle. We need to review and address these. 2) system test. During 3.0.0 we made significant improvements to our test environment. However, we still lack a large(r)-scale system test environment. It would be great if we could simulate large-scale use over 10s or 100s of machines (ensemble + clients). We need some sort of framework for this, and of course tests. 3) operations documentation. In general docs were greatly improved in 3.x over 2.x. One area where we are still lacking is operations docs for design/management of a ZK cluster. See https://issues.apache.org/jira/browse/ZOOKEEPER-160 4) JMX. Documentation needs to be written, the code reviewed/improved. Moving to Java6 should (afaik) allow us to take advantage of improved JMX spec features not available in 5. We should also
Proposal to require Java6 in 3.1.0
I've entered a JIRA targeted for ZooKeeper 3.1.0 that will add a Java 6 requirement to ZooKeeper (we will drop Java 5 support). If you have any feedback (pos or neg) please add comments to the issue: https://issues.apache.org/jira/browse/ZOOKEEPER-210 Regards, Patrick