Re: common client
+1. I would be interested in things like this. I think it should be in some contrib/ type thing under zookeeper, like the recipes.

On Mon, Jun 22, 2009 at 4:41 PM, Stefan Groschupf wrote:
> Hi,
>
> I wonder if people are interested to work together on a zk client that
> support some more functionality than zk offers by default.
> Katta has this client and I copied the code into a couple other projects as
> well but I'm sure it could be better than it is.
>
> http://katta.svn.sourceforge.net/viewvc/katta/trunk/src/main/java/net/sf/katta/zk/ZKClient.java?view=markup
>
> I'm sure other would benefit from such a client.
>
> Some of the feature are:
> + Connect
> + Data and StateChangeListener - subscribe once, get events until unsubscribe
> + Threadsafe
>
> It is not a lot of code but I'm just tired to have it duplicated so many
> times.
> Anyone interested to join in? Or is there something like this already?
> I could just copy this to a github project.
>
> Stefan
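The "subscribe once, get events until unsubscribe" feature can be sketched as follows. Plain ZooKeeper watches are one-time triggers, so a wrapper has to re-register after every event. This is only a minimal sketch of that idea, not the Katta ZKClient code: `OneShotSource` is a hypothetical stand-in for the ZooKeeper handle so that the example runs without a server, and `PersistentSubscriptions` is likewise an invented name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

/** Hypothetical stand-in for a ZooKeeper handle: each registered watch fires at most once. */
final class OneShotSource {
    private final List<Consumer<String>> pending = new ArrayList<>();

    synchronized void watchOnce(Consumer<String> watcher) {
        pending.add(watcher);
    }

    /** Simulates a data change: fires and clears every pending one-shot watch. */
    void change(String newData) {
        List<Consumer<String>> toFire;
        synchronized (this) {
            toFire = new ArrayList<>(pending);
            pending.clear();
        }
        for (Consumer<String> watcher : toFire) {
            watcher.accept(newData);
        }
    }
}

/** Hypothetical wrapper that turns one-shot watches into "subscribe once" listeners. */
final class PersistentSubscriptions {
    private final OneShotSource source;
    private final List<Consumer<String>> listeners = new CopyOnWriteArrayList<>();
    private boolean armed = false; // guarded by this

    PersistentSubscriptions(OneShotSource source) {
        this.source = source;
    }

    void subscribe(Consumer<String> listener) {
        listeners.add(listener);
        armIfNeeded();
    }

    void unsubscribe(Consumer<String> listener) {
        listeners.remove(listener);
    }

    /** Registers a single one-shot watch if there are listeners and none is pending. */
    private synchronized void armIfNeeded() {
        if (!armed && !listeners.isEmpty()) {
            armed = true;
            source.watchOnce(this::onEvent);
        }
    }

    private void onEvent(String data) {
        synchronized (this) {
            armed = false;
        }
        armIfNeeded(); // re-register before dispatching, so later changes are seen too
        for (Consumer<String> listener : listeners) {
            listener.accept(data);
        }
    }
}
```

With a real ZooKeeper handle a change can slip in between the watch firing and the re-registration, so a wrapper like this would normally re-read the node when it re-arms the watch.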
Re: Show your ZooKeeper pride!
Added HBase.

On Mon, Jun 8, 2009 at 7:01 PM, Ted Dunning wrote:
> How come Yahoo isn't listed?
>
> On Mon, Jun 8, 2009 at 6:31 PM, Patrick Hunt wrote:
>
> > The Hadoop summit is Wednesday. If you're attending please feel free to say
> > hi -- Mahadev is presenting @4, Ben and I will be attending as well.
> >
> > Also, regardless of whether you're attending or not we'd appreciate any
> > updates to the "powered by" page, if you're too busy to update it yourself
> > send us a snippet and we'll update it for you ;-)
> >
> > http://wiki.apache.org/hadoop/ZooKeeper/PoweredBy
> >
> > Regards,
> >
> > Patrick
>
> --
> Ted Dunning, CTO
> DeepDyve
>
> 111 West Evelyn Ave. Ste. 202
> Sunnyvale, CA 94086
> http://www.deepdyve.com
> 858-414-0013 (m)
> 408-773-0220 (fax)
Re: Errors during shutdown/startup of ZooKeeper
I'm still working on it (going on in parallel with a bunch of other things). Will let you guys know what I figure out as soon as I get some results. I think you are on to something Patrick. That is some gold advice.

Thanks guys.
-n

On Wed, Jun 3, 2009 at 11:39 AM, Patrick Hunt wrote:
> Nitay, any luck? Feel free to create a JIRA to track this. If you point to
> the test code that's experiencing the problem we'll try and take a look.
>
> Patrick
>
> Patrick Hunt wrote:
>
>> This log manifests if the client is running ahead of the server.
>>
>> say you have:
>> 1) client connects to server A and sees some changes
>> 2) client gets disconnected from A and attempts to connect to B
>> 3) B can be running behind A by some number of changes (it will eventually
>> catch up)
>> 4) client will attempt to connect to another server that's at, or ahead of
>> it's zxid until successful.
>>
>> why? this ensures that the client never sees old data, part of the
>> guarantee you are provided when using zk. However since servers in a quorum
>> can run behind (minority) then you might see this.
>>
>> It's unusual to see this so many times however. I see that you are running
>> this as part of a junit test. Perhaps that has some impact? Are you shutting
>> down servers, perhaps clearing the datadir and restarting them, w/o closing
>> all of the clients? If your tests are not running in "fork mode" for junit
>> (or multiple tests w/in a junit test class) then old clients can hang around
>> _if not explicitly closed_ and try to re-connect to new servers that you are
>> using for new tests - if the servers are starting fresh (zxid=1) then you
>> can see this alot as the old (zombie) clients cannot connect to the new
>> servers. Perhaps this is what you are seeing?
>>
>> Patrick
>>
>> Nitay wrote:
>>
>>> I see. That helps. However, even as warnings, these go on seemingly
>>> endlessly. Why do they not get fixed by themselves? What are we doing wrong
>>> here?
>>>
>>> On Tue, Jun 2, 2009 at 2:24 PM, Mahadev Konar wrote:
>>>
>>>> Hi Nitay,
>>>> This is not an error but should be a warning. I have opened up a jira for
>>>> it.
>>>>
>>>> http://issues.apache.org/jira/browse/ZOOKEEPER-428
>>>>
>>>> The message just says that a client is connecting to a server that is behind
>>>> that a server is was connected to earlier. The log should be warn and not
>>>> error and should be fixed in the next release.
>>>>
>>>> mahadev
>>>>
>>>> On 6/2/09 2:12 PM, "Nitay" wrote:
>>>>
>>>>> Hey guys,
>>>>>
>>>>> We are getting a lot of messages like this in HBase:
>>>>>
>>>>> [junit] 2009-06-02 11:57:23,658 ERROR [NIOServerCxn.Factory:21810] server.NIOServerCnxn(514): Client has seen zxid 0xe our last zxid is 0xd
>>>>>
>>>>> For more context, the block it usually appears in is:
>>>>>
>>>>> [junit] 2009-06-02 13:27:54,083 INFO [main-SendThread] zookeeper.ClientCnxn$SendThread(737): Priming connection to java.nio.channels.SocketChannel[connected local=/0:0:0:0:0:0:0:1%0:56511 remote=localhost/0:0:0:0:0:0:0:1:21810]
>>>>> [junit] 2009-06-02 13:27:54,084 INFO [main-SendThread] zookeeper.ClientCnxn$SendThread(889): Server connection successful
>>>>> [junit] 2009-06-02 13:27:54,093 INFO [NIOServerCxn.Factory:21810] server.NIOServerCnxn(532): Connected to /0:0:0:0:0:0:0:1%0:56511 lastZxid 16
>>>>> [junit] 2009-06-02 13:27:54,094 ERROR [NIOServerCxn.Factory:21810] server.NIOServerCnxn(543): Client has seen zxid 0x10 our last zxid is 0x4
>>>>> [junit] 2009-06-02 13:27:54,094 WARN [NIOServerCxn.Factory:21810] server.NIOServerCnxn(444): Exception causing close of session 0x0 due to java.io.IOException: Client has seen zxid 0x10 our last zxid is 0x4
>>>>> [junit] 2009-06-02 13:27:54,094 DEBUG [NIOServerCxn.Factory:21810] server.NIOServerCnxn(447): IOException stack trace
>>>>> [jun
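Patrick's four steps can be sketched as a small model. This is a simplified illustration under stated assumptions, not the actual NIOServerCnxn code: the class and method names are hypothetical, and only the message text is taken from the log above. A server refuses any client whose last seen zxid is ahead of its own, and the client moves on to the next server.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

final class ZxidConnectSketch {
    /** Hypothetical model of one server: only its last processed zxid matters here. */
    static final class Server {
        final String name;
        final long lastZxid;

        Server(String name, long lastZxid) {
            this.name = name;
            this.lastZxid = lastZxid;
        }

        /** Refuses clients that have already seen a newer state than this server has. */
        void connect(long clientLastZxidSeen) throws IOException {
            if (clientLastZxidSeen > lastZxid) {
                throw new IOException("Client has seen zxid 0x" + Long.toHexString(clientLastZxidSeen)
                        + " our last zxid is 0x" + Long.toHexString(lastZxid));
            }
        }
    }

    /** Tries each server in turn; returns the first that accepts, or null if none does. */
    static String firstAcceptingServer(List<Server> servers, long clientLastZxidSeen) {
        for (Server server : servers) {
            try {
                server.connect(clientLastZxidSeen);
                return server.name;
            } catch (IOException refused) {
                // This server is behind the client; try the next one.
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Normal quorum: B is one change behind the client, C has caught up.
        List<Server> quorum = Arrays.asList(new Server("B", 0xdL), new Server("C", 0xeL));
        System.out.println(firstAcceptingServer(quorum, 0xeL));

        // Zombie client left over from an earlier test, talking to a freshly wiped server.
        List<Server> fresh = Arrays.asList(new Server("fresh", 0x4L));
        System.out.println(firstAcceptingServer(fresh, 0x10L));
    }
}
```

The second case is the junit scenario Patrick describes: a server restarted with an empty datadir never catches up to the old client's zxid, so a client that is not explicitly closed keeps retrying and keeps being refused.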
Re: Errors during shutdown/startup of ZooKeeper
I see. That helps. However, even as warnings, these go on seemingly endlessly. Why do they not get fixed by themselves? What are we doing wrong here?

On Tue, Jun 2, 2009 at 2:24 PM, Mahadev Konar wrote:
> Hi Nitay,
> This is not an error but should be a warning. I have opened up a jira for
> it.
>
> http://issues.apache.org/jira/browse/ZOOKEEPER-428
>
> The message just says that a client is connecting to a server that is behind
> that a server is was connected to earlier. The log should be warn and not
> error and should be fixed in the next release.
>
> mahadev
>
> On 6/2/09 2:12 PM, "Nitay" wrote:
>
> > Hey guys,
> >
> > We are getting a lot of messages like this in HBase:
> >
> > [junit] 2009-06-02 11:57:23,658 ERROR [NIOServerCxn.Factory:21810] server.NIOServerCnxn(514): Client has seen zxid 0xe our last zxid is 0xd
> >
> > For more context, the block it usually appears in is:
> >
> > [junit] 2009-06-02 13:27:54,083 INFO [main-SendThread] zookeeper.ClientCnxn$SendThread(737): Priming connection to java.nio.channels.SocketChannel[connected local=/0:0:0:0:0:0:0:1%0:56511 remote=localhost/0:0:0:0:0:0:0:1:21810]
> > [junit] 2009-06-02 13:27:54,084 INFO [main-SendThread] zookeeper.ClientCnxn$SendThread(889): Server connection successful
> > [junit] 2009-06-02 13:27:54,093 INFO [NIOServerCxn.Factory:21810] server.NIOServerCnxn(532): Connected to /0:0:0:0:0:0:0:1%0:56511 lastZxid 16
> > [junit] 2009-06-02 13:27:54,094 ERROR [NIOServerCxn.Factory:21810] server.NIOServerCnxn(543): Client has seen zxid 0x10 our last zxid is 0x4
> > [junit] 2009-06-02 13:27:54,094 WARN [NIOServerCxn.Factory:21810] server.NIOServerCnxn(444): Exception causing close of session 0x0 due to java.io.IOException: Client has seen zxid 0x10 our last zxid is 0x4
> > [junit] 2009-06-02 13:27:54,094 DEBUG [NIOServerCxn.Factory:21810] server.NIOServerCnxn(447): IOException stack trace
> > [junit] java.io.IOException: Client has seen zxid 0x10 our last zxid is 0x4
> > [junit]     at org.apache.zookeeper.server.NIOServerCnxn.readConnectRequest(NIOServerCnxn.java:544)
> > [junit]     at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:331)
> > [junit]     at org.apache.zookeeper.server.NIOServerCnxn$Factory.run(NIOServerCnxn.java:176)
> > [junit] 2009-06-02 13:27:54,094 INFO [NIOServerCxn.Factory:21810] server.NIOServerCnxn(777): closing session:0x0 NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/0:0:0:0:0:0:0:1%0:21810 remote=/0:0:0:0:0:0:0:1%0:56511]
> > [junit] 2009-06-02 13:27:54,097 WARN [main-SendThread] zookeeper.ClientCnxn$SendThread(919): Exception closing session 0x121a2a7c43a0002 to sun.nio.ch.selectionkeyi...@2c662b4e
> > [junit] java.io.IOException: Read error rc = -1 java.nio.DirectByteBuffer[pos=0 lim=4 cap=4]
> > [junit]     at org.apache.zookeeper.ClientCnxn$SendThread.doIO(ClientCnxn.java:653)
> > [junit]     at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:897)
> > [junit] 2009-06-02 13:27:54,097 WARN [main-SendThread] zookeeper.ClientCnxn$SendThread(953): Ignoring exception during shutdown input
> > [junit] java.net.SocketException: Socket is not connected
> > [junit]     at sun.nio.ch.SocketChannelImpl.shutdown(Native Method)
> > [junit]     at sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:640)
> > [junit]     at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:360)
> > [junit]     at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:951)
> > [junit]     at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:922)
> >
> > This happens in a seemingly endless loop. We are not quite sure what it
> > means. Can someone help shed some light on these messages?
> >
> > Thanks,
> > -n
Errors during shutdown/startup of ZooKeeper
Hey guys,

We are getting a lot of messages like this in HBase:

[junit] 2009-06-02 11:57:23,658 ERROR [NIOServerCxn.Factory:21810] server.NIOServerCnxn(514): Client has seen zxid 0xe our last zxid is 0xd

For more context, the block it usually appears in is:

[junit] 2009-06-02 13:27:54,083 INFO [main-SendThread] zookeeper.ClientCnxn$SendThread(737): Priming connection to java.nio.channels.SocketChannel[connected local=/0:0:0:0:0:0:0:1%0:56511 remote=localhost/0:0:0:0:0:0:0:1:21810]
[junit] 2009-06-02 13:27:54,084 INFO [main-SendThread] zookeeper.ClientCnxn$SendThread(889): Server connection successful
[junit] 2009-06-02 13:27:54,093 INFO [NIOServerCxn.Factory:21810] server.NIOServerCnxn(532): Connected to /0:0:0:0:0:0:0:1%0:56511 lastZxid 16
[junit] 2009-06-02 13:27:54,094 ERROR [NIOServerCxn.Factory:21810] server.NIOServerCnxn(543): Client has seen zxid 0x10 our last zxid is 0x4
[junit] 2009-06-02 13:27:54,094 WARN [NIOServerCxn.Factory:21810] server.NIOServerCnxn(444): Exception causing close of session 0x0 due to java.io.IOException: Client has seen zxid 0x10 our last zxid is 0x4
[junit] 2009-06-02 13:27:54,094 DEBUG [NIOServerCxn.Factory:21810] server.NIOServerCnxn(447): IOException stack trace
[junit] java.io.IOException: Client has seen zxid 0x10 our last zxid is 0x4
[junit]     at org.apache.zookeeper.server.NIOServerCnxn.readConnectRequest(NIOServerCnxn.java:544)
[junit]     at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:331)
[junit]     at org.apache.zookeeper.server.NIOServerCnxn$Factory.run(NIOServerCnxn.java:176)
[junit] 2009-06-02 13:27:54,094 INFO [NIOServerCxn.Factory:21810] server.NIOServerCnxn(777): closing session:0x0 NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/0:0:0:0:0:0:0:1%0:21810 remote=/0:0:0:0:0:0:0:1%0:56511]
[junit] 2009-06-02 13:27:54,097 WARN [main-SendThread] zookeeper.ClientCnxn$SendThread(919): Exception closing session 0x121a2a7c43a0002 to sun.nio.ch.selectionkeyi...@2c662b4e
[junit] java.io.IOException: Read error rc = -1 java.nio.DirectByteBuffer[pos=0 lim=4 cap=4]
[junit]     at org.apache.zookeeper.ClientCnxn$SendThread.doIO(ClientCnxn.java:653)
[junit]     at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:897)
[junit] 2009-06-02 13:27:54,097 WARN [main-SendThread] zookeeper.ClientCnxn$SendThread(953): Ignoring exception during shutdown input
[junit] java.net.SocketException: Socket is not connected
[junit]     at sun.nio.ch.SocketChannelImpl.shutdown(Native Method)
[junit]     at sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:640)
[junit]     at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:360)
[junit]     at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:951)
[junit]     at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:922)

This happens in a seemingly endless loop. We are not quite sure what it means. Can someone help shed some light on these messages?

Thanks,
-n
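When reading these logs it helps to note that "lastZxid 16" (decimal) and "zxid 0x10" (hex) are the same number. A small sketch for pulling the two zxids out of such a line and comparing them follows; `ZxidLogLine` and `clientLead` are hypothetical helper names, and the pattern is only assumed to match the message format shown in the log above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ZxidLogLine {
    // Matches the message text from the log excerpt; both zxids are printed in hex.
    private static final Pattern MESSAGE = Pattern.compile(
            "Client has seen zxid 0x([0-9a-fA-F]+) our last zxid is 0x([0-9a-fA-F]+)");

    /** Returns the zxid gap (client minus server), or -1 if the line does not match. */
    static long clientLead(String logLine) {
        Matcher m = MESSAGE.matcher(logLine);
        if (!m.find()) {
            return -1;
        }
        long clientZxid = Long.parseUnsignedLong(m.group(1), 16);
        long serverZxid = Long.parseUnsignedLong(m.group(2), 16);
        return clientZxid - serverZxid;
    }
}
```

For the two messages quoted above this gives a gap of 1 (0xe versus 0xd) and 12 (0x10 versus 0x4). A gap that stays the same across many retries suggests the server is not catching up at all.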
Re: problems on EC2?
Yes, we are. We currently don't handle SessionExpired very well at all in HBase. There are two things going on in parallel to fix it:

1) Reinitialize the ZooKeeper handler (and everything else that depends on it) on the node in question when a SessionExpired event occurs.
2) Reduce the number of SessionExpired events we get by using Joey's JNI solution. After the various talks about session timeout, different GC flags, etc, we decided to pursue the JNI solution. We plan on contributing his work back to ZooKeeper, under some contrib, so that others can use it.

In the really short term, for folks that are seeing it, using the concurrent GC and bumping up the session timeout to 30 seconds or so seems to reduce the frequency of the problem.

I'm curious whether your problems are the same as ours. Try tweaking the GC parameters and session timeout; if that helps, they probably are.

Cheers,
-n

On Tue, Apr 14, 2009 at 6:34 PM, Ted Dunning wrote:
> Very good pointer. Thanks.
>
> Are you still having your problems?
>
> On Tue, Apr 14, 2009 at 6:09 PM, Nitay wrote:
>
> > Hi Ted,
> >
> > Fellow user coming from HBase. We were recently seeing lots of
> > SessionExpired events as well. Check out this mail thread:
> >
> > http://markmail.org/search/?q=SessionExpired#query:SessionExpired+page:1+mid:gt4c2kn4n4f5s5kw+state:results
> >
> > Perhaps this might have something to do with what you're seeing.
> >
> > Cheers,
> > -n
> >
> > On Tue, Apr 14, 2009 at 5:48 PM, Ted Dunning wrote:
> >
> > > We have been using EC2 as a substrate for our search cluster with zookeeper
> > > as our coordination layer and have been seeing some strange problems.
> > >
> > > These problems seem to manifest around getting lots of anomalous disconnects
> > > and session expirations even though we have the timeout values set to 2
> > > seconds on the server side and 5 seconds on the client side.
> > >
> > > Has anybody else been seeing this?
> > >
> > > Is this related to clock jumps in a virtualized setting?
> > >
> > > On a related note, what is best practice for handling session expiration?
> > > Just deal with it as if it is a new start?
>
> --
> Ted Dunning, CTO
> DeepDyve
>
> 111 West Evelyn Ave. Ste. 202
> Sunnyvale, CA 94086
> www.deepdyve.com
> 858-414-0013 (m)
> 408-773-0220 (fax)
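Item 1) above, reinitializing the handle and everything that depends on it when the session expires, might look roughly like the sketch below. It is not HBase's code: `ReconnectingHolder` and its `SessionEvent` enum are hypothetical, the handle type is left generic so the example runs without ZooKeeper, and in a real client the expiry would arrive through the `Watcher` callback as `KeeperState.Expired`.

```java
import java.util.function.Supplier;

/** Hypothetical holder that rebuilds its handle when the session expires. */
final class ReconnectingHolder<H> {
    /** Simplified stand-in for the session states a watcher would report. */
    enum SessionEvent { CONNECTED, DISCONNECTED, EXPIRED }

    private final Supplier<H> handleFactory;  // e.g. a lambda that creates a new client handle
    private final Runnable reinitDependents;  // re-creates ephemeral nodes, watches, etc.
    private H handle;

    ReconnectingHolder(Supplier<H> handleFactory, Runnable reinitDependents) {
        this.handleFactory = handleFactory;
        this.reinitDependents = reinitDependents;
        this.handle = handleFactory.get();
    }

    synchronized H current() {
        return handle;
    }

    synchronized void onEvent(SessionEvent event) {
        if (event == SessionEvent.EXPIRED) {
            // The old session is gone for good: build a new handle and
            // redo everything that was tied to the old session.
            handle = handleFactory.get();
            reinitDependents.run();
        }
        // DISCONNECTED needs no action here: the existing handle stays valid
        // and the client library keeps trying to reconnect on its own.
    }
}
```

Treating expiry "as if it is a new start", as Ted puts it, corresponds to what `reinitDependents` does here. How much state has to be rebuilt depends on the application.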
Re: problems on EC2?
Hi Ted,

Fellow user coming from HBase. We were recently seeing lots of SessionExpired events as well. Check out this mail thread:

http://markmail.org/search/?q=SessionExpired#query:SessionExpired+page:1+mid:gt4c2kn4n4f5s5kw+state:results

Perhaps this might have something to do with what you're seeing.

Cheers,
-n

On Tue, Apr 14, 2009 at 5:48 PM, Ted Dunning wrote:
> We have been using EC2 as a substrate for our search cluster with zookeeper
> as our coordination layer and have been seeing some strange problems.
>
> These problems seem to manifest around getting lots of anomalous disconnects
> and session expirations even though we have the timeout values set to 2
> seconds on the server side and 5 seconds on the client side.
>
> Has anybody else been seeing this?
>
> Is this related to clock jumps in a virtualized setting?
>
> On a related note, what is best practice for handling session expiration?
> Just deal with it as if it is a new start?
Re: Semantics of ConnectionLoss exception
Why is it done that way? How am I supposed to reliably detect that my ephemeral nodes are gone? Why not deliver the Session Expired event on the client side after the right time has passed without communication to any server?

On Thu, Mar 26, 2009 at 10:58 AM, Mahadev Konar wrote:
>
> > Isn't it the case that the client won't get session expired until it's
> > able to connect to a server, right? So what might happen is that the
> > client loses connection to the server, the server eventually expires the
> > client and deletes ephemerals (notifying all watchers) but the client
> > won't see the "session expiration" until it is able to reconnect to one
> > of the servers. ie the client doesn't know it's been expired until it's
> > able to reconnect to the cluster, at which point it's notified that it's
> > been expired.
> You are right pat!
>
> mahadev
>
> >> http://hadoop.apache.org/zookeeper/docs/r3.0.1/zookeeperProgrammers.html
> >> Has this information scattered around, but we should put it in the FAQ
> >> specifically.
> >
> > 3.0.1 is a bit old, try this for the latest docs:
> > http://hadoop.apache.org/zookeeper/docs/current/zookeeperProgrammers.html
> >
> >> - Is the ZooKeeper handle I'm using dead after this event?
> >> Again no. your handle is valid until you get an session expiry event or you
> >> do a zoo_close on your handle.
> >>
> >> Thanks
> >> mahadev
> >>
> >> On 3/25/09 5:42 PM, "Nitay" wrote:
> >>
> >>> I'm a little unclear about the ConnectionLoss exception as it's described in
> >>> the FAQ and would like some clarification.
> >>>
> >>> From the state diagram, http://wiki.apache.org/hadoop/ZooKeeper/FAQ#1, there
> >>> are three events that cause a ConnectionLoss:
> >>>
> >>> 1) In Connecting state, call close().
> >>> 2) In Connected state, call close().
> >>> 3) In Connected state, get disconnected.
> >>>
> >>> It's the third one I'm unclear about.
> >>>
> >>> - Does this event happening mean my ephemeral nodes will go away?
> >>> - Is the ZooKeeper handle I'm using dead after this event? Meaning that,
> >>> similar to the SessionExpired case, I need to construct a new connection
> >>> handle to ZooKeeper and take care of the restarting myself. It seems from
> >>> the diagram that this should not be the case. Rather, seeing as the
> >>> disconnected event sends the user back to the Connecting state, my handle
> >>> should be fine and the library will keep trying to reconnect to ZooKeeper
> >>> internally? I understand my current operation may have failed, what I'm
> >>> asking about is future operations.
> >>>
> >>> Thanks,
> >>> -n
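The sequence Patrick describes (the server expires the session and deletes the ephemerals, while the client only finds out once it reconnects) can be modelled with a toy timeline. This is purely an illustration of the behaviour discussed in this thread, not ZooKeeper code; `SessionTimeline` and all of its methods are hypothetical names, and times are plain millisecond counters.

```java
/** Toy model of one session as seen by the server and by the client. */
final class SessionTimeline {
    private final long timeoutMs;
    private long lastHeardMs;          // server side: last time the client was heard from
    private boolean expiredOnServer;   // server side: session expired, ephemerals deleted
    private boolean clientSawExpiry;   // client side: has it been told about the expiry?

    SessionTimeline(long timeoutMs, long nowMs) {
        this.timeoutMs = timeoutMs;
        this.lastHeardMs = nowMs;
    }

    /** Server-side check: expire the session if the client has been silent too long. */
    void serverTick(long nowMs) {
        if (!expiredOnServer && nowMs - lastHeardMs > timeoutMs) {
            expiredOnServer = true;
        }
    }

    /** The client reaches a server again; returns true if the session is still valid. */
    boolean clientReconnects(long nowMs) {
        serverTick(nowMs);
        if (expiredOnServer) {
            clientSawExpiry = true; // only now does the client learn about the expiry
            return false;
        }
        lastHeardMs = nowMs;
        return true;
    }

    boolean ephemeralsExist() {
        return !expiredOnServer;
    }

    boolean clientSawExpiry() {
        return clientSawExpiry;
    }
}
```

With a 5000 ms timeout, a client that reconnects at 4000 ms keeps its session. If it then stays away and the server checks at 10000 ms, the ephemerals are gone from that point on, but `clientSawExpiry()` stays false until the client next manages to reconnect. That window is what the question at the top of this message is about.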
Semantics of ConnectionLoss exception
I'm a little unclear about the ConnectionLoss exception as it's described in the FAQ and would like some clarification.

From the state diagram, http://wiki.apache.org/hadoop/ZooKeeper/FAQ#1, there are three events that cause a ConnectionLoss:

1) In Connecting state, call close().
2) In Connected state, call close().
3) In Connected state, get disconnected.

It's the third one I'm unclear about.

- Does this event happening mean my ephemeral nodes will go away?
- Is the ZooKeeper handle I'm using dead after this event? Meaning that, similar to the SessionExpired case, I need to construct a new connection handle to ZooKeeper and take care of the restarting myself. It seems from the diagram that this should not be the case. Rather, seeing as the disconnected event sends the user back to the Connecting state, my handle should be fine and the library will keep trying to reconnect to ZooKeeper internally? I understand my current operation may have failed, what I'm asking about is future operations.

Thanks,
-n
Re: Testing Zookeeper
Joshua,

There may already be some JIRAs open regarding this, e.g. https://issues.apache.org/jira/browse/ZOOKEEPER-278. You can assign those to yourself and attach your stuff there if it fits your issue.

On Tue, Feb 10, 2009 at 11:44 AM, Mahadev Konar wrote:
> HI Joshua,
> Feel free to open a jira and attach a patch.
>
> Please take a look at how to contribute:
>
> http://wiki.apache.org/hadoop/ZooKeeper/HowToContribute
>
> Thanks
> mahadev
>
> On 2/10/09 11:34 AM, "Joshua Tuberville" wrote:
>
> > To test our zookeeper usage we built a utility class using some of the methods
> > in org.apache.zookeeper.test.ClientBase out of the test folder. This allows
> > testing to be done using any framework JUnit4, JUnit5, TestNG, etc. We would
> > prefer this be in the zookeeper jar. Should I open a JIRA item and include
> > the class?
> >
> > Thanks,
> > Joshua