Re: What should I do with SyncDisconnected

Ivan Kelly Thu, 14 Mar 2013 14:30:51 -0700

@kishore
> * First, How do clients know where to send the request, are they monitoring
> some ephemeral znodes in zookeeper. If yes, then after session timeout some
> other server should notice the ephemeral znode disappearing  and recreate
> another ephemeral znode. Clients should not start sending request to new
> server, basically  the server which is disconnected from zk permanently
> will never get any new requests.
The client looks up the owner of the partition before making a
request. I could have the client monitor ZK and cancel the request if
the owner changes. Alternatively, I could have the client's requests
time out after a period and then recheck the ownership.
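
A rough sketch of the timeout-and-recheck variant, in plain Java with no
ZooKeeper dependency. Everything named here is hypothetical: ownerLookup
stands in for whatever reads the partition owner from ZooKeeper, send
stands in for the actual RPC to that owner, and the timeout and attempt
count are placeholders.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Function;
import java.util.function.Supplier;

/** Hypothetical client wrapper: re-resolves the partition owner whenever a request times out. */
final class OwnerAwareClient {
    private final Supplier<String> ownerLookup;                     // assumed: reads the current owner from ZooKeeper
    private final Function<String, CompletableFuture<String>> send; // assumed: sends the request to that owner
    private final long timeoutMillis;
    private final int maxAttempts;

    OwnerAwareClient(Supplier<String> ownerLookup,
                     Function<String, CompletableFuture<String>> send,
                     long timeoutMillis,
                     int maxAttempts) {
        this.ownerLookup = ownerLookup;
        this.send = send;
        this.timeoutMillis = timeoutMillis;
        this.maxAttempts = maxAttempts;
    }

    String call() throws InterruptedException, ExecutionException, TimeoutException {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            // Look up the owner again on every attempt, so a changed owner is picked up.
            String owner = ownerLookup.get();
            CompletableFuture<String> reply = send.apply(owner);
            try {
                return reply.get(timeoutMillis, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                // Give up on this attempt; the next iteration rechecks ownership.
                reply.cancel(true);
            }
        }
        throw new TimeoutException("no reply after " + maxAttempts + " attempts");
    }
}
```

Note that a timed-out request may still have been processed by the old
owner, so retrying like this is only safe if requests are idempotent.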


> * Second, how much tolerance do you have for downtime? if you are ok for
> not a server to serve a partition for X seconds ( session timeout), then on
> SyncDisconnected you can stop accepting requests and resume on
> SyncConnected.
The server will never receive SessionExpired while it is disconnected,
though, so my server needs to be aware of the session timeout itself.
This is actually what we're doing: waiting X seconds for SyncConnected
and, if it doesn't come, shutting down. This feels kludgy though.
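
For concreteness, the wait-then-shut-down logic looks roughly like the
sketch below. It is plain Java with no ZooKeeper dependency; the class
name, graceMillis (the "X seconds") and onGiveUp are hypothetical, and
the two methods are assumed to be called from the ZooKeeper Watcher when
the disconnected and SyncConnected events arrive.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: runs a give-up action if no reconnect arrives within the grace period. */
final class DisconnectWatchdog {
    private final ScheduledExecutorService timer =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "zk-disconnect-watchdog");
                t.setDaemon(true); // do not keep the JVM alive just for the timer
                return t;
            });
    private final long graceMillis;
    private final Runnable onGiveUp;
    private ScheduledFuture<?> pending; // guarded by "this"

    DisconnectWatchdog(long graceMillis, Runnable onGiveUp) {
        this.graceMillis = graceMillis;
        this.onGiveUp = onGiveUp;
    }

    /** Call when the disconnected event arrives: start the clock once. */
    synchronized void onDisconnected() {
        if (pending == null) {
            pending = timer.schedule(onGiveUp, graceMillis, TimeUnit.MILLISECONDS);
        }
    }

    /** Call on SyncConnected: cancel the give-up action if it has not started yet. */
    synchronized void onConnected() {
        if (pending != null) {
            pending.cancel(false);
            pending = null;
        }
    }
}
```

In our case onGiveUp shuts the server down, and the server stops
accepting requests as soon as onDisconnected is called.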

@Jordan
> > SyncDisconnected can occur for a variety of reasons. It's in the class of
> > recoverable errors. Your app needs to go into a waiting state until
> > SyncConnected is received again or SessionExpired. Have you read
> > http://wiki.apache.org/hadoop/ZooKeeper/ErrorHandling ?
But how long to wait? If the server is truly partitioned from ZK, then
I'll wait forever, and the client request will hang forever.

> >
> > You should consider using one of the high level ZooKeeper frameworks (such
> > as Curator which I wrote).
Conceptually the issue would still exist though. A high-level library
doesn't solve the problem if the problem can't be solved with raw
ZooKeeper.

-Ivan
