Hello Cameron,

that indeed is the question. Asking for a new session is always possible by simply not specifying the old session credentials. The question, however, is whether a client can determine that this condition has occurred, as opposed to merely being unable to connect to the ensemble for other reasons. It is possible that the Zookeeper state is different in this case (e.g., there may be no EXPIRED state, only DISCONNECTED).
But even if so, would we get false positives from that test in case only one Zookeeper server of the entire ensemble goes berserk and it is absolutely correct to get that connection refused? Wouldn't the special condition you observed occur only when the entire ensemble has been wiped out and reset? Wouldn't it be legitimate to expect the clients to then also go through a full restart to recover from this grave condition of the coordination service?

There is nothing graceful about the entire Zookeeper ensemble being deprived of its data, so the best effort to recover on the client side would probably be to set a time limit after which a new session is opened instead of further attempting to reconnect, i.e., essentially the client decides it is time to restart.

I'll have a look into this. It requires some thinking and testing :-)

Cheers,
--Jürgen

On 12.11.2014 06:32, Cameron McKenzie wrote:
> Thanks for the explanation Jurgen,
> Does the client application (using the ZK client libs) actually get a
> notification from the ZK client code that this situation has occurred? i.e.
> Would it actually be possible for an application to identify this situation
> and ask for a new session?
>
> What I seemed to see when I was stepping through in a debugger (and what
> the logs would imply) was that the reconnect code would just go around
> indefinitely.
> cheers
>
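
P.S. To make the time-limit idea a bit more concrete, here is a rough sketch of how a client might decide that it is time to stop reconnecting and open a new session. It is only an illustration and has not been tested against a real ensemble. The class name SessionRestartPolicy, its methods, and the state strings are hypothetical and not part of any Zookeeper client library; the application would have to feed in whatever connection states its client actually reports.

```python
import time


class SessionRestartPolicy:
    """Decide when to give up reconnecting and open a fresh session.

    Hypothetical helper for illustration only; not a Zookeeper API.
    """

    def __init__(self, max_disconnected_seconds, clock=time.monotonic):
        self.max_disconnected_seconds = max_disconnected_seconds
        self._clock = clock
        self._disconnected_since = None  # None while we are connected

    def on_state(self, state):
        """Record a connection state change reported by the client."""
        if state == "CONNECTED":
            self._disconnected_since = None
        elif state == "DISCONNECTED" and self._disconnected_since is None:
            # Remember only the first disconnect; repeated events
            # must not restart the timer.
            self._disconnected_since = self._clock()

    def should_open_new_session(self):
        """True once the disconnect has lasted at least the time limit."""
        if self._disconnected_since is None:
            return False
        elapsed = self._clock() - self._disconnected_since
        return elapsed >= self.max_disconnected_seconds
```

Once should_open_new_session() returns True, the application would drop the old handle and connect again without the old session credentials. Note that this does not distinguish a wiped ensemble from a long network outage; it only bounds how long the client keeps trying.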
