[
https://issues.apache.org/jira/browse/ZOOKEEPER-1159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13106898#comment-13106898
]
Matthias Spycher commented on ZOOKEEPER-1159:
---------------------------------------------
Looking at the current implementation, if a SessionExpiredException was thrown,
then the expired event is enqueued, as is the event-of-death for the event
thread; and isAlive() is false. So, as indicated by Camille, your patch doesn't
appear to have the affect you expect.
As for the race conditions. A call to close() will put the client in the
closing mode, but the state may end up being CLOSED, CONNECTING or CONNECTED as
the send thread doesn't enforce strict state transitions. The client will
close, but state.isAlive() may still return true after the send thread has
died. Note that this isn't easy to test, but you can step through it with a
debugger to see how CLOSED can easily be stomped by CONNECTING (especially with
the random sleep happening after a test for closing when isFirstConnect is
false).
Another, rather exotic case (hopefully doesn't ever happen), is when a
Throwable isn't caught, e.g. an error that is not an Exception.
In any case, I'm still not sure whether a dying send-thread should indicate the
session has expired. I looked at the state diagram, but it seems to conflict
with what I see in the code.
> ClientCnxn does not propagate session expiration indication
> -----------------------------------------------------------
>
> Key: ZOOKEEPER-1159
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1159
> Project: ZooKeeper
> Issue Type: Bug
> Components: java client
> Affects Versions: 3.4.0
> Reporter: Andrew Purtell
> Priority: Blocker
> Fix For: 3.4.0
>
>
> ClientCnxn does not always propagate session expiration indication up to
> clients. If a reconnection attempt fails because the session has since
> expired, the KeeperCode is still Disconnected, but shouldn't it be set to
> Expired? Perhaps like so:
> {code}
> --- a/src/java/main/org/apache/zookeeper/ClientCnxn.java
> +++ b/src/java/main/org/apache/zookeeper/ClientCnxn.java
> @@ -1160,6 +1160,7 @@ public class ClientCnxn {
> clientCnxnSocket.doTransport(to, pendingQueue,
> outgoingQueue);
>
> } catch (Exception e) {
> + Event.KeeperState eventState =
> Event.KeeperState.Disconnected;
> if (closing) {
> if (LOG.isDebugEnabled()) {
> // closing so this is expected
> @@ -1172,6 +1173,7 @@ public class ClientCnxn {
> // this is ugly, you have a better way speak up
> if (e instanceof SessionExpiredException) {
> LOG.info(e.getMessage() + ", closing socket
> connection");
> + eventState = Event.KeeperState.Expired;
> } else if (e instanceof SessionTimeoutException) {
> LOG.info(e.getMessage() + RETRY_CONN_MSG);
> } else if (e instanceof EndOfStreamException) {
> @@ -1191,7 +1193,7 @@ public class ClientCnxn {
> if (state.isAlive()) {
> eventThread.queueEvent(new WatchedEvent(
> Event.EventType.None,
> - Event.KeeperState.Disconnected,
> + eventState,
> null));
> }
> clientCnxnSocket.updateNow();
> {code}
> This affects HBase. HBase master and region server processes will shut down
> by design if their session has expired, but will attempt to reconnect if they
> think they have been disconnected. The above prevents proper termination.
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira