Hi all,

A ZooKeeper session expires approximately after the negotiated session timeout. Currently, the client only learns of the expiration after it successfully contacts the ZooKeeper cluster again. As a result, the client can stay in connection loss indefinitely when it cannot reach the cluster, whether because of an incomplete connection string or because the whole cluster is down.
There is a `SessionTimeoutException` in `ClientCnxn`, but it is never treated as session expiration.

At least four JIRA issues appear to report the behavior described above:

* ZOOKEEPER-2188[1]: client connection hung up because of dead loop
* ZOOKEEPER-4412[2]: client blocked too long before session timeout
* ZOOKEEPER-4508[3]: ZooKeeper client run to endless loop in ClientCnxn.SendThread.run if all server down
* ZOOKEEPER-4692[4]: Handle SessionTimeoutException in Java client

I propose adding an `expirationTimeout` to `ClientCnxn` to deal with this. Its value could be approximately `4/3` of `connectTimeout` or `negotiatedSessionTimeout`, depending on the stage. I opened a PR[5] for evaluation.

Any suggestions?

Thanks!

[1]: https://issues.apache.org/jira/browse/ZOOKEEPER-2188
[2]: https://issues.apache.org/jira/browse/ZOOKEEPER-4412
[3]: https://issues.apache.org/jira/browse/ZOOKEEPER-4508
[4]: https://issues.apache.org/jira/browse/ZOOKEEPER-4692
[5]: https://github.com/apache/zookeeper/pull/2058

Best,
Kezhu Wang
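
P.S. To make the `4/3` idea concrete, here is a minimal sketch of how such a timeout could be computed and checked. It is not the code from the PR: the class and method names are hypothetical, it does not touch the real `ClientCnxn`, and it assumes that "stage" means before versus after a session has been established.

```java
/**
 * Hypothetical sketch of a client-side expiration timeout.
 * Nothing here is ZooKeeper API; all names are made up for illustration.
 */
final class ExpirationTimeoutSketch {
    private ExpirationTimeoutSketch() {}

    /**
     * Picks the base timeout by stage and scales it by roughly 4/3.
     * Assumption: connectTimeout applies before the session is established,
     * negotiatedSessionTimeout applies afterwards.
     */
    static long expirationTimeoutMs(boolean sessionEstablished,
                                    long connectTimeoutMs,
                                    long negotiatedSessionTimeoutMs) {
        long base = sessionEstablished ? negotiatedSessionTimeoutMs : connectTimeoutMs;
        return base * 4 / 3; // integer arithmetic, rounds down
    }

    /**
     * Returns true when the client has gone without server contact for at
     * least the expiration timeout, so it could give up locally instead of
     * retrying forever.
     */
    static boolean isLocallyExpired(long lastServerContactMs,
                                    long nowMs,
                                    long expirationTimeoutMs) {
        return nowMs - lastServerContactMs >= expirationTimeoutMs;
    }

    public static void main(String[] args) {
        long timeout = expirationTimeoutMs(true, 10_000, 30_000);
        System.out.println("expirationTimeout = " + timeout + " ms");
        System.out.println("expired after 45s of silence: "
                + isLocallyExpired(0, 45_000, timeout));
    }
}
```

With a negotiated session timeout of 30 seconds this gives 40 seconds, so a client that has heard nothing from any server for 45 seconds would report expiration locally rather than keep looping.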