Resurrecting this thread, because the patch looks a good candidate for 3.9.3 release.
What do you guys think? Andor On Fri, Sep 1, 2023 at 4:54 AM Kezhu Wang <kez...@gmail.com> wrote: > Hi tison, > > Thank you for your information. > > > what will happen if the client decides session expired but the server > hold a valid session and reconnect > > This will not happen inside a single `ZooKeeper` client. Once a client > concludes a session as expired it will not try to establish connection > anymore. > > I rarely see usage of session transition[1] from one client to > another. Theoretically, it is possible to reestablish a session > expired solely by client. But I don't think it is the main focus of > ZooKeeper. Client has to do a quorum operation to gain "happens-before > relation" or linearizability[2] on the data tree in session transition > regardless of this feature. That is, there is no need to prove such a > relation from my point of review. If this is a concern, I think we > could also make this feature a configurable option(including the > `4/3`) in `ZooKeeperOptions`[3]. > > > Curator has done this client-side expiration with a similar algorithm > for a long time and I didn't hear any issues reported. So such a solution > can be battle-tested. > > I was actually somewhat surprised by such a lack in ZooKeeper in my > first attempt to fix this `endless connection loss`[4]. :-) > > [1]: > https://github.com/apache/zookeeper/blob/03a36d08e257c43e8377e5549d5524805fc6b8bb/zookeeper-server/src/main/java/org/apache/zookeeper/ZooKeeper.java#L851 > [2]: > https://zookeeper.apache.org/doc/r3.9.0/zookeeperInternals.html#sc_consistency > [3]: > https://github.com/apache/zookeeper/pull/2001/files#diff-e19fc4f18a3bb65b0ecf90d98d01cb2d7705afffc0249d7e3a6f8c2655cfd702R32 > [4]: > https://github.com/apache/zookeeper/pull/1847#pullrequestreview-1433893017 > > Best, > Kezhu Wang > > On Fri, Sep 1, 2023 at 4:53 PM tison <wander4...@gmail.com> wrote: > > > > IIUC the major issue here is what will happen if the client decides > session > > expired but the server hold a valid session and reconnect. > > > > 4/3 time may best effort do the expiration after the server expires the > > session, but we need to prove a happens-before relation or think of the > > issues described above. > > > > However, Curator has done this client-side expiration with a similar > > algorithm for a long time and I didn't hear any issues reported. So such > a > > solution can be battle-tested. > > > > Best, > > tison. > > > > > > Kezhu Wang <kez...@gmail.com> 于2023年9月1日周五 16:24写道: > > > > > Hi all, > > > > > > ZooKeeper session will expire approximately after negotiated session > > > timeout. Currently, client will learn this after successful contact to > > > ZooKeeper cluster. This exposes an endless client side connection loss > > > when client can't reach ZooKeeper cluster due to either incomplete > > > connection string or whole cluster downtime. > > > > > > There is a `SessionTimeoutException` in `CliientCnxn`, but it never > > > counts as session expiration. > > > > > > Possibly at least four jira issues reported the behavior described > above. > > > > > > * ZOOKEEPER-2188[1]: client connection hung up because of dead loop > > > * ZOOKEEPER-4412[2]: client blocked too long before session timeout > > > * ZOOKEEPER-4508[3]: ZooKeeper client run to endless loop in > > > ClientCnxn.SendThread.run if all server down > > > * ZOOKEEPER-4692[4]: Handle SessionTimeoutException in Java client > > > > > > I propose to add an `expirationTimeout` in `ClientCnxn` to deal with > > > this. The value could be approximately `4/3` of `connectTimeout` or > > > `negotiatedSessionTimeout` depending on stage. I opened a pr[5] for > > > evaluation. > > > > > > Any suggestions ? Thanks! > > > > > > [1]: https://issues.apache.org/jira/browse/ZOOKEEPER-2188 > > > [2]: https://issues.apache.org/jira/browse/ZOOKEEPER-4412 > > > [3]: https://issues.apache.org/jira/browse/ZOOKEEPER-4508 > > > [4]: https://issues.apache.org/jira/browse/ZOOKEEPER-4692 > > > [5]: https://github.com/apache/zookeeper/pull/2058 > > > > > > Best, > > > Kezhu Wang > > > >