Resurrecting this thread, because the patch looks a good candidate for
3.9.3 release.

What do you guys think?

Andor



On Fri, Sep 1, 2023 at 4:54 AM Kezhu Wang <kez...@gmail.com> wrote:

> Hi tison,
>
> Thank you for your information.
>
> > what will happen if the client decides session expired but the server
> hold a valid session and reconnect
>
> This will not happen inside a single `ZooKeeper` client. Once a client
> concludes a session as expired it will not try to establish connection
> anymore.
>
> I rarely see usage of session transition[1] from one client to
> another. Theoretically, it is possible to reestablish a session
> expired solely by client. But I don't think it is the main focus of
> ZooKeeper. Client has to do a quorum operation to gain "happens-before
> relation" or linearizability[2] on the data tree in session transition
> regardless of this feature. That is, there is no need to prove such a
> relation from my point of review. If this is a concern, I think we
> could also make this feature a configurable option(including the
> `4/3`)  in `ZooKeeperOptions`[3].
>
> > Curator has done this client-side expiration with a similar algorithm
> for a long time and I didn't hear any issues reported. So such a solution
> can be battle-tested.
>
> I was actually somewhat surprised by such a lack in ZooKeeper in my
> first attempt to fix this `endless connection loss`[4]. :-)
>
> [1]:
> https://github.com/apache/zookeeper/blob/03a36d08e257c43e8377e5549d5524805fc6b8bb/zookeeper-server/src/main/java/org/apache/zookeeper/ZooKeeper.java#L851
> [2]:
> https://zookeeper.apache.org/doc/r3.9.0/zookeeperInternals.html#sc_consistency
> [3]:
> https://github.com/apache/zookeeper/pull/2001/files#diff-e19fc4f18a3bb65b0ecf90d98d01cb2d7705afffc0249d7e3a6f8c2655cfd702R32
> [4]:
> https://github.com/apache/zookeeper/pull/1847#pullrequestreview-1433893017
>
> Best,
> Kezhu Wang
>
> On Fri, Sep 1, 2023 at 4:53 PM tison <wander4...@gmail.com> wrote:
> >
> > IIUC the major issue here is what will happen if the client decides
> session
> > expired but the server hold a valid session and reconnect.
> >
> > 4/3 time may best effort do the expiration after the server expires the
> > session, but we need to prove a happens-before relation or think of the
> > issues described above.
> >
> > However, Curator has done this client-side expiration with a similar
> > algorithm for a long time and I didn't hear any issues reported. So such
> a
> > solution can be battle-tested.
> >
> > Best,
> > tison.
> >
> >
> > Kezhu Wang <kez...@gmail.com> 于2023年9月1日周五 16:24写道:
> >
> > > Hi all,
> > >
> > > ZooKeeper session will expire approximately after negotiated session
> > > timeout. Currently, client will learn this after successful contact to
> > > ZooKeeper cluster. This exposes an endless client side connection loss
> > > when client can't reach ZooKeeper cluster due to either incomplete
> > > connection string or whole cluster downtime.
> > >
> > > There is a `SessionTimeoutException` in `CliientCnxn`, but it never
> > > counts as session expiration.
> > >
> > > Possibly at least four jira issues reported the behavior described
> above.
> > >
> > > * ZOOKEEPER-2188[1]: client connection hung up because of dead loop
> > > * ZOOKEEPER-4412[2]: client blocked too long before session timeout
> > > * ZOOKEEPER-4508[3]: ZooKeeper client run to endless loop in
> > > ClientCnxn.SendThread.run if all server down
> > > * ZOOKEEPER-4692[4]: Handle SessionTimeoutException in Java client
> > >
> > > I propose to add an `expirationTimeout` in `ClientCnxn` to deal with
> > > this. The value could be approximately `4/3` of `connectTimeout` or
> > > `negotiatedSessionTimeout` depending on stage. I opened a pr[5] for
> > > evaluation.
> > >
> > > Any suggestions ? Thanks!
> > >
> > > [1]: https://issues.apache.org/jira/browse/ZOOKEEPER-2188
> > > [2]: https://issues.apache.org/jira/browse/ZOOKEEPER-4412
> > > [3]: https://issues.apache.org/jira/browse/ZOOKEEPER-4508
> > > [4]: https://issues.apache.org/jira/browse/ZOOKEEPER-4692
> > > [5]: https://github.com/apache/zookeeper/pull/2058
> > >
> > > Best,
> > > Kezhu Wang
> > >
>

Reply via email to