Hi Sijie,

I believe the ZooKeeperClient class handles the server connections, and we
haven't faced issues with that. Could you please confirm?

The issue was with the client connection in the AbstractZkLedgerManager
class, as you mentioned above. The Twitter branch fix seems to recreate the
listeners and reestablish state. Could you please push it to the community?
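
To make "recreate the listeners and reestablish state" concrete, here is a
minimal sketch of the general idea as I understand it. It is not the code
from the Twitter branch and it does not use the real ZooKeeper or BookKeeper
APIs: FakeSession, ListenerRegistry, and onSessionRecreated are hypothetical
names, and the session is a plain in-memory stand-in. The assumption is that
the manager keeps its own record of which paths have listeners, so that after
a session expiry it can set the watches again and ask each listener to
re-read, since notifications may have been missed in between.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

/** Hypothetical sketch; none of these types are BookKeeper or ZooKeeper classes. */
public class WatchReplaySketch {

    /** In-memory stand-in for a ZooKeeper session; its watches die with it. */
    static class FakeSession {
        final Set<String> watchedPaths = ConcurrentHashMap.newKeySet();

        void watch(String path) {
            watchedPaths.add(path);
        }
    }

    /** Remembers listeners per path so watches can be set again on a new session. */
    static class ListenerRegistry {
        private final Map<String, List<Consumer<String>>> listeners = new ConcurrentHashMap<>();
        private volatile FakeSession session;

        ListenerRegistry(FakeSession session) {
            this.session = session;
        }

        /** Records the listener locally and sets a watch on the current session. */
        void register(String path, Consumer<String> listener) {
            listeners.computeIfAbsent(path, p -> new CopyOnWriteArrayList<>()).add(listener);
            session.watch(path);
        }

        /** Assumed to be called once a replacement session exists after an expiry. */
        void onSessionRecreated(FakeSession newSession) {
            this.session = newSession;
            for (Map.Entry<String, List<Consumer<String>>> entry : listeners.entrySet()) {
                String path = entry.getKey();
                // Set the watch again on the new session.
                newSession.watch(path);
                // Events may have been missed while disconnected, so each
                // listener is told to re-read the current state for its path.
                for (Consumer<String> listener : entry.getValue()) {
                    listener.accept(path);
                }
            }
        }
    }

    public static void main(String[] args) {
        FakeSession first = new FakeSession();
        ListenerRegistry registry = new ListenerRegistry(first);
        registry.register("/ledgers/L0001",
                path -> System.out.println("re-read metadata for " + path));

        // Simulate an expiry: the old session is gone, a new one replaces it.
        FakeSession second = new FakeSession();
        registry.onSessionRecreated(second);
        System.out.println("watches on new session: " + second.watchedPaths);
    }
}
```

The point of notifying listeners on replay, rather than only setting the
watches again, is that a watch set after reconnect says nothing about changes
made while the session was down.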

Thanks,
Arun

On Tue, Jun 14, 2016 at 2:17 PM, Sijie Guo <si...@apache.org> wrote:

> Arun, what did you observe?
>
> I think we already handle session expiry and ZooKeeper connection
> recreation in the ZooKeeperClient wrapper:
>
> https://github.com/apache/bookkeeper/blob/master/bookkeeper-server/src/main/java/org/apache/bookkeeper/zookeeper/ZooKeeperClient.java
>
>
> We need to uncomment the code in Line 168.
>
>
> https://github.com/apache/bookkeeper/blob/master/bookkeeper-server/src/main/java/org/apache/bookkeeper/meta/AbstractZkLedgerManager.java#L168
>
> (The change in Twitter's branch does that retry:
>
> https://github.com/twitter/bookkeeper/blob/master/bookkeeper-server/src/main/java/org/apache/bookkeeper/meta/AbstractZkLedgerManager.java#L211
> )
>
> - Sijie
>
>
>
> On Wed, Jun 8, 2016 at 12:46 AM, Arun M. Krishnakumar <arunm...@gmail.com>
> wrote:
>
> > Thanks for the pointer, Uma Gangumalla.
> >
> > Could you please give an overview of the fix in HDFS-3562?
> >
> > In the case of the BookKeeper client, the ReadOnlyLedgerHandle creates
> > watches on the relevant ZooKeeper nodes. Those watches are the interesting
> > part: we would lose any notifications that happen during the timeout.
> > What would be the best way to proceed in such scenarios? Should we
> > reconstruct the state? Is there any other such state that needs to be
> > considered?
> >
> > Thanks,
> > Arun
> >
> > On Tue, Jun 7, 2016 at 3:40 PM, Uma gangumalla <umamah...@apache.org>
> > wrote:
> >
> > > Good point, Venkateswara Rao.
> > >
> > > Some time ago, we worked on this scenario. A patch is available:
> > > HDFS-3562. There we just tried to keep it on the application side. But
> > > as a long-term solution, could this be placed on the BK side as a
> > > utility module, so that all applications can benefit?
> > >
> > >
> > > Note: As I remember, the RetryableZookeeper idea was taken from HBase.
> > >
> > > Regards,
> > > Uma
> > >
> > > On Mon, Jun 6, 2016 at 9:42 AM, Venkateswara Rao Jujjuri <
> > > jujj...@gmail.com>
> > > wrote:
> > >
> > > > If a bookie loses its connection with ZK, the connection gets
> > > > reestablished and life goes on. How are we handling it in the client
> > > > case? Should we retry at the library level, or leave it up to the
> > > > application? Any discussion/thoughts on this?
> > > >
> > > > --
> > > > Jvrao
> > > > ---
> > > > First they ignore you, then they laugh at you, then they fight you,
> > > > then you win. - Mahatma Gandhi
> > > >
> > >
> >
>
