On Tue, Jun 14, 2016 at 2:30 PM, Arun M. Krishnakumar <arunm...@gmail.com> wrote:
> Hi Sijie,
>
> I believe the ZooKeeperClient class handles the server connections and we
> haven't faced issues with that. Could you please confirm ?
>

Yes. It handles session expiry and recreates the connections.

>
> The issue was with the client connection in the AbstractZkLedgerManager
> class as you mentioned above. The twitter branch fix seems to recreate the
> listeners and reestablish state. Could you please push it to the community ?
>

Yes. I will do that.

>
> Thanks,
> Arun
>
> On Tue, Jun 14, 2016 at 2:17 PM, Sijie Guo <si...@apache.org> wrote:
>
> > Arun, what did you observe?
> >
> > I think we already handle session expiry and zookeeper connection
> > recreation in the ZooKeeperClient wrapper:
> >
> > https://github.com/apache/bookkeeper/blob/master/bookkeeper-server/src/main/java/org/apache/bookkeeper/zookeeper/ZooKeeperClient.java
> >
> > We need to uncomment the code in Line 168.
> >
> > https://github.com/apache/bookkeeper/blob/master/bookkeeper-server/src/main/java/org/apache/bookkeeper/meta/AbstractZkLedgerManager.java#L168
> >
> > (The change in twitter's branch does those retries:
> >
> > https://github.com/twitter/bookkeeper/blob/master/bookkeeper-server/src/main/java/org/apache/bookkeeper/meta/AbstractZkLedgerManager.java#L211
> > )
> >
> > - Sijie
> >
> > On Wed, Jun 8, 2016 at 12:46 AM, Arun M. Krishnakumar <arunm...@gmail.com> wrote:
> >
> > > Thanks for the pointer, Uma Gangumalla.
> > >
> > > Could you please give an overview of the fix in HDFS-3562 ?
> > >
> > > In the case of Bookkeeper-client, the interesting things are the
> > > watches created by the ReadOnlyLedgerHandle on the relevant zookeeper
> > > nodes. We would lose the notifications that happen during the timeout.
> > > What would be the best way to proceed in such scenarios ? Should we
> > > reconstruct the state ? Is there any other such state that needs to be
> > > considered ?
> > >
> > > Thanks,
> > > Arun
> > >
> > > On Tue, Jun 7, 2016 at 3:40 PM, Uma gangumalla <umamah...@apache.org> wrote:
> > >
> > > > Good point, Venkateswara Rao.
> > > >
> > > > Some time ago, we worked on these scenarios. A patch is available
> > > > in HDFS-3562. There we just tried to keep the handling on the
> > > > application side. But as a long term solution, could this be placed
> > > > on the BK side as a utility module, so that all applications can
> > > > benefit?
> > > >
> > > > Note: As I remember, the RetryableZookeeper idea was taken from HBase.
> > > >
> > > > Regards,
> > > > Uma
> > > >
> > > > On Mon, Jun 6, 2016 at 9:42 AM, Venkateswara Rao Jujjuri <jujj...@gmail.com> wrote:
> > > >
> > > > > If a bookie loses connection with ZK, the connection gets
> > > > > reestablished and life goes on. How are we handling it in the
> > > > > client case? Should we retry at library level, or leave it up to
> > > > > the application? Any discussion/thoughts on this?
> > > > >
> > > > > --
> > > > > Jvrao
> > > > > ---
> > > > > First they ignore you, then they laugh at you, then they fight you,
> > > > > then you win. - Mahatma Gandhi
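
For readers following the thread, the recovery idea discussed above (keep the listeners across sessions, then re-read state and re-arm the watches once the session has been recreated) can be sketched roughly as below. This is only an illustration under stated assumptions: MetadataSession, MetadataListener, ListenerRegistry and FakeSession are hypothetical names invented for this sketch, and none of it is the BookKeeper or ZooKeeper API or the actual change in either branch. The path and the state strings are illustrative values.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Sketch only: every type here is a hypothetical stand-in, not a real library API.
class WatchRecoverySketch {

    /** Stand-in for a metadata session: reads a value and arms a one-shot watch. */
    interface MetadataSession {
        String readAndWatch(String path, Runnable onChange);
    }

    /** Callback a ledger handle might register to follow metadata changes. */
    interface MetadataListener {
        void onMetadata(String path, String value);
    }

    /** Remembers listeners across sessions so watches can be re-armed after expiry. */
    static class ListenerRegistry {
        private final Map<String, List<MetadataListener>> listeners = new ConcurrentHashMap<>();
        private volatile MetadataSession session;

        ListenerRegistry(MetadataSession session) {
            this.session = session;
        }

        /** Adds a listener, then reads the path and notifies all listeners on it. */
        void register(String path, MetadataListener listener) {
            listeners.computeIfAbsent(path, p -> new CopyOnWriteArrayList<>()).add(listener);
            readAndNotify(path);
        }

        /** Call after the old session expired and a new one has been created. */
        void onSessionRecreated(MetadataSession fresh) {
            this.session = fresh;
            // Watches died with the old session: re-read every path (which picks up
            // any update missed in the gap) and arm a new watch on the new session.
            for (String path : listeners.keySet()) {
                readAndNotify(path);
            }
        }

        private void readAndNotify(String path) {
            MetadataSession current = session;
            String value = current.readAndWatch(path, () -> {
                // Ignore watch events from a session that has since been replaced.
                if (current == session) {
                    readAndNotify(path);
                }
            });
            for (MetadataListener l : listeners.getOrDefault(path, Collections.emptyList())) {
                l.onMetadata(path, value);
            }
        }
    }

    /** In-memory fake used only to exercise the sketch. */
    static class FakeSession implements MetadataSession {
        private final Map<String, String> store;
        private final Map<String, Runnable> watches = new ConcurrentHashMap<>();

        FakeSession(Map<String, String> store) {
            this.store = store;
        }

        @Override
        public String readAndWatch(String path, Runnable onChange) {
            watches.put(path, onChange);
            return store.get(path);
        }

        /** Writes through this session and fires the one-shot watch, if armed. */
        void write(String path, String value) {
            store.put(path, value);
            Runnable watch = watches.remove(path);
            if (watch != null) {
                watch.run();
            }
        }
    }

    public static void main(String[] args) {
        Map<String, String> store = new ConcurrentHashMap<>();
        store.put("/ledgers/L1", "OPEN");
        FakeSession first = new FakeSession(store);
        ListenerRegistry registry = new ListenerRegistry(first);

        List<String> seen = new ArrayList<>();
        registry.register("/ledgers/L1", (path, value) -> seen.add(value));
        first.write("/ledgers/L1", "IN_RECOVERY");  // delivered through the watch
        store.put("/ledgers/L1", "CLOSED");         // happens in the session gap: no event
        registry.onSessionRecreated(new FakeSession(store)); // re-read recovers it
        System.out.println(seen); // prints [OPEN, IN_RECOVERY, CLOSED]
    }
}
```

The point of the sketch is that the update made during the gap is never delivered as a watch event. It is only observed because the registry re-reads every watched path when the new session arrives. Whether listeners should be replayed the current value unconditionally, or only when it differs from the last value they saw, is a design choice this sketch leaves open.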