[ https://issues.apache.org/jira/browse/BOOKKEEPER-424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13471423#comment-13471423 ]
Uma Maheswara Rao G commented on BOOKKEEPER-424:
------------------------------------------------
@Rakesh, thanks for the patch.
I have a comment on the patch.
{code}
// register watcher for receiving expired event
+ newZk.register(watcher);
return newZk;
{code}
There is a small race here, which may result in missed events.
The race is between the expired event and the re-registration. The original
watcher from the newConnectedZK method in ZKutil does not handle expired events.
So, if the session expires just before we re-register, we may miss handling that
event. Later operations may then throw connection loss exceptions, and proper
cleanup may be skipped.
So, how about adding an isConnected check after re-registering the watcher?
If ZK is not connected and its state shows that the session has expired, then
shut down, just as the watcher does for that event. This way we will not miss
any events, right? (Assuming re-registration does not reset any event state on
the ZK object.)
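
To make the suggestion concrete, here is a minimal sketch of "register, then re-check the state". It deliberately does not use the real ZooKeeper classes: SketchClient, its State enum, Listener and registerWithCheck are all hypothetical stand-ins, invented only to show the ordering. Which state the real client reports after a session expiry, and whether re-registration preserves it, would need to be checked against the ZooKeeper version in use.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for a ZooKeeper-like client. NOT the real ZooKeeper API.
class SketchClient {
    enum State { CONNECTING, CONNECTED, EXPIRED }

    interface Listener { void onExpired(); }

    private volatile State state = State.CONNECTED;
    // The initial listener ignores expiry, like the original connection watcher.
    private volatile Listener listener = () -> { };

    void register(Listener l) { listener = l; }

    State state() { return state; }

    // Simulates session expiry: the state changes first, then whichever
    // listener is registered at that moment is notified.
    void expire() {
        state = State.EXPIRED;
        listener.onExpired();
    }
}

class ExpiryCheckSketch {
    // Registers the expiry handler, then re-checks the client state so that an
    // expiry delivered before registration still triggers shutdown.
    static void registerWithCheck(SketchClient client, Runnable shutdown) {
        AtomicBoolean done = new AtomicBoolean(false);
        // Guard so shutdown runs at most once, whether it is triggered by the
        // listener or by the explicit check below.
        Runnable once = () -> {
            if (done.compareAndSet(false, true)) {
                shutdown.run();
            }
        };
        client.register(once::run);
        if (client.state() == SketchClient.State.EXPIRED) {
            once.run();
        }
    }

    public static void main(String[] args) {
        SketchClient client = new SketchClient();
        client.expire(); // expiry lands before the new listener is registered
        registerWithCheck(client, () -> System.out.println("shutting down"));
    }
}
```

The at-most-once guard matters because the expiry could also arrive between register and the state check, in which case both paths would otherwise call shutdown.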
> Bookie start is failing intermittently when zkclient connection delays
> ----------------------------------------------------------------------
>
> Key: BOOKKEEPER-424
> URL: https://issues.apache.org/jira/browse/BOOKKEEPER-424
> Project: Bookkeeper
> Issue Type: Bug
> Components: bookkeeper-server
> Affects Versions: 4.0.0, 4.1.0
> Reporter: Rakesh R
> Assignee: Rakesh R
> Fix For: 4.2.0
>
> Attachments: BOOKKEEPER-424-1.patch, BOOKKEEPER-424.patch
>
>
> I'm seeing the following intermittent failure, when there is a delay in
> establishing zkclient connection with zkserver.
> {code}
> org.apache.bookkeeper.bookie.BookieException$InvalidCookieException: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /ledgers/INSTANCEID
> at org.apache.bookkeeper.bookie.Bookie.checkEnvironment(Bookie.java:329)
> at org.apache.bookkeeper.bookie.Bookie.<init>(Bookie.java:378)
> at org.apache.bookkeeper.bookie.BookieInitializationTest.testStartBookieWithoutZKServer(BookieInitializationTest.java:253)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
> at java.lang.reflect.Method.invoke(Unknown Source)
> at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
> at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
> at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
> at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
> at org.junit.internal.runners.statements.FailOnTimeout$1.run(FailOnTimeout.java:28)
> Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /ledgers/INSTANCEID
> at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
> at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1131)
> at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1160)
> at org.apache.bookkeeper.bookie.Bookie.getInstanceId(Bookie.java:346)
> at org.apache.bookkeeper.bookie.Bookie.checkEnvironment(Bookie.java:280)
> ... 11 more
> {code}