[
https://issues.apache.org/jira/browse/BOOKKEEPER-424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475874#comment-13475874
]
Uma Maheswara Rao G commented on BOOKKEEPER-424:
------------------------------------------------
{quote}
Consider the corner case where the client is very busy (GCing, for example)
and its session expires before the new watcher is registered. Even then,
further down the layer, right after instantiation, the bookie uses the zkclient
to check the environment (as in the code below). That call would get a
ZKConnectionLossException and the bookie would consequently shut down.
this.zk = instantiateZookeeperClient(conf);
checkEnvironment(this.zk);
So IMHO we can continue without any extra checks. What's your
opinion?
{quote}
Here, by luck, we have ZK access because checkEnvironment fetches the Cookie
instance. Otherwise we may end up in a non-graceful shutdown, i.e. we would not
have executed any of the cleanup activities that are performed in shutdown.
Also, semantically, the checkEnvironment check is not meant for ZK validation;
it is for file system structure consistency. So I don't feel it is correct to
depend on that check. If tomorrow we change the checkEnvironment conditions, no
one will come back to this spot and take care of this ZK connection race,
right? It also seems right to me that checkEnvironment stays focused on its own
purpose and does not double as ZK handle validation.
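To make the point concrete, here is a rough sketch of what an explicit
connection check could look like, kept separate from checkEnvironment. The
names ConnectionGate and StartupSketch are hypothetical and not taken from any
of the attached patches; the sketch uses only java.util.concurrent and assumes
the ZK watcher would call onConnected() when it sees a connected event.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: lets startup wait until the ZK client reports a connection. */
final class ConnectionGate {
    private final CountDownLatch connected = new CountDownLatch(1);

    /** Assumed to be called from the ZK watcher when a connected event arrives. */
    void onConnected() {
        connected.countDown();
    }

    /** Returns true if connected within timeoutMillis, false on timeout or interrupt. */
    boolean awaitConnected(long timeoutMillis) {
        try {
            return connected.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            // Preserve the interrupt status for the caller and treat it as "not connected".
            Thread.currentThread().interrupt();
            return false;
        }
    }
}

/** Hypothetical startup step: validate the ZK handle explicitly before any other check. */
final class StartupSketch {
    /** Returns true if startup may proceed; otherwise runs cleanup and returns false. */
    static boolean start(ConnectionGate gate, long timeoutMillis, Runnable cleanup) {
        if (!gate.awaitConnected(timeoutMillis)) {
            // Explicit failure path, so the cleanup performed in shutdown is not skipped.
            cleanup.run();
            return false;
        }
        return true;
    }
}
```

With something like this, the ZK handle validation does not depend on what
checkEnvironment happens to touch, and the failure path always goes through the
cleanup callback.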
> Bookie start is failing intermittently when zkclient connection delays
> ----------------------------------------------------------------------
>
> Key: BOOKKEEPER-424
> URL: https://issues.apache.org/jira/browse/BOOKKEEPER-424
> Project: Bookkeeper
> Issue Type: Bug
> Components: bookkeeper-server
> Affects Versions: 4.0.0, 4.1.0
> Reporter: Rakesh R
> Assignee: Rakesh R
> Fix For: 4.2.0
>
> Attachments: BOOKKEEPER-424-1.patch, BOOKKEEPER-424-2.patch,
> BOOKKEEPER-424-3.patch
>
>
> I'm seeing the following intermittent failure, when there is a delay in
> establishing zkclient connection with zkserver.
> {code}
> org.apache.bookkeeper.bookie.BookieException$InvalidCookieException: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /ledgers/INSTANCEID
> at org.apache.bookkeeper.bookie.Bookie.checkEnvironment(Bookie.java:329)
> at org.apache.bookkeeper.bookie.Bookie.<init>(Bookie.java:378)
> at org.apache.bookkeeper.bookie.BookieInitializationTest.testStartBookieWithoutZKServer(BookieInitializationTest.java:253)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
> at java.lang.reflect.Method.invoke(Unknown Source)
> at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
> at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
> at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
> at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
> at org.junit.internal.runners.statements.FailOnTimeout$1.run(FailOnTimeout.java:28)
> Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /ledgers/INSTANCEID
> at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
> at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1131)
> at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1160)
> at org.apache.bookkeeper.bookie.Bookie.getInstanceId(Bookie.java:346)
> at org.apache.bookkeeper.bookie.Bookie.checkEnvironment(Bookie.java:280)
> ... 11 more
> {code}