[ 
https://issues.apache.org/jira/browse/BOOKKEEPER-424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475874#comment-13475874
 ] 

Uma Maheswara Rao G commented on BOOKKEEPER-424:
------------------------------------------------

{quote}
Consider the corner case where the client is very busy (GCing, for example) 
and its session expires before the new watcher is registered. In that case, 
further down the layer, after instantiation, the bookie uses the zkclient to 
check the environment (as in the code below); there it would get a 
ZKConnectionLossException and consequently shut down.
        this.zk = instantiateZookeeperClient(conf);
        checkEnvironment(this.zk);
So IMHO, we can continue without any extra checks. What's your opinion?
{quote}

Here, by luck, ZK is accessed because the Cookie instance is fetched; 
otherwise, we might end up with a non-graceful shutdown, i.e. none of the 
cleanup activities performed in shutdown would have been executed. 
Also, semantically, checkEnvironment is not meant for ZK validation; it is 
meant for checking filesystem structure consistency. So I don't think it is 
correct to depend on that check. If we change the checkEnvironment conditions 
tomorrow, no one will come back to this spot and think about this ZK 
connection race. To me, checkEnvironment serves a different purpose and should 
not double as ZK handle validation.
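
To illustrate the point, here is a minimal sketch of validating the handle explicitly before any environment check, so that a connection failure goes through cleanup instead of surfacing inside an unrelated check. All names here (ConnectGuardSketch, FakeZkHandle, startBookie) are hypothetical stand-ins, not BookKeeper or ZooKeeper APIs, and the sketch only covers waiting for the initial connection, not session expiry.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch; none of these names come from the BookKeeper code base. */
final class ConnectGuardSketch {

    /** Stand-in for a ZooKeeper handle: the latch opens once the session is connected. */
    static final class FakeZkHandle {
        private final CountDownLatch connected = new CountDownLatch(1);
        private volatile boolean closed = false;

        /** In real code a watcher would call this when it sees the connected event. */
        void onConnected() {
            connected.countDown();
        }

        /** Blocks until connected or until the timeout elapses; true means connected. */
        boolean awaitConnected(long timeoutMs) throws InterruptedException {
            return connected.await(timeoutMs, TimeUnit.MILLISECONDS);
        }

        void close() {
            closed = true;
        }

        boolean isClosed() {
            return closed;
        }
    }

    /**
     * Validates the handle first. Returns false after cleaning up if the
     * connection was not established within the timeout.
     */
    static boolean startBookie(FakeZkHandle zk, long timeoutMs) throws InterruptedException {
        if (!zk.awaitConnected(timeoutMs)) {
            zk.close(); // graceful cleanup on the explicit validation path
            return false;
        }
        // An environment check (filesystem consistency) would run here,
        // independent of whether it happens to touch ZK.
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        FakeZkHandle zk = new FakeZkHandle();
        zk.onConnected();
        System.out.println("started: " + startBookie(zk, 100));
    }
}
```

With this shape, changing the environment check later does not affect how a delayed or missing connection is detected.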
                
> Bookie start is failing intermittently when zkclient connection delays
> ----------------------------------------------------------------------
>
>                 Key: BOOKKEEPER-424
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-424
>             Project: Bookkeeper
>          Issue Type: Bug
>          Components: bookkeeper-server
>    Affects Versions: 4.0.0, 4.1.0
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>             Fix For: 4.2.0
>
>         Attachments: BOOKKEEPER-424-1.patch, BOOKKEEPER-424-2.patch, 
> BOOKKEEPER-424-3.patch
>
>
> I'm seeing the following intermittent failure, when there is a delay in 
> establishing zkclient connection with zkserver. 
> {code}
> org.apache.bookkeeper.bookie.BookieException$InvalidCookieException: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /ledgers/INSTANCEID
>       at org.apache.bookkeeper.bookie.Bookie.checkEnvironment(Bookie.java:329)
>       at org.apache.bookkeeper.bookie.Bookie.<init>(Bookie.java:378)
>       at org.apache.bookkeeper.bookie.BookieInitializationTest.testStartBookieWithoutZKServer(BookieInitializationTest.java:253)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>       at java.lang.reflect.Method.invoke(Unknown Source)
>       at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
>       at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
>       at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
>       at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
>       at org.junit.internal.runners.statements.FailOnTimeout$1.run(FailOnTimeout.java:28)
> Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /ledgers/INSTANCEID
>       at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
>       at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
>       at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1131)
>       at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1160)
>       at org.apache.bookkeeper.bookie.Bookie.getInstanceId(Bookie.java:346)
>       at org.apache.bookkeeper.bookie.Bookie.checkEnvironment(Bookie.java:280)
>       ... 11 more
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
