[
https://issues.apache.org/jira/browse/BOOKKEEPER-63?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13095980#comment-13095980
]
Ivan Kelly commented on BOOKKEEPER-63:
--------------------------------------
This patch fails:
Tests in error:
testSyncPublish(org.apache.hedwig.client.TestPubSubClient): Could not
establish connection with ZooKeeper after zk_timeout*2 = 4000 ms. (Default
value for zk_timeout is 2000).
testAsyncPublish(org.apache.hedwig.client.TestPubSubClient): Could not
establish connection with ZooKeeper after zk_timeout*2 = 4000 ms. (Default
value for zk_timeout is 2000).
testMultipleAsyncPublish(org.apache.hedwig.client.TestPubSubClient): Could
not establish connection with ZooKeeper after zk_timeout*2 = 4000 ms. (Default
value for zk_timeout is 2000).
testSyncSubscribe(org.apache.hedwig.client.TestPubSubClient): Could not
establish connection with ZooKeeper after zk_timeout*2 = 4000 ms. (Default
value for zk_timeout is 2000).
testAsyncSubscribe(org.apache.hedwig.client.TestPubSubClient): Could not
establish connection with ZooKeeper after zk_timeout*2 = 4000 ms. (Default
value for zk_timeout is 2000).
testSubscribeAndConsume(org.apache.hedwig.client.TestPubSubClient): Could not
establish connection with ZooKeeper after zk_timeout*2 = 4000 ms. (Default
value for zk_timeout is 2000).
testAsyncSubscribeAndUnsubscribe(org.apache.hedwig.client.TestPubSubClient):
Could not establish connection with ZooKeeper after zk_timeout*2 = 4000 ms.
(Default value for zk_timeout is 2000).
testSyncUnsubscribeWithoutSubscription(org.apache.hedwig.client.TestPubSubClient):
Could not establish connection with ZooKeeper after zk_timeout*2 = 4000 ms.
(Default value for zk_timeout is 2000).
testAsyncSubscribeAndCloseSubscription(org.apache.hedwig.client.TestPubSubClient):
Could not establish connection with ZooKeeper after zk_timeout*2 = 4000 ms.
(Default value for zk_timeout is 2000).
testSecondServer(org.apache.hedwig.server.netty.TestPubSubServer): Could not
establish connection with ZooKeeper after zk_timeout*2 = 4000 ms. (Default
value for zk_timeout is 2000).
testUncaughtExceptionInNettyThread(org.apache.hedwig.server.netty.TestPubSubServer):
Could not establish connection with ZooKeeper after zk_timeout*2 = 4000 ms.
(Default value for zk_timeout is 2000).
testUncaughtExceptionInZKThread(org.apache.hedwig.server.netty.TestPubSubServer):
Could not establish connection with ZooKeeper after zk_timeout*2 = 4000 ms.
(Default value for zk_timeout is 2000).
testInvalidServerConfiguration(org.apache.hedwig.server.netty.TestPubSubServer):
Could not establish connection with ZooKeeper after zk_timeout*2 = 4000 ms.
(Default value for zk_timeout is 2000).
testValidServerConfiguration(org.apache.hedwig.server.netty.TestPubSubServer):
Could not establish connection with ZooKeeper after zk_timeout*2 = 4000 ms.
(Default value for zk_timeout is 2000).
testNonEmptyDirtyLedger(org.apache.hedwig.server.persistence.TestBookkeeperPersistenceManagerWhiteBox)
> Hedwig PubSubServer must wait for its Zookeeper client to be connected upon
> startup
> -----------------------------------------------------------------------------------
>
> Key: BOOKKEEPER-63
> URL: https://issues.apache.org/jira/browse/BOOKKEEPER-63
> Project: Bookkeeper
> Issue Type: Bug
> Components: hedwig-server
> Reporter: Matthieu Morel
> Priority: Minor
> Attachments: BOOKKEEPER-63.patch, patch-testcase.txt, patch-v2.txt,
> patch.txt
>
>
> When a PubSubServer is instantiated in *non-standalone* mode, it creates a
> ZkTopicManager which takes a Zookeeper client as an argument.
> However, this Zookeeper client may not yet be in the CONNECTED state.
> When that is the case, creation of the ZkTopicManager fails, and the
> PubSubServer startup fails with it.
> Typical error (adapted, line numbers take into account commented patching
> code):
> java.io.IOException:
> org.apache.hedwig.exceptions.PubSubException$ServiceDownException:
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode
> = ConnectionLoss for /hedwig/standalone/hosts/x.x.x.x:4080:9876
> at org.apache.hedwig.server.netty.PubSubServer.instantiateTopicManager(PubSubServer.java:170)
> at org.apache.hedwig.server.netty.PubSubServer$3.run(PubSubServer.java:294)
> at java.lang.Thread.run(Thread.java:680)
> Caused by: org.apache.hedwig.exceptions.PubSubException$ServiceDownException:
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode
> = ConnectionLoss for /hedwig/standalone/hosts/x.x.x.x:4080:9876
> at org.apache.hedwig.server.topics.ZkTopicManager$4.safeProcessResult(ZkTopicManager.java:146)
> etc...
> This is particularly problematic when running tests that need to pass a
> config to the PubSubServer.
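
For illustration only, the wait described in the issue could be sketched with
a latch that is released once the client reports a connection, with the wait
bounded by zk_timeout*2 as in the test errors above. This is not the attached
patch: ConnectionGate, markConnected and awaitConnected are hypothetical
names, and the sketch uses only the JDK. In a real server, markConnected()
would presumably be called from the ZooKeeper Watcher when it receives an
event whose state is KeeperState.SyncConnected.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/**
 * Hypothetical helper (not part of Hedwig or ZooKeeper): lets server startup
 * block until a connection callback has fired, or give up after a timeout.
 */
final class ConnectionGate {
    // Released exactly once, when the client reports that it is connected.
    private final CountDownLatch connected = new CountDownLatch(1);

    /** Call from the client's connection callback once it reports "connected". */
    void markConnected() {
        connected.countDown();
    }

    /**
     * Waits up to zkTimeoutMs * 2 milliseconds for markConnected().
     * Returns true if the connection was reported in time, false on timeout
     * or if the waiting thread was interrupted.
     */
    boolean awaitConnected(long zkTimeoutMs) {
        try {
            return connected.await(zkTimeoutMs * 2, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            // Preserve the interrupt for the caller and report failure.
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

A caller would check the result of awaitConnected(zkTimeout) before creating
the topic manager, and fail startup with an explicit error when it returns
false instead of running into ConnectionLoss later.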