[
https://issues.apache.org/jira/browse/BOOKKEEPER-294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13295029#comment-13295029
]
Ivan Kelly commented on BOOKKEEPER-294:
---------------------------------------
Yes, kill -9 will skip the shutdown hook, but I think this is ok. If you're
using kill -9, there's a problem, and you shouldn't be starting a bookie right
away after that. I think having to wait in situations like this is preferable
to possibly being able to delete someone else's availability znode.

The solution in the patch is actually broken. Bookie#start is called before
NIOServerFactory#start. This means that if you try to start a bookie on a
machine where a bookie is already running, the running bookie's availability
znode is deleted and the new bookie creates its own. The new bookie then tries
to start NIOServerFactory, fails because the socket is already bound, and
crashes, taking its availability znode with it. The initial bookie is now
running without an availability znode, so no one can contact it.

If just the shutdown hook isn't enough, I suggest checking whether the znode
exists and, if it does, calling Thread.sleep for zkTimeout and trying again.
Again though, kill -9 should be a very rare case.
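
The check-then-wait idea above could look roughly like the sketch below. It is
a sketch under assumptions, not the committed fix: ZnodeStore is a hypothetical
interface standing in for the ZooKeeper client, and registerWithRetry,
zkTimeoutMs and maxAttempts are names made up for this example. The important
property is that it never deletes an existing znode; it only waits for a stale
one to expire.

```java
/** Hypothetical sketch of "check, wait one session timeout, retry". */
final class BookieRegistration {

    /** Stand-in for the two ZooKeeper operations needed here (assumption). */
    interface ZnodeStore {
        boolean exists(String path);
        void createEphemeral(String path);
    }

    /**
     * Registers the availability znode if it is absent. If it is present
     * (for example, left over from a bookie killed with kill -9), sleeps for
     * the session timeout so the stale ephemeral node can expire, then retries.
     *
     * @return true if the znode was created, false if it still existed after
     *         maxAttempts checks or the thread was interrupted while waiting
     */
    static boolean registerWithRetry(ZnodeStore zk, String path,
                                     long zkTimeoutMs, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (!zk.exists(path)) {
                zk.createEphemeral(path);
                return true;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(zkTimeoutMs);
                } catch (InterruptedException e) {
                    // Preserve the interrupt status and stop retrying.
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
        }
        return false;
    }
}
```

If the znode is still there after the wait, the most likely explanation is that
another bookie really is running, so giving up is the safe outcome. A real
implementation would also have to handle the create racing with another
process between the exists check and the create call.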
> Not able to start the bookkeeper before the ZK session timeout.
> ---------------------------------------------------------------
>
> Key: BOOKKEEPER-294
> URL: https://issues.apache.org/jira/browse/BOOKKEEPER-294
> Project: Bookkeeper
> Issue Type: Bug
> Components: bookkeeper-server
> Affects Versions: 4.1.0
> Reporter: Gopinathan A
> Assignee: Rakesh R
> Fix For: 4.2.0, 4.1.1
>
> Attachments: BOOKKEEPER-294.1.patch, BOOKKEEPER-294.2.patch,
> BOOKKEEPER-294.3.patch, BOOKKEEPER-294.4.patch, BOOKKEEPER-294.patch,
> BOOKKEEPER-294.patch
>
>
> Not able to start the bookkeeper before the ZK session timeout.
> Here I killed the bookie and started it again.
> {noformat}
> 2012-06-12 20:00:25,220 - INFO [main:LedgerCache@65] - openFileLimit is 900,
> pageSize is 8192, pageLimit is 456781
> 2012-06-12 20:00:25,238 - ERROR [main:Bookie@453] - ZK exception registering
> ephemeral Znode for Bookie!
> org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode =
> NodeExists for /ledgers/available/10.18.40.216:3181
> at org.apache.zookeeper.KeeperException.create(KeeperException.java:119)
> at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:778)
> at org.apache.bookkeeper.bookie.Bookie.registerBookie(Bookie.java:450)
> at org.apache.bookkeeper.bookie.Bookie.<init>(Bookie.java:348)
> at org.apache.bookkeeper.proto.BookieServer.<init>(BookieServer.java:64)
> at org.apache.bookkeeper.proto.BookieServer.main(BookieServer.java:249)
> {noformat}