[ 
https://issues.apache.org/jira/browse/BOOKKEEPER-294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13295434#comment-13295434
 ] 

Uma Maheswara Rao G commented on BOOKKEEPER-294:
------------------------------------------------

{code}
I think we need some file locking on the data directories to prevent two 
process accessing the data directories at the same time. but it would be in 
another jira.
{code}
This sounds like a very good point, Sijie. The NN (HDFS NameNode) also uses a 
file lock to protect its storage directories from other processes while it is 
running. If others also agree to introduce this file locking for the data 
directories, feel free to assign this task to me.
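
To make the idea concrete, here is a minimal sketch of such a directory lock 
using java.nio file locks. The class name DirLock and the lock file name 
in_use.lock are only assumptions for illustration; this is not existing 
BookKeeper code, and the real change would be discussed in the other jira.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper (not BookKeeper code): holds an exclusive lock on a
// lock file inside a data directory for as long as the object stays open.
final class DirLock implements AutoCloseable {
    private final FileChannel channel;
    private final FileLock lock;

    private DirLock(FileChannel channel, FileLock lock) {
        this.channel = channel;
        this.lock = lock;
    }

    // Returns a held lock, or null if the directory is already locked.
    static DirLock tryAcquire(Path dir) throws IOException {
        Path lockFile = dir.resolve("in_use.lock"); // assumed file name
        FileChannel ch = FileChannel.open(lockFile,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE);
        FileLock l = null;
        try {
            l = ch.tryLock(); // null when another process holds the lock
        } catch (OverlappingFileLockException e) {
            // This JVM already holds a lock on the same file.
        } finally {
            if (l == null) {
                ch.close(); // do not leak the channel on failure
            }
        }
        return l == null ? null : new DirLock(ch, l);
    }

    @Override
    public void close() throws IOException {
        lock.release();
        channel.close();
    }
}
```

A bookie could call DirLock.tryAcquire(dir) for each data directory at startup 
and refuse to start if any call returns null. Note that these locks are held 
on behalf of the whole JVM, so they guard against a second process, not 
against another thread in the same process.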


@Sijie

{code}
A better sequence to start the bookie is first start bookie, start NIOServer 
and register bookie.

If the start sequence is described as above, we could prevent two bookie 
servers running at same port, which could achieve the assumption I commented 
before. Even we could ensure such assumption, I prefer the wait/sleep proposal 
which is a safer way.
{code}

But both Bookie and NIOServer are threads, so we would again have to wait for 
their successful initialization before registering, right? I don't like adding 
a wait here again. Currently the DeathWatcher checks that they started 
successfully and shuts down if they did not. Am I missing something in your case?

How about waiting for DeathWatcherInterval+? Since the NIOServer can be started 
for the same bookie, it might have failed. So the DW should have detected that 
within the interval and will kill the fake bookie. If the bookie has still not 
been killed after DeathWatcherInterval+, it is a valid bookie. So, could bookie 
registration wait for DeathWatcherInterval+ before deleting the existing node? 
I am not sure this is a good suggestion, just a thought.
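
A rough sketch of that thought, only to show the control flow. The names 
DelayedRegistration, registerAfterGrace, serverAlive and register are 
hypothetical; in a real patch the check would ask the bookie/NIOServer threads 
whether they are still running, and the register step would delete the stale 
znode and create the new ephemeral one.

```java
import java.util.function.BooleanSupplier;

// Hypothetical sketch (not BookKeeper code): delay registration by a grace
// period slightly longer than the death watcher interval.
final class DelayedRegistration {
    // Sleeps for graceMillis, then runs the register step only if the
    // server is still alive. Returns true when registration was attempted.
    static boolean registerAfterGrace(long graceMillis,
                                      BooleanSupplier serverAlive,
                                      Runnable register)
            throws InterruptedException {
        Thread.sleep(graceMillis); // assumed to be DeathWatcherInterval plus a margin
        if (!serverAlive.getAsBoolean()) {
            return false; // a fake bookie was already killed, so never register it
        }
        register.run(); // e.g. replace the existing znode (assumption)
        return true;
    }
}
```

The cost is that every bookie start is delayed by the grace period, even when 
there is no stale node to replace.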
                
> Not able to start the bookkeeper before the ZK session timeout.
> ---------------------------------------------------------------
>
>                 Key: BOOKKEEPER-294
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-294
>             Project: Bookkeeper
>          Issue Type: Bug
>          Components: bookkeeper-server
>    Affects Versions: 4.1.0
>            Reporter: Gopinathan A
>            Assignee: Rakesh R
>             Fix For: 4.2.0, 4.1.1
>
>         Attachments: BOOKKEEPER-294.1.patch, BOOKKEEPER-294.2.patch, 
> BOOKKEEPER-294.3.patch, BOOKKEEPER-294.4.patch, BOOKKEEPER-294.patch, 
> BOOKKEEPER-294.patch
>
>
> Not able to start the bookkeeper before the ZK session timeout.
> Here i killed the bookie and started again.
> {noformat}
> 2012-06-12 20:00:25,220 - INFO  [main:LedgerCache@65] - openFileLimit is 900, 
> pageSize is 8192, pageLimit is 456781
> 2012-06-12 20:00:25,238 - ERROR [main:Bookie@453] - ZK exception registering 
> ephemeral Znode for Bookie!
> org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = 
> NodeExists for /ledgers/available/10.18.40.216:3181
>       at org.apache.zookeeper.KeeperException.create(KeeperException.java:119)
>       at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
>       at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:778)
>       at org.apache.bookkeeper.bookie.Bookie.registerBookie(Bookie.java:450)
>       at org.apache.bookkeeper.bookie.Bookie.<init>(Bookie.java:348)
>       at org.apache.bookkeeper.proto.BookieServer.<init>(BookieServer.java:64)
>       at org.apache.bookkeeper.proto.BookieServer.main(BookieServer.java:249)
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
