[ 
https://issues.apache.org/jira/browse/BOOKKEEPER-29?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13078592#comment-13078592
 ] 

Ivan Kelly commented on BOOKKEEPER-29:
--------------------------------------

{quote}
We shouldn't expect to have all servers from time zero. One of the reasons why 
we get bookies available through zookeeper is exactly to have the ability to 
remove and add servers dynamically.
{quote}
Well, I don't exactly expect them to exist from time zero as such. But I would 
expect them to exist before the failure. A scenario where you add a bookie 
after a failure and expect the recovery to be seamless seems unrealistic to me.

{quote}
As for how to fix it, I was thinking that we could set a watch on the bookies 
znode before starting the bookie, and blocking until we get the notification. 
How does it sound to you?
{quote}
There is already a watch on the bookie znodes: the bookie watcher on the 
client. I don't know how to wait on the notification without making the bookie 
watcher public. However, the zk.sync() I mentioned in my previous comment could 
be functionally equivalent, no?
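
For reference, the "set a watch before starting the bookie, then block until 
the notification arrives" idea can be sketched as below. This is only an 
illustration under stated assumptions: FakeRegistry is a hypothetical in-memory 
stand-in for the children of the bookies znode, and startAndAwait is a 
hypothetical helper. Neither is part of the ZooKeeper or BookKeeper API, and 
the sketch says nothing about whether zk.sync() gives the same guarantee.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

public class BookieRegistrationWait {

    // Hypothetical in-memory stand-in for the children of the bookies znode.
    // A real test would talk to ZooKeeper instead.
    static final class FakeRegistry {
        private final Set<String> bookies = ConcurrentHashMap.newKeySet();
        private final List<Consumer<String>> watchers = new CopyOnWriteArrayList<>();

        // Register a callback that is invoked for every newly added bookie.
        void watch(Consumer<String> watcher) {
            watchers.add(watcher);
        }

        // Add a bookie and notify all watchers, like a children-changed event.
        void register(String bookie) {
            bookies.add(bookie);
            for (Consumer<String> w : watchers) {
                w.accept(bookie);
            }
        }

        boolean contains(String bookie) {
            return bookies.contains(bookie);
        }
    }

    // Set the watch first, then start the bookie, then block until the
    // notification for that bookie arrives or the timeout expires.
    // Returns true if the bookie was seen before the timeout.
    static boolean startAndAwait(FakeRegistry registry, String bookie,
                                 Runnable startBookie, long timeoutMs)
            throws InterruptedException {
        CountDownLatch registered = new CountDownLatch(1);
        registry.watch(name -> {
            if (name.equals(bookie)) {
                registered.countDown();
            }
        });
        startBookie.run();
        return registered.await(timeoutMs, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        FakeRegistry registry = new FakeRegistry();
        String bookie = "127.0.0.1:5001"; // illustrative address only
        // The bookie registers itself asynchronously, as a real one would.
        boolean seen = startAndAwait(registry, bookie,
                () -> new Thread(() -> registry.register(bookie)).start(), 5000);
        System.out.println("registered before timeout: " + seen);
    }
}
```

Installing the watch before starting the bookie is what avoids the race: if 
the order were reversed, the registration could happen before anyone was 
listening and the wait would only end on timeout.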

> BookieRecoveryTest fails intermittently
> ---------------------------------------
>
>                 Key: BOOKKEEPER-29
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-29
>             Project: Bookkeeper
>          Issue Type: Bug
>            Reporter: Ivan Kelly
>             Fix For: 3.4.0
>
>         Attachments: BK-29.diff, BK-29.diff, 
> org.apache.bookkeeper.test.BookieRecoveryTest.txt
>
>
> The failure doesn't hit every time; you have to run the test multiple times. 
> From bookkeeper-server, run mvn test -Dtest=BookieRecoveryTest repeatedly to 
> reproduce it.
> Test output is attached.
> -------------------------------------------------------
>  T E S T S
> -------------------------------------------------------
> Running org.apache.bookkeeper.test.BookieRecoveryTest
> log4j:WARN No appenders could be found for logger (org.apache.bookkeeper.test.BaseTestCase).
> log4j:WARN Please initialize the log4j system properly.
> Tests run: 8, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 6.794 sec <<< FAILURE!
> Results :
> Tests in error: 
>   testAsyncBookieRecoveryToSpecificBookie[1](org.apache.bookkeeper.test.BookieRecoveryTest)
> Tests run: 8, Failures: 0, Errors: 1, Skipped: 0

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
