[ https://issues.apache.org/jira/browse/BOOKKEEPER-112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13195727#comment-13195727 ]

[email protected] commented on BOOKKEEPER-112:
----------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3472/
-----------------------------------------------------------

(Updated 2012-01-29 09:58:30.513187)


Review request for bookkeeper.


Changes
-------

We need to check the ledger metadata status before proceeding with the 
recovery action.

For OPENED ledgers:
1) If the last ensemble contains the failed bookie, we should not proceed with 
recovery, since we can't guarantee that the last entry is fully replicated 
(there may also be other side effects).
2) If the last ensemble doesn't contain the failed bookie, it is safe to 
proceed with recovery.

For IN_RECOVERY ledgers, we have to check whether the last ensemble contains 
the failed bookie. If it does, the recovery tool has to help close the ledger, 
since the normal bookkeeper client may fail to close it. (A corner case: 3 
bookies (bk1, bk2, bk3), quorum size 3, ensemble size 3, and no entry has been 
written. bk3 fails. bk1 and bk2 return NoEntry and bk3 returns 
HandleNotAvailable, so the ledger can't be closed.)

For case 2) of OPENED ledgers, both PendingAddOp and BookKeeperAdmin need to 
rereadMetadata when encountering BADVERSION and try to resolve the conflict, 
to avoid closing the ledger.
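
To make the rules above easier to check, here is a minimal sketch of the 
decision as a pure function. The class, enum and method names are hypothetical 
and are not taken from the patch. The OPENED branches and the IN_RECOVERY 
"contains" branch follow the text above; the remaining branches (IN_RECOVERY 
without the failed bookie, and CLOSED) are my assumption that plain recovery 
is fine there.

```java
import java.util.Arrays;
import java.util.List;

public class RecoveryDecision {
    // Hypothetical names for illustration only; not BookKeeper's real types.
    enum LedgerState { OPENED, IN_RECOVERY, CLOSED }
    enum Action { SKIP, RECOVER, CLOSE_LEDGER_FIRST }

    /** Decide what the recovery tool does for one ledger and one failed bookie. */
    static Action decide(LedgerState state, List<String> lastEnsemble, String failedBookie) {
        boolean inLastEnsemble = lastEnsemble.contains(failedBookie);
        switch (state) {
            case OPENED:
                // The last entry may not be fully replicated yet, so leave it alone.
                return inLastEnsemble ? Action.SKIP : Action.RECOVER;
            case IN_RECOVERY:
                // A normal client may be unable to close it, so the tool helps.
                // The non-matching branch is an assumption, not stated in the review.
                return inLastEnsemble ? Action.CLOSE_LEDGER_FIRST : Action.RECOVER;
            default:
                // CLOSED ledgers: assumed safe to recover (not stated in the review).
                return Action.RECOVER;
        }
    }

    public static void main(String[] args) {
        List<String> lastEnsemble = Arrays.asList("bk1", "bk2", "bk3");
        System.out.println(decide(LedgerState.OPENED, lastEnsemble, "bk3"));
        System.out.println(decide(LedgerState.OPENED, lastEnsemble, "bk9"));
        System.out.println(decide(LedgerState.IN_RECOVERY, lastEnsemble, "bk3"));
    }
}
```

With the sample ensemble this prints SKIP, RECOVER and CLOSE_LEDGER_FIRST, one 
per line.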


Summary
-------

Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not 
get notified of this update, so it will try to write out its own ledger 
metadata, only to fail with KeeperException.BadVersion. This effectively fences 
all write operations on the LedgerHandle (close and addEntry). close will fail 
for obvious reasons. addEntry will fail once it gets to the failed bookie in 
the schedule, tries to write, fails, selects a new bookie and tries to update 
ledger metadata.
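
The failure mode described above, and the reread-on-BADVERSION idea from the 
Changes section, can be illustrated with a small in-memory model. This is only 
a sketch: VersionedStore, Snapshot and replaceBookie are hypothetical stand-ins 
for the versioned znode and the client code, not the ZooKeeper or BookKeeper 
API, and the rule "if the failed bookie is already gone, accept the stored 
ensemble" is my assumption about how the conflict could be resolved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MetadataRetrySketch {
    /** Stand-in for ZooKeeper's BadVersion error on a conditional write. */
    static class BadVersionException extends Exception {
        BadVersionException(String msg) { super(msg); }
    }

    /** An ensemble together with the version it was read at. */
    static class Snapshot {
        final List<String> ensemble;
        final int version;
        Snapshot(List<String> ensemble, int version) {
            this.ensemble = new ArrayList<>(ensemble);
            this.version = version;
        }
    }

    /** In-memory model of a versioned metadata node (hypothetical). */
    static class VersionedStore {
        private Snapshot current;
        VersionedStore(List<String> ensemble) { current = new Snapshot(ensemble, 0); }
        synchronized Snapshot read() { return current; }
        synchronized void write(List<String> ensemble, int expectedVersion)
                throws BadVersionException {
            if (expectedVersion != current.version) {
                throw new BadVersionException("expected " + expectedVersion
                        + " but store is at " + current.version);
            }
            current = new Snapshot(ensemble, expectedVersion + 1);
        }
    }

    /** Replace a failed bookie, rereading the metadata whenever the write is stale. */
    static List<String> replaceBookie(VersionedStore store, Snapshot cached,
                                      String failed, String replacement) {
        Snapshot view = cached;
        while (true) {
            int idx = view.ensemble.indexOf(failed);
            if (idx < 0) {
                return view.ensemble; // someone else already replaced it
            }
            List<String> updated = new ArrayList<>(view.ensemble);
            updated.set(idx, replacement);
            try {
                store.write(updated, view.version);
                return updated;
            } catch (BadVersionException e) {
                view = store.read(); // stale view: reread, then re-evaluate
            }
        }
    }

    public static void main(String[] args) throws Exception {
        VersionedStore store = new VersionedStore(Arrays.asList("bk1", "bk2", "bk3"));
        Snapshot cached = store.read(); // the writer's view
        // The recovery tool replaces bk3 behind the writer's back.
        store.write(Arrays.asList("bk1", "bk2", "bk4"), cached.version);
        // The writer's own update is now stale; it rereads instead of failing.
        System.out.println(replaceBookie(store, cached, "bk3", "bk5"));
    }
}
```

Running main prints [bk1, bk2, bk4]: the writer's first write is rejected, it 
rereads, finds bk3 already replaced, and carries on with the stored ensemble. 
Without the reread, that first rejected write is where close and addEntry 
would fail.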

Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
Also, uncomment addEntry in 
TestFencing#testFencingInteractionWithBookieRecovery()


This addresses bug BOOKKEEPER-112.
    https://issues.apache.org/jira/browse/BOOKKEEPER-112


Diffs (updated)
-----

  bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeper.java 5bb37c3 
  bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java 37623dc 
  bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java 547e240 
  bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerMetadata.java b403aa1 
  bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerOpenOp.java 56186ab 
  bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerRecoveryOp.java 4625bbb 
  bookkeeper-server/src/main/java/org/apache/bookkeeper/client/PendingReadOp.java 29070eb 
  bookkeeper-server/src/main/java/org/apache/bookkeeper/client/ReadLastConfirmedOp.java 43e999d 
  bookkeeper-server/src/test/java/org/apache/bookkeeper/client/BookieRecoveryTest.java 8526db5 
  bookkeeper-server/src/test/java/org/apache/bookkeeper/client/TestFencing.java 015e4e4 
  bookkeeper-server/src/test/java/org/apache/bookkeeper/test/LedgerOpenTest.java PRE-CREATION 

Diff: https://reviews.apache.org/r/3472/diff


Testing
-------


Thanks,

Sijie


                
> Bookie Recovery on an open ledger will cause LedgerHandle#close on that 
> ledger to fail
> --------------------------------------------------------------------------------------
>
>                 Key: BOOKKEEPER-112
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-112
>             Project: Bookkeeper
>          Issue Type: Bug
>            Reporter: Flavio Junqueira
>            Assignee: Ivan Kelly
>             Fix For: 4.1.0
>
>         Attachments: BK-112.patch, BOOKKEEPER-112.patch
>
>
> Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will 
> not get notified of this update, so it will try to write out its own ledger 
> metadata, only to fail with KeeperException.BadVersion. This effectively 
> fences all write operations on the LedgerHandle (close and addEntry). close 
> will fail for obvious reasons. addEntry will fail once it gets to the failed 
> bookie in the schedule, tries to write, fails, selects a new bookie and tries 
> to update ledger metadata.
> Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when 
> done
> Also, uncomment addEntry in 
> TestFencing#testFencingInteractionWithBookieRecovery()

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
