[ 
https://issues.apache.org/jira/browse/BOOKKEEPER-447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486788#comment-13486788
 ] 

Sijie Guo commented on BOOKKEEPER-447:
--------------------------------------

{quote}
Bookie index page steal (LedgerCacheImpl::grabCleanPage) can cause the index 
file to reflect unacknowledged entries (due to flushLedger). If the ledger and 
entry then fail to flush because the BookKeeper server crashes, ledger 
recovery cannot use the bookie afterward, because 
InterleavedStorageLedger::getEntry throws IOException.
{quote}

If the entry log fails to flush, the last mark will not be rolled, so the 
entries are still in the journal. On restart they would be replayed, added to 
new entry log files, and the ledger index would be updated. I would therefore 
not expect getEntry to throw IOException. Could you describe the case in more 
detail? Is it easy to reproduce?
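
To make the replay argument concrete, here is a minimal sketch of the behaviour I am assuming. The class JournalReplaySketch and all of its members are hypothetical names for illustration only, not BookKeeper classes, and the journal, entry log and index are modelled as in-memory collections rather than files.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model, not BookKeeper code: shows why entries written after
// the last mark are recovered by journal replay.
final class JournalReplaySketch {
    // Journal records as {ledgerId, entryId}; every add is journaled first.
    final List<long[]> journal = new ArrayList<>();
    // Journal position up to which entries are durable in the entry log.
    int lastMark = 0;
    // Entry log and index as rebuilt by replay after a restart.
    final List<long[]> entryLog = new ArrayList<>();
    final Map<String, Integer> index = new HashMap<>();

    void append(long ledgerId, long entryId) {
        journal.add(new long[] {ledgerId, entryId});
    }

    // Assumed to be called only after an entry log flush has succeeded.
    void rollLastMark() {
        lastMark = journal.size();
    }

    // On restart: re-add every journal record at or after the last mark to
    // the new entry log and point the index at its new location.
    int replay() {
        int replayed = 0;
        for (int i = lastMark; i < journal.size(); i++) {
            long[] record = journal.get(i);
            entryLog.add(record);
            index.put(record[0] + ":" + record[1], entryLog.size() - 1);
            replayed++;
        }
        return replayed;
    }
}
```

In this model, a crash before rollLastMark() leaves lastMark where it was, so replay() re-adds the unflushed entries and rewrites their index positions.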
                
> Bookie can fail to recover if index pages flushed before ledger flush 
> acknowledged
> ----------------------------------------------------------------------------------
>
>                 Key: BOOKKEEPER-447
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-447
>             Project: Bookkeeper
>          Issue Type: Bug
>          Components: bookkeeper-server
>    Affects Versions: 4.2.0
>            Reporter: Yixue (Andrew) Zhu
>            Assignee: Robin Dhamankar
>              Labels: patch
>             Fix For: 4.2.0
>
>
> Bookie index page steal (LedgerCacheImpl::grabCleanPage) can cause the index 
> file to reflect unacknowledged entries (due to flushLedger). If the ledger 
> and entry then fail to flush because the BookKeeper server crashes, ledger 
> recovery cannot use the bookie afterward, because 
> InterleavedStorageLedger::getEntry throws IOException.
> If the ackSet bookies all experience this problem (DC environment), the 
> ledger will not be able to recover.
> The problem here is essentially a violation of WAL. One reasonable fix is to 
> track ledger flush progress (either per-ledger entry, or per-topic message) 
> and not flush index pages that track entries whose ledger (log) has not yet 
> been flushed.
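
The fix proposed in the description (hold back index pages until the entries they track are flushed) could be sketched roughly as follows. This is only an illustration under simplifying assumptions: one dirty index page per ledger, flush progress tracked per ledger entry, and everything kept in memory. WalGatedIndexSketch and its methods are hypothetical names, not part of LedgerCacheImpl or any other BookKeeper class.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch, not BookKeeper code: an index page may be flushed
// only once every entry it references is durable in the entry log.
final class WalGatedIndexSketch {
    // ledgerId -> highest entry id known to be flushed to the entry log.
    private final Map<Long, Long> durableUpTo = new TreeMap<>();
    // ledgerId -> highest entry id referenced by that ledger's dirty page.
    private final Map<Long, Long> dirtyPages = new TreeMap<>();

    // Record that the in-memory index page now references this entry.
    void indexUpdated(long ledgerId, long entryId) {
        dirtyPages.merge(ledgerId, entryId, Math::max);
    }

    // Record entry log flush progress for a ledger.
    void entryLogFlushed(long ledgerId, long lastEntryId) {
        durableUpTo.merge(ledgerId, lastEntryId, Math::max);
    }

    // Ledgers whose dirty page references only flushed entries; only these
    // pages would be eligible for flushing or for a page steal.
    List<Long> flushablePages() {
        List<Long> flushable = new ArrayList<>();
        for (Map.Entry<Long, Long> page : dirtyPages.entrySet()) {
            long durable = durableUpTo.getOrDefault(page.getKey(), -1L);
            if (page.getValue() <= durable) {
                flushable.add(page.getKey());
            }
        }
        return flushable;
    }
}
```

With this gate in place, the index file on disk would never point at an entry that is missing from the entry log, which is the WAL ordering the description asks for.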

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
