Yixue (Andrew) Zhu created BOOKKEEPER-447:
---------------------------------------------

             Summary: Bookie can fail to recover if index pages flushed before 
ledger flush acknowledged
                 Key: BOOKKEEPER-447
                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-447
             Project: Bookkeeper
          Issue Type: Bug
          Components: bookkeeper-server
    Affects Versions: 4.2.0
            Reporter: Yixue (Andrew) Zhu
             Fix For: 4.2.0


Bookie index page stealing (LedgerCacheImpl::grabCleanPage) can cause the index 
file to reflect unacknowledged entries (because of flushLedger). If the 
Bookkeeper server crashes before the ledger and its entries are flushed, ledger 
recovery cannot use that bookie afterward, because 
InterleavedStorageLedger::getEntry throws an IOException.
If all bookies in the ackSet experience this problem (DC environment), the 
ledger cannot be recovered.
The problem is essentially a violation of write-ahead logging (WAL). One 
reasonable fix is to track ledger flush progress (either per ledger entry or per 
topic message) and not flush index pages that track entries whose ledger (log) 
has not yet been flushed.
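
The proposed fix can be illustrated with a minimal sketch. This is not 
BookKeeper code: the class WalIndexGate and its methods are hypothetical names 
chosen for illustration, and tracking is simplified to one high-water mark per 
ledger rather than per index page. It assumes entry IDs start at 0 and that the 
log flush reports the highest entry ID made durable.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch (not part of BookKeeper): allows an index flush for a
// ledger only after the log holding the indexed entries has been flushed.
class WalIndexGate {
    // Highest entry ID per ledger whose log write is known to be durable.
    private final Map<Long, Long> durableEntry = new HashMap<>();
    // Highest entry ID per ledger referenced by dirty in-memory index pages.
    private final Map<Long, Long> indexedEntry = new HashMap<>();

    // Called when an entry's position is recorded in the in-memory index.
    synchronized void recordIndexed(long ledgerId, long entryId) {
        indexedEntry.merge(ledgerId, entryId, Long::max);
    }

    // Called after the log has been flushed up to and including entryId.
    synchronized void recordLogFlushed(long ledgerId, long entryId) {
        durableEntry.merge(ledgerId, entryId, Long::max);
    }

    // WAL rule: the index may reach disk only if every entry it references
    // is already durable in the log. -1 means "nothing recorded yet".
    synchronized boolean canFlushIndex(long ledgerId) {
        long indexed = indexedEntry.getOrDefault(ledgerId, -1L);
        long durable = durableEntry.getOrDefault(ledgerId, -1L);
        return indexed <= durable;
    }

    public static void main(String[] args) {
        WalIndexGate gate = new WalIndexGate();
        gate.recordIndexed(7L, 3L);                 // index knows entries up to 3
        System.out.println(gate.canFlushIndex(7L)); // log not flushed yet
        gate.recordLogFlushed(7L, 3L);              // log now durable up to 3
        System.out.println(gate.canFlushIndex(7L)); // index flush is now safe
    }
}
```

Under this scheme, a page-steal path such as grabCleanPage would consult the 
gate and skip (or first force a log flush for) any dirty page whose ledger is 
not yet safe to flush. A real implementation would likely need per-page 
tracking to avoid blocking unrelated pages of the same ledger.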
