[ 
https://issues.apache.org/jira/browse/BOOKKEEPER-447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Kelly updated BOOKKEEPER-447:
----------------------------------

    Attachment: 0001-BOOKKEEPER-447-LedgerCacheImpl-waits-on-semaphore-no.patch

Implemented semaphore solution and added test case which triggers the problem 
without the flush. I haven't checked how it affects performance, but I think it 
will improve performance if anything, as a flush initialiated by the 
LedgerCacheImpl would interfere with an SyncThread flush, causing more disk 
head movement.
                
> Bookie can fail to recover if index pages flushed before ledger flush 
> acknowledged
> ----------------------------------------------------------------------------------
>
>                 Key: BOOKKEEPER-447
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-447
>             Project: Bookkeeper
>          Issue Type: Bug
>          Components: bookkeeper-server
>    Affects Versions: 4.2.0
>            Reporter: Yixue (Andrew) Zhu
>            Assignee: Yixue (Andrew) Zhu
>              Labels: patch
>             Fix For: 4.2.0, 4.1.1
>
>         Attachments: 
> 0001-BOOKKEEPER-447-LedgerCacheImpl-waits-on-semaphore-no.patch, 
> BOOKKEEPER-447.diff
>
>
> Bookie index page steal (LedgerCacheImpl::grabCleanPage) can cause index file 
> to reflect unacknowledged entries (due to flushLedger). Suppose ledger and 
> entry fail to flush due to Bookkeeper server crash, it will cause ledger 
> recovery not able to use the bookie afterward, due to 
> InterleavedStorageLedger::getEntry throws IOException.
> If the ackSet bookies all experience this problem (DC environment), the 
> ledger will not be able to recover.
> The problem here essentially a violation of WAL. One reasonable fix is to 
> track ledger flush progress (either per-ledger entry, or per-topic message). 
> Do not flush index pages which tracks entries whose ledger (log) has not been 
> flushed.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to