[ 
https://issues.apache.org/jira/browse/BOOKKEEPER-447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13499254#comment-13499254
 ] 

Yixue (Andrew) Zhu commented on BOOKKEEPER-447:
-----------------------------------------------

I was referring to the first cut of BOOKKEEPER-432, where a skip list is used as a 
caching layer to sort entries before they reach the entry log or index buffers.
We have run a benchmark, and its read throughput is better than that of the 
slab-based approach that Aniruddah experimented with. Some details of the 
slab-based approach may differ, though.
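
To illustrate the idea only, here is a minimal sketch of a skip-list write 
cache that keeps entries sorted by (ledgerId, entryId) until they are drained. 
It is not the BOOKKEEPER-432 patch; EntryKey, SortedEntryCache, add and 
drainInOrder are hypothetical names, and ordering by (ledgerId, entryId) is an 
assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical key type: orders entries by ledger first, then by entry id.
final class EntryKey implements Comparable<EntryKey> {
    final long ledgerId;
    final long entryId;

    EntryKey(long ledgerId, long entryId) {
        this.ledgerId = ledgerId;
        this.entryId = entryId;
    }

    @Override
    public int compareTo(EntryKey other) {
        int byLedger = Long.compare(ledgerId, other.ledgerId);
        return byLedger != 0 ? byLedger : Long.compare(entryId, other.entryId);
    }
}

// Hypothetical cache: the skip list keeps entries sorted as they arrive,
// so a later flush can write each ledger's entries contiguously.
final class SortedEntryCache {
    private final ConcurrentSkipListMap<EntryKey, byte[]> entries =
            new ConcurrentSkipListMap<>();

    void add(long ledgerId, long entryId, byte[] payload) {
        entries.put(new EntryKey(ledgerId, entryId), payload);
    }

    // Removes all cached entries and returns their keys in sorted order.
    List<EntryKey> drainInOrder() {
        List<EntryKey> order = new ArrayList<>();
        Map.Entry<EntryKey, byte[]> e;
        while ((e = entries.pollFirstEntry()) != null) {
            order.add(e.getKey());
        }
        return order;
    }
}
```

Entries added out of order come back grouped by ledger and ascending by entry 
id, which is the property a sorted caching layer relies on.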

When we experimented with thousands of active ledgers per hub, the sync thread 
took quite a hit while flushing (thousands of files). I am not sure it is a 
good idea to peg the sync interval at 100 ms. 
  
                
> Bookie can fail to recover if index pages flushed before ledger flush 
> acknowledged
> ----------------------------------------------------------------------------------
>
>                 Key: BOOKKEEPER-447
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-447
>             Project: Bookkeeper
>          Issue Type: Bug
>          Components: bookkeeper-server
>    Affects Versions: 4.2.0
>            Reporter: Yixue (Andrew) Zhu
>            Assignee: Yixue (Andrew) Zhu
>              Labels: patch
>             Fix For: 4.2.0, 4.1.1
>
>         Attachments: BOOKKEEPER-447.diff
>
>
> Bookie index page steal (LedgerCacheImpl::grabCleanPage) can cause the index 
> file to reflect unacknowledged entries (due to flushLedger). If the ledger and 
> entry fail to flush because the Bookkeeper server crashes, ledger recovery 
> cannot use the bookie afterward, because 
> InterleavedStorageLedger::getEntry throws IOException.
> If the ackSet bookies all experience this problem (DC environment), the 
> ledger will not be able to recover.
> The problem here is essentially a violation of WAL. One reasonable fix is to 
> track ledger flush progress (either per-ledger entry, or per-topic message). 
> Do not flush index pages that track entries whose ledger (log) has not been 
> flushed.
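
The fix proposed in the description above can be sketched roughly as follows. 
This is an illustration under stated assumptions, not the attached 
BOOKKEEPER-447.diff: IndexPage, WalGatedIndexFlusher and their methods are 
hypothetical names, and progress is tracked per ledger as the highest entry id 
known to be flushed.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical view of a dirty index page: the ledger it belongs to and
// the highest entry id it references.
final class IndexPage {
    final long ledgerId;
    final long maxEntryId;

    IndexPage(long ledgerId, long maxEntryId) {
        this.ledgerId = ledgerId;
        this.maxEntryId = maxEntryId;
    }
}

// Hypothetical gate enforcing the WAL rule: an index page may be flushed
// only after every entry it references has been flushed to the log.
final class WalGatedIndexFlusher {
    // Highest entry id per ledger whose log write is known to be flushed.
    private final Map<Long, Long> lastFlushedEntry = new HashMap<>();

    // Record flush progress; progress never moves backwards.
    synchronized void onEntriesFlushed(long ledgerId, long entryId) {
        lastFlushedEntry.merge(ledgerId, entryId, Long::max);
    }

    synchronized boolean canFlush(IndexPage page) {
        long flushed = lastFlushedEntry.getOrDefault(page.ledgerId, -1L);
        return page.maxEntryId <= flushed;
    }

    // Pages that are safe to write now; the rest stay dirty in memory.
    synchronized List<IndexPage> selectFlushable(List<IndexPage> dirtyPages) {
        List<IndexPage> safe = new ArrayList<>();
        for (IndexPage page : dirtyPages) {
            if (canFlush(page)) {
                safe.add(page);
            }
        }
        return safe;
    }
}
```

With a gate like this, a page-steal path would have to skip (or wait on) any 
page for which canFlush returns false, so the index file never gets ahead of 
the log.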

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
