[ https://issues.apache.org/jira/browse/HBASE-10466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yunfan Zhong updated HBASE-10466:
---------------------------------
Summary: Bugs that make flushes skipped during HRegion close could cause
data loss (was: Bugs that cause flushes being skipped during HRegion close
could cause data loss)
> Bugs that make flushes skipped during HRegion close could cause data loss
> -------------------------------------------------------------------------
>
> Key: HBASE-10466
> URL: https://issues.apache.org/jira/browse/HBASE-10466
> Project: HBase
> Issue Type: Bug
> Components: regionserver
> Affects Versions: 0.89-fb
> Reporter: Yunfan Zhong
> Priority: Critical
> Fix For: 0.89-fb
>
> Attachments:
> Fix-bugs-that-causes-flushes-being-skipped-during-re.patch
>
>
> During region close, there are two flushes to ensure nothing is left only in
> memory. When there is data in the current memstore only, 1 flush is enough.
> When there is also data in the memstore's snapshot, 2 flushes are required,
> otherwise we lose data. However, we recently found two bugs that cause at
> least 1 flush to be skipped, which led to data loss.
> Bug 1: Wrong calculation of HRegion.memstoreSize
> When a flush fails, the data to be flushed is kept in each MemStore's
> snapshot and waits for the next flush attempt to continue with it. But when
> the next flush succeeds, the counter of total memstore size in HRegion is
> always reduced by the sum of the current memstore sizes instead of the sizes
> of the snapshots left over from the previous failed flush. As a result,
> almost every time a flush fails, HRegion.memstoreSize gets reduced by a
> wrong value. If the region flush cannot proceed for a couple of cycles, the
> data in the current memstore can be much larger than the snapshot, so
> memstoreSize is likely to drift much lower than the real value. In the
> extreme case, if the accumulated error exceeds HRegion's memstore size
> limit, any further flush is skipped because flush does nothing if
> memstoreSize is not larger than 0.
> When the region is closing, if the two flushes get skipped and leave data in
> the current memstore and/or snapshot, we could lose data up to the memstore
> size limit of the region.
> The fix is to deduct the correct size of the data that is going to be
> flushed from memstoreSize.
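
The accounting drift in Bug 1 can be illustrated with a small Java sketch. This is a toy single-store model written for this note, not the actual HRegion code or the attached patch: ToyRegion, put, flush, simulate and all byte counts are hypothetical, and the buggy branch only assumes the behaviour described above (the counter is reduced by the current memstore size rather than by the snapshot that was actually flushed).

```java
// Toy model of the Bug 1 accounting; NOT the real HRegion/MemStore code.
final class MemstoreAccountingSketch {
    static final class ToyRegion {
        long memstoreSize;   // models the HRegion.memstoreSize counter
        long activeBytes;    // bytes in the current memstore
        long snapshotBytes;  // bytes in the snapshot awaiting flush
        final boolean fixed; // true = deduct what was actually flushed

        ToyRegion(boolean fixed) { this.fixed = fixed; }

        void put(long bytes) {
            activeBytes += bytes;
            memstoreSize += bytes;
        }

        void flush(boolean writeSucceeds) {
            if (memstoreSize <= 0) {
                return; // flush does nothing when the counter is not positive
            }
            long activeAtStart = activeBytes;
            if (snapshotBytes == 0) {
                // No leftover snapshot: move the current memstore into it.
                snapshotBytes = activeBytes;
                activeBytes = 0;
            }
            if (!writeSucceeds) {
                return; // failed flush: the snapshot stays for the next attempt
            }
            long flushed = snapshotBytes;
            snapshotBytes = 0;
            // Buggy variant deducts the current memstore size seen at flush
            // start; fixed variant deducts the size that was really flushed.
            memstoreSize -= fixed ? flushed : activeAtStart;
        }

        long unflushedBytes() { return activeBytes + snapshotBytes; }
    }

    /** Returns {memstoreSize, unflushedBytes} after a simulated region close. */
    static long[] simulate(boolean fixed) {
        ToyRegion r = new ToyRegion(fixed);
        r.put(10);
        r.flush(false); // failed flush leaves 10 bytes in the snapshot
        r.put(100);
        r.flush(true);  // flushes the old snapshot only
        r.flush(true);  // flushes the 100 bytes
        r.put(50);
        r.flush(true);  // region close: first flush
        r.flush(true);  // region close: second flush
        return new long[] { r.memstoreSize, r.unflushedBytes() };
    }

    public static void main(String[] args) {
        long[] buggy = simulate(false);
        long[] fixedRun = simulate(true);
        System.out.println("buggy: counter=" + buggy[0] + " unflushed=" + buggy[1]);
        System.out.println("fixed: counter=" + fixedRun[0] + " unflushed=" + fixedRun[1]);
    }
}
```

In this toy run the buggy variant ends with a negative counter, so both close-time flushes are skipped and the last 50 bytes never leave memory. The fixed variant ends with the counter at 0 and nothing unflushed.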
> Bug 2: Conditions for the first flush of region close (so-called pre-flush)
> If memstoreSize is smaller than a certain value, or a flush is already
> ongoing when region close starts, the first flush is skipped and only the
> second flush takes place. However, two flushes are required in case a
> previous flush failed and left some data in the snapshot. The bug could
> cause loss of the data in the current memstore.
> The fix is to remove all conditions except the abort check, so that region
> close always performs 2 flushes.
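
The effect of the pre-flush conditions in Bug 2 can be sketched the same way in Java. Again this is a toy model under stated assumptions, not the real close path: RegionCloseSketch, flushOnce, close, bytesLeftAfterClose and the threshold value are hypothetical, and each flush pass is assumed to persist only the snapshot, so a leftover snapshot is written before the current memstore gets its turn.

```java
// Toy model of the Bug 2 close path; NOT the real HRegion.close() code.
final class RegionCloseSketch {
    long activeBytes;   // bytes in the current memstore
    long snapshotBytes; // bytes left in the snapshot by a failed flush

    /** One flush pass: persists the snapshot only (assumed to succeed here). */
    void flushOnce() {
        if (snapshotBytes == 0) {
            // No leftover snapshot: the current memstore becomes the snapshot.
            snapshotBytes = activeBytes;
            activeBytes = 0;
        }
        snapshotBytes = 0; // snapshot written out
    }

    /** Closes the toy region and returns the bytes still only in memory. */
    long close(boolean abort, boolean flushInProgress,
               long preFlushThreshold, boolean fixed) {
        long total = activeBytes + snapshotBytes;
        // Buggy variant: pre-flush is skipped for small regions or when a
        // flush is ongoing. Fixed variant: only the abort check remains.
        boolean preFlush = fixed
            ? !abort
            : (!abort && !flushInProgress && total > preFlushThreshold);
        if (preFlush) {
            flushOnce();
        }
        if (!abort) {
            flushOnce(); // second flush
        }
        return activeBytes + snapshotBytes;
    }

    /** Scenario: 10 bytes stuck in the snapshot, 5 bytes in the memstore. */
    static long bytesLeftAfterClose(boolean fixed) {
        RegionCloseSketch r = new RegionCloseSketch();
        r.snapshotBytes = 10;
        r.activeBytes = 5;
        return r.close(false, false, 100, fixed);
    }

    public static void main(String[] args) {
        System.out.println("buggy close leaves " + bytesLeftAfterClose(false) + " bytes");
        System.out.println("fixed close leaves " + bytesLeftAfterClose(true) + " bytes");
    }
}
```

With the buggy conditions only one flush runs, which drains the leftover snapshot and leaves the 5 bytes of the current memstore behind. With only the abort check, both flushes run and nothing is left in memory.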
--
This message was sent by Atlassian JIRA
(v6.1.5#6160)