[ https://issues.apache.org/jira/browse/HBASE-6065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13280731#comment-13280731 ]
chunhui shen commented on HBASE-6065:
-------------------------------------
Suppose region A is hosted on regionserver B.
The issue can be reproduced with the following steps:
1. Put one record to region A (appends seq 1 to the hlog).
2. Put one record to region A (appends seq 2 to the hlog).
3. Region A starts a flush and calls HLog#startCacheFlush (the current seq num
in the hlog is now 3).
4. Put one record to region A (appends seq 4 to the hlog).
5. Region A completes the flush and calls HLog#completeCacheFlush (appends
seq 3 to the hlog).
6. Kill regionserver B.
So the hlog file contains four edits, in this order:
seq 1
seq 2
seq 4
seq 3
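
The ordering above can be modeled with a small sketch. This is a toy model, not HBase code: ToyHLog and its methods are hypothetical names, and it assumes only what the steps describe, namely that startCacheFlush reserves a sequence id without appending and that completeCacheFlush later appends under that reserved id.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Toy model of the hlog (hypothetical, not HBase code): it tracks sequence ids only.
class ToyHLog {
    private final AtomicLong logSeqNum = new AtomicLong(0);
    private final List<Long> appended = new ArrayList<>();

    // A put obtains the next sequence id and appends it immediately.
    long put() {
        long seq = logSeqNum.incrementAndGet();
        appended.add(seq);
        return seq;
    }

    // Models HLog#startCacheFlush: reserves a sequence id but appends nothing yet.
    long startCacheFlush() {
        return logSeqNum.incrementAndGet();
    }

    // Models HLog#completeCacheFlush: appends the flush edit under the reserved id.
    void completeCacheFlush(long reservedSeq) {
        appended.add(reservedSeq);
    }

    List<Long> appendedSeqIds() {
        return new ArrayList<>(appended);
    }

    // Replays steps 1 to 5 and returns the sequence ids in file order.
    static List<Long> reproduce() {
        ToyHLog log = new ToyHLog();
        log.put();                              // step 1: appends seq 1
        log.put();                              // step 2: appends seq 2
        long flushSeq = log.startCacheFlush();  // step 3: reserves seq 3
        log.put();                              // step 4: appends seq 4
        log.completeCacheFlush(flushSeq);       // step 5: appends seq 3 last
        return log.appendedSeqIds();
    }
}
```

In this model, ToyHLog.reproduce() yields the ids in the order 1, 2, 4, 3, so the last edit in the file is not the one with the highest sequence id.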
When this hlog file is split, the recovered.edits file generated for region A
is named 3, after the last edit in the file (for the naming, see
HLogSplitter#splitLogFileToTemp).
When that recovered.edits file is replayed for region A, its name is not
greater than the region's minimum sequence id after the flush, so the whole
file is skipped and the edit with seq 4 is lost.
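
A minimal sketch of why the file is skipped, under stated assumptions: the class and method names are hypothetical, the file name is taken from the last edit written as described above, the skip condition is the `maxSeqId <= minSeqId` check quoted in the issue description, and region A's minimum sequence id after the flush is assumed to be 3.

```java
import java.util.Collections;
import java.util.List;

// Toy model of the split-then-replay decision (hypothetical names, not HBase code).
class RecoveredEditsSkipDemo {
    // Models the naming described above: the file is named after the LAST edit written.
    static long fileNameFromLastEdit(List<Long> seqIdsInFileOrder) {
        return seqIdsInFileOrder.get(seqIdsInFileOrder.size() - 1);
    }

    // Mirrors the quoted check: a file whose name is <= minSeqId is skipped whole.
    static boolean isSkipped(long fileNameSeqId, long minSeqId) {
        return fileNameSeqId <= minSeqId;
    }

    // The real maximum sequence id contained in the file, for comparison.
    static long actualMaxSeqId(List<Long> seqIdsInFileOrder) {
        return Collections.max(seqIdsInFileOrder);
    }
}
```

With the edits in the order 1, 2, 4, 3, the file name is 3 while the real maximum is 4. With an assumed minSeqId of 3 the file is skipped, even though the edit with seq 4 has not been flushed.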
> Log for flush would append a non-sequential edit in the hlog, may cause data
> loss
> ---------------------------------------------------------------------------------
>
> Key: HBASE-6065
> URL: https://issues.apache.org/jira/browse/HBASE-6065
> Project: HBase
> Issue Type: Bug
> Components: wal
> Reporter: chunhui shen
> Assignee: chunhui shen
> Attachments: HBASE-6065.patch
>
>
> After a region flush completes, we append a log edit to the hlog file
> through HLog#completeCacheFlush.
> {code}
> public void completeCacheFlush(final byte [] encodedRegionName,
>     final byte [] tableName, final long logSeqId, final boolean isMetaRegion)
> {
>   ...
>   HLogKey key = makeKey(encodedRegionName, tableName, logSeqId,
>       System.currentTimeMillis(), HConstants.DEFAULT_CLUSTER_ID);
>   ...
> }
> {code}
> When we make the hlog key, we use the seqId passed in as a parameter, which
> was generated by HLog#startCacheFlush.
> As a result, we may append an edit with a lower seq id than the last edit in
> the hlog file.
> If it is the last edit in the file, it may cause data loss, because:
> {code}
> HRegion#replayRecoveredEditsIfAny{
>   ...
>   maxSeqId = Math.abs(Long.parseLong(fileName));
>   if (maxSeqId <= minSeqId) {
>     String msg = "Maximum sequenceid for this log is " + maxSeqId
>         + " and minimum sequenceid for the region is " + minSeqId
>         + ", skipped the whole file, path=" + edits;
>     LOG.debug(msg);
>     continue;
>   }
>   ...
> }
> {code}
> We may skip the split log file, because we use the last edit's seq id as
> its file name and treat that seqId as the max seq id in the log file.