[ 
https://issues.apache.org/jira/browse/HBASE-10829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13946911#comment-13946911
 ] 

Enis Soztutar commented on HBASE-10829:
---------------------------------------

Here is a log of events in case you are interested. The region 
(9514935e6a659bd90faa21bf458a842e) was happily hosted by some region server. 
After the writes have settled down, the region had some un-flushed data. The 
last flush happened, and after some time, the write tasks finished, so no more 
data was coming in for some time:
{code}
2014-03-24 20:54:30,924 INFO  [Thread-22] regionserver.HRegion: Finished 
memstore flush of ~128.2 M/134443296, currentsize=12.7 M/13270608 for region 
IntegrationTestBigLinkedList,\x07\xFE\xDA\x1Chv\xF9\x7F\x18s\xEE\x0C\x85X\xFCU,1395690539958.9514935e6a659bd90faa21bf458a842e.
 in 7324ms, sequenceid=119978, compaction requested=true
{code}

After some more time, the region decided to do a compaction. At this point no 
writes were coming.
{code}
compaction
2014-03-24 20:55:52,764 INFO  
[regionserver60020-smallCompactions-1395694311085] regionserver.HStore: 
Starting compaction of 5 file(s) in meta of 
IntegrationTestBigLinkedList,\x07\xFE\xDA\x1Chv\xF9\x7F\x18s\xEE\x0C\x85X\xFCU,1395690539958.9514935e6a659bd90faa21bf458a842e.
 into 
tmpdir=hdfs://hor9n01.gq1.ygridcore.net:8020/apps/hbase/data/data/default/IntegrationTestBigLinkedList/9514935e6a659bd90faa21bf458a842e/.tmp,
 totalSize=212.2 M
{code}

After this compaction, but before any more flush, the region server got killed 
around: 
{code}
2014-03-24 20:56:44,466 DEBUG [regionserver60020-EventThread] 
regionserver.SplitLogWorker: tasks arrived or departed
{code}

Because of the region server got killed, the cluster performed a log split, 
which completed without any issues (logs are not necessary). This resulted in 7 
log files to be split, resulting in 7 files in recovered.edits under region 
dir. 

Then, some other region server opens the region and applies the recovered edits 
in memory:
{code}
Open region:
2014-03-24 20:57:28,196 DEBUG [StoreOpener-9514935e6a659bd90faa21bf458a842e-1] 
regionserver.HStore: loaded 
hdfs://hor9n01.gq1.ygridcore.net:8020/apps/hbase/data/data/default/IntegrationTestBigLinkedList/9514935e6a659bd90faa21bf458a842e/meta/02f7152afee34b07b40fa31e0de5a3de,
 isReference=false, isBulkLoadResult=false, seqid=119978, majorCompaction=false
2014-03-24 20:57:28,240 DEBUG [StoreOpener-9514935e6a659bd90faa21bf458a842e-1] 
regionserver.HStore: loaded 
hdfs://hor9n01.gq1.ygridcore.net:8020/apps/hbase/data/data/default/IntegrationTestBigLinkedList/9514935e6a659bd90faa21bf458a842e/meta/69260a6d4ffc45a1806dd501204b73ce,
 isReference=false, isBulkLoadResult=false, seqid=88532, majorCompaction=true

2014-03-24 20:57:28,264 INFO  [RS_OPEN_REGION-hor9n08:60020-2] 
regionserver.HRegion: Replaying edits from 
hdfs://hor9n01.gq1.ygridcore.net:8020/apps/hbase/data/data/default/IntegrationTestBigLinkedList/9514935e6a659bd90faa21bf458a842e/recovered.edits/0000000000000118699
2014-03-24 20:57:28,457 DEBUG [RS_OPEN_REGION-hor9n08:60020-2] 
regionserver.HRegion: Applied 0, skipped 187830, firstSequenceidInLog=118084, 
maxSequenceidInLog=-1, 
path=hdfs://hor9n01.gq1.ygridcore.net:8020/apps/hbase/data/data/default/IntegrationTestBigLinkedList/9514935e6a659bd90faa21bf458a842e/recovered.edits/0000000000000118699
2014-03-24 20:57:28,460 INFO  [RS_OPEN_REGION-hor9n08:60020-2] 
regionserver.HRegion: Replaying edits from 
hdfs://hor9n01.gq1.ygridcore.net:8020/apps/hbase/data/data/default/IntegrationTestBigLinkedList/9514935e6a659bd90faa21bf458a842e/recovered.edits/0000000000000119351
2014-03-24 20:57:28,630 DEBUG [RS_OPEN_REGION-hor9n08:60020-2] 
regionserver.HRegion: Applied 0, skipped 199401, firstSequenceidInLog=118700, 
maxSequenceidInLog=-1, 
path=hdfs://hor9n01.gq1.ygridcore.net:8020/apps/hbase/data/data/default/IntegrationTestBigLinkedList/9514935e6a659bd90faa21bf458a842e/recovered.edits/0000000000000119351
2014-03-24 20:57:28,632 INFO  [RS_OPEN_REGION-hor9n08:60020-2] 
regionserver.HRegion: Replaying edits from 
hdfs://hor9n01.gq1.ygridcore.net:8020/apps/hbase/data/data/default/IntegrationTestBigLinkedList/9514935e6a659bd90faa21bf458a842e/recovered.edits/0000000000000120086
2014-03-24 20:57:28,873 DEBUG [RS_OPEN_REGION-hor9n08:60020-2] 
regionserver.HRegion: Applied 37938, skipped 148185, 
firstSequenceidInLog=119352, maxSequenceidInLog=120086, 
path=hdfs://hor9n01.gq1.ygridcore.net:8020/apps/hbase/data/data/default/IntegrationTestBigLinkedList/9514935e6a659bd90faa21bf458a842e/recovered.edits/0000000000000120086
2014-03-24 20:57:28,876 INFO  [RS_OPEN_REGION-hor9n08:60020-2] 
regionserver.HRegion: Replaying edits from 
hdfs://hor9n01.gq1.ygridcore.net:8020/apps/hbase/data/data/default/IntegrationTestBigLinkedList/9514935e6a659bd90faa21bf458a842e/recovered.edits/0000000000000120806
2014-03-24 20:57:30,130 DEBUG [RS_OPEN_REGION-hor9n08:60020-2] 
regionserver.HRegion: Applied 189648, skipped 0, firstSequenceidInLog=120087, 
maxSequenceidInLog=120806, 
path=hdfs://hor9n01.gq1.ygridcore.net:8020/apps/hbase/data/data/default/IntegrationTestBigLinkedList/9514935e6a659bd90faa21bf458a842e/recovered.edits/0000000000000120806
2014-03-24 20:57:30,133 INFO  [RS_OPEN_REGION-hor9n08:60020-2] 
regionserver.HRegion: Replaying edits from 
hdfs://hor9n01.gq1.ygridcore.net:8020/apps/hbase/data/data/default/IntegrationTestBigLinkedList/9514935e6a659bd90faa21bf458a842e/recovered.edits/0000000000000121436
2014-03-24 20:57:31,622 DEBUG [RS_OPEN_REGION-hor9n08:60020-2] 
regionserver.HRegion: Applied 187988, skipped 0, firstSequenceidInLog=120807, 
maxSequenceidInLog=121436, 
path=hdfs://hor9n01.gq1.ygridcore.net:8020/apps/hbase/data/data/default/IntegrationTestBigLinkedList/9514935e6a659bd90faa21bf458a842e/recovered.edits/0000000000000121436
2014-03-24 20:57:31,625 INFO  [RS_OPEN_REGION-hor9n08:60020-2] 
regionserver.HRegion: Replaying edits from 
hdfs://hor9n01.gq1.ygridcore.net:8020/apps/hbase/data/data/default/IntegrationTestBigLinkedList/9514935e6a659bd90faa21bf458a842e/recovered.edits/0000000000000122408
2014-03-24 20:57:33,850 DEBUG [RS_OPEN_REGION-hor9n08:60020-2] 
regionserver.HRegion: Applied 206643, skipped 0, firstSequenceidInLog=121437, 
maxSequenceidInLog=122408, 
path=hdfs://hor9n01.gq1.ygridcore.net:8020/apps/hbase/data/data/default/IntegrationTestBigLinkedList/9514935e6a659bd90faa21bf458a842e/recovered.edits/0000000000000122408
2014-03-24 20:57:33,860 INFO  [RS_OPEN_REGION-hor9n08:60020-2] 
regionserver.HRegion: Replaying edits from 
hdfs://hor9n01.gq1.ygridcore.net:8020/apps/hbase/data/data/default/IntegrationTestBigLinkedList/9514935e6a659bd90faa21bf458a842e/recovered.edits/0000000000000122409
2014-03-24 20:57:33,864 DEBUG [RS_OPEN_REGION-hor9n08:60020-2] 
regionserver.HStore: Completing compaction from the WAL marker  ====> we replay 
the compaction marker
2014-03-24 20:57:33,866 DEBUG [RS_OPEN_REGION-hor9n08:60020-2] 
regionserver.HRegion: Applied 0, skipped 1, firstSequenceidInLog=122409, 
maxSequenceidInLog=-1,  ====> contains only compaction record 
path=hdfs://hor9n01.gq1.ygridcore.net:8020/apps/hbase/data/data/default/IntegrationTestBigLinkedList/9514935e6a659bd90faa21bf458a842e/recovered.edits/0000000000000122409
{code}
As per above, some of the edits are skipped, and the rest is applied to 
memstore. The latest flush file had seqId=119978, and we were able to replay up 
to seqId=122409. However notice that the final log file contains ONLY a 
compaction record! This record has the seqId, but we still return -1 to the 
caller. This results in the region skipping the mandatory memstore flush BEFORE 
opening the region for writes and deleting the log files: 
{code}
try {
  seqid = replayRecoveredEdits(edits, maxSeqIdInStores, reporter);
} catch (IOException e) {
}
if (seqid > minSeqIdForTheRegion) {
  // Then we added some edits to memory. Flush and cleanup split edit files.
  internalFlushcache(null, seqid, status);
}
{code}

Now, the region is opened without the flush, and we also delete the 
recovered.edit logs thinking that we made the edits durable: 
{code}
2014-03-24 20:57:33,893 DEBUG [RS_OPEN_REGION-hor9n08:60020-2] 
regionserver.HRegion: Deleted recovered.edits 
file=hdfs://hor9n01.gq1.ygridcore.net:8020/apps/hbase/data/data/default/IntegrationTestBigLinkedList/9514935e6a659bd90faa21bf458a842e/recovered.edits/0000000000000118699
2014-03-24 20:57:33,983 DEBUG [RS_OPEN_REGION-hor9n08:60020-2] 
regionserver.HRegion: Deleted recovered.edits 
file=hdfs://hor9n01.gq1.ygridcore.net:8020/apps/hbase/data/data/default/IntegrationTestBigLinkedList/9514935e6a659bd90faa21bf458a842e/recovered.edits/0000000000000119351
2014-03-24 20:57:34,110 INFO  [RS_OPEN_REGION-hor9n08:60020-2] 
regionserver.HRegion: Onlined 9514935e6a659bd90faa21bf458a842e; next 
sequenceid=119979  ====> NO FLUSH
{code}
Notice the lack of flush above. Since the CM is still running, now this region 
server is killed before a flush happens in this region, resulting in data loss 
on the in memory data. Since recovered.edits files are also deleted, there is 
no recovery back. 

Thanks to [~devaraj] for pair-debugging this issue. 

> Flush is skipped after log replay if the last recovered edits file is skipped
> -----------------------------------------------------------------------------
>
>                 Key: HBASE-10829
>                 URL: https://issues.apache.org/jira/browse/HBASE-10829
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Enis Soztutar
>            Assignee: Enis Soztutar
>            Priority: Critical
>             Fix For: 0.99.0, 0.98.2, 0.96.3
>
>
> We caught this in an extended test run where IntegrationTestBigLinkedList 
> failed with some missing keys. 
> The problem is that HRegion.replayRecoveredEdits() would return -1 if all the 
> edits in the log file is skipped, which is true for example if the log file 
> only contains a single compaction record (HBASE-2231) or somehow the edits 
> cannot be applied (column family deleted, etc). 
> The callee, HRegion.replayRecoveredEditsIfAny() only looks for the last 
> returned seqId to decide whether a flush is necessary or not before opening 
> the region, and discarding replayed recovered edits files. 
> Therefore, if the last recovered edits file is skipped but some edits from 
> earlier recovered edits files are applied, the mandatory flush before opening 
> the region is skipped. If the region server dies after this point before a 
> flush, the edits are lost. 
> This is important to fix, though the sequence of events are super rare for a 
> production cluster. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)

Reply via email to