[ 
https://issues.apache.org/jira/browse/HBASE-3481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12987083#action_12987083
 ] 

Kannan Muthukkaruppan commented on HBASE-3481:
----------------------------------------------

Dug some more... the bad MAX_SEQ_ID that the HFile got stamped with corresponds 
to an edit that looks special.

{code}
<METAROW/METAFAMILY:/1296023308559/Put/vlen=17; >
{code}

Another interesting thing is that in the  .logs, this seq id  appears out of 
order...

{code}
...
#1081, pos=6403894 
kannan1/kannan1,,1296023211936.dfe45b39b1d8c360b8e7f554b576ab17./71170147=[#edits:
 18 = 
<0992bdb67af8d2b2172618496243433c:287686/actions:0\/1296023308555/Put/vlen=597; 
0992bdb67af8d2b2172618496243433c:287686/actions:1/1296023308555/Put/vlen=416; 
0992bdb67af8d2b2172618496243433c:287686/actions:2\/1296023308555/Put/vlen=436; 
0992bdb67af8d2b2172618496243433c:287686/actions:3/1296023308555/Put/vlen=329; 
0992bdb67af8d2b2172618496243433c:287686/actions:4\/1296023308555/Put/vlen=256; 
0992bdb67af8d2b2172618496243433c:287686/actions:5/1296023308555/Put/vlen=792; 
0992bdb67af8d2b2172618496243433c:287686/actions:6\/1296023308555/Put/vlen=231; 
0992bdb67af8d2b2172618496243433c:287686/actions:7/1296023308555/Put/vlen=790; 
0992bdb67af8d2b2172618496243433c:287686/actions:8\/1296023308555/Put/vlen=800; 
0992bdb67af8d2b2172618496243433c:287686/actions:9/1296023308555/Put/vlen=664; 
0992bdb67af8d2b2172618496243433c:287686/actions:10\/1296023308555/Put/vlen=618; 
0992bdb67af8d2b2172618496243433c:287686/actions:11/1296023308555/Put/vlen=533; 
0992bdb67af8d2b2172618496243433c:287686/actions:12\/1296023308555/Put/vlen=636; 
0992bdb67af8d2b2172618496243433c:287686/actions:13/1296023308555/Put/vlen=554; 
0992bdb67af8d2b2172618496243433c:287686/actions:14\/1296023308555/Put/vlen=353; 
0992bdb67af8d2b2172618496243433c:287686/actions:15/1296023308555/Put/vlen=307; 
0992bdb67af8d2b2172618496243433c:287686/actions:16\/1296023308555/Put/vlen=410; 
0992bdb67af8d2b2172618496243433c:287686/actions:17/1296023308555/Put/vlen=579; 
>]

#1082, pos=6404082 
kannan1/kannan1,bbbbbbbb,1296023211937.0a3f2362b0264b0192c311f10c006d84./71169066{=[#edits:
 1 = <METAROW/METAFAMILY:/1296023308559/Put/vlen=17; >]

#1083, pos=6406845 
kannan1/kannan1,99999999,1296023211936.8ff83822490b1f8175dc16aaf4c170a6./71170148=[#edits:
 4 = 
<a2adc743be4a350b7009af5143acfbf3:287685/actions:0/1296023308555/Put/vlen=552; 
a2adc743be4a35\
0b7009af5143acfbf3:287685/actions:1/1296023308555/Put/vlen=715; 
a2adc743be4a350b7009af5143acfbf3:287685/actions:2/1296023308555/Put/vlen=806; 
a2adc743be4a350b7009af5143acfbf3:287685/actions:3/1296023308555/P\
ut/vlen=296; >]
{code}


The flushed file's MAX_SEQ_ID is recorded as: 71169066 (corresponding to entry 
#1082 above). However that edit is not a regular put  key, but this special row 
<METAROW/METAFAMILY:/1296023308559/Put/vlen=17; >]

Also, notice out the sequence ids for edits surrounding it(entries #1081 & 
#1083) are both higher (namely, 71170147 & 71170148).

Looks like this sequence id is being obtained outside of the updatesLock. But 
haven't confirmed that in the code.


> max seq id in flushed file can be larger than its correct value causing data 
> loss during recovery
> -------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-3481
>                 URL: https://issues.apache.org/jira/browse/HBASE-3481
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Kannan Muthukkaruppan
>            Priority: Critical
>
> [While doing some cluster kill tests, I noticed some missing data after log 
> recovery. Upon investigating further, and pretty printing contents of HFiles 
> and recovered logs, this is my analysis of the situation/bug. Please confirm 
> the theory and pitch in with suggestions.]
> When memstores are flushed, the max sequence id recorded in  the HFile should 
> be the max sequence id of all KVs in the memstore. However, we seem to simply 
> obtain the current sequence id from the HRegion, and stamp the HFile's 
> MAX_SEQ_ID with it.
> From HRegion.java:
> {code}
>     sequenceId = (wal == null)? myseqid: wal.startCacheFlush();
> {code}
> where, startCacheFlush() is:
> {code}
> public long startCacheFlush() {
>     this.cacheFlushLock.lock();
>     return obtainSeqNum();
>  }
> {code}
> where, obtainSeqNum() is simply: 
> {code}
> private long obtainSeqNum() {
>     return this.logSeqNum.incrementAndGet();
>   }
> {code}
> So let's say a memstore contains edits with sequence number 1..10.
> Meanwhile, say more Puts come along, and are going through this flow (in 
> pseudo-code)
> {code}
> 1. HLog.append();
>        1.1  obtainSeqNum()
>        1.2 writeToWAL()
> 2 updateMemStore()
> {code}
> So it is possible that the sequence number has already been incremented to 
> say 15 if there are 5 more outstanding puts. Say the writeToWAL() is still in 
> progress for these puts. In this case, none of these edits (11..15) would 
> have been written to memstore yet.
> At this point if a cache flush of the memstore happens, then we'll record its 
> MAX_SEQ_ID as 16 in the store file instead of 10 (because that's what 
> obtainSeqNum() would return as the next sequence number to use, right?).
> Assume that the edits 11..15 eventually complete. And so HLogs do contain the 
> data for edits 11..15.
> Now, at this point if the region server were to crash, and we run log 
> recovery, the splits all go through correctly, and a correct recovered.edits 
> file is generated with the edits 11..15. 
> Next, when the region is opened, the HRegion notes that one of the store file 
> says MAX_SEQ_ID is 16. So, when it replays the recovered.edits file, it  
> skips replaying edits 11..15. Or in other words, data loss.
> ----

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to