[jira] [Comment Edited] (HBASE-14949) Skip duplicate entries when replay WAL.

Heng Chen (JIRA) Wed, 09 Dec 2015 23:53:45 -0800

    [ 
https://issues.apache.org/jira/browse/HBASE-14949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15050250#comment-15050250
 ]


Heng Chen edited comment on HBASE-14949 at 12/10/15 7:52 AM:
-------------------------------------------------------------

I checked the current logic and found that we do not need to do anything.

Duplicate entries are already skipped when the WAL is split into per-region 
recovered edits.
Also, each WAL is named with the timestamp of its creation, so there is no need 
to use another name format.

Related code:

{code: title=WALSplitter#splitLogFile}
if (lastFlushedSequenceId >= entry.getKey().getLogSeqNum()) {
  editsSkipped++;
  continue;
}
{code}
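
A self-contained sketch of what this check does, for illustration only. The 
class and method names are hypothetical and a WAL entry is reduced to its 
sequence id; this is not the WALSplitter code itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the check in WALSplitter#splitLogFile:
// an entry is skipped when its sequence id is not greater than the
// last sequence id that was flushed.
class FlushedSeqIdFilter {

    /** Returns the sequence ids that would still be written to recovered edits. */
    static List<Long> entriesToReplay(List<Long> walSeqIds, long lastFlushedSequenceId) {
        List<Long> kept = new ArrayList<>();
        for (long seqId : walSeqIds) {
            if (lastFlushedSequenceId >= seqId) {
                continue; // already flushed, so this edit is skipped
            }
            kept.add(seqId);
        }
        return kept;
    }

    public static void main(String[] args) {
        // Entries 1..5 in the WAL, everything up to 3 already flushed.
        System.out.println(entriesToReplay(Arrays.asList(1L, 2L, 3L, 4L, 5L), 3L)); // prints [4, 5]
    }
}
```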

I think we can mark this issue as invalid.


UPDATE:

Sorry, lastFlushedSequenceId is the sequence id that has been flushed to HFile, 
so the check above does not cover duplicates between WALs.  Maybe we could skip 
duplicate entries in the WAL in the same way.
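
A minimal sketch of that idea, under stated assumptions: a WAL entry is reduced 
to a region name plus a sequence id, the old and new WAL are read in order, and 
a duplicate is an entry whose (region, sequence id) pair was already seen.  The 
Entry class and the dedup method are hypothetical, not existing HBase APIs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class WalReplayDedup {

    /** Hypothetical stand-in for a WAL entry: region name and sequence id only. */
    static final class Entry {
        final String region;
        final long seqId;

        Entry(String region, long seqId) {
            this.region = region;
            this.seqId = seqId;
        }
    }

    /**
     * Reads the WALs in order and keeps each (region, seqId) pair at most once.
     * Entries at or below the region's last flushed sequence id are skipped too.
     */
    static List<Entry> dedup(List<List<Entry>> walsInOrder, Map<String, Long> lastFlushed) {
        Map<String, Set<Long>> seen = new HashMap<>();
        List<Entry> out = new ArrayList<>();
        for (List<Entry> wal : walsInOrder) {
            for (Entry e : wal) {
                long flushed = lastFlushed.getOrDefault(e.region, -1L);
                if (flushed >= e.seqId) {
                    continue; // already flushed
                }
                Set<Long> ids = seen.computeIfAbsent(e.region, r -> new HashSet<>());
                if (!ids.add(e.seqId)) {
                    continue; // same edit already read from an earlier WAL
                }
                out.add(e);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Old WAL holds 1..5; entries 4 and 5 were rewritten into the new WAL.
        List<Entry> oldWal = new ArrayList<>();
        for (long i = 1; i <= 5; i++) oldWal.add(new Entry("r1", i));
        List<Entry> newWal = new ArrayList<>();
        for (long i = 4; i <= 7; i++) newWal.add(new Entry("r1", i));

        Map<String, Long> lastFlushed = new HashMap<>();
        lastFlushed.put("r1", 2L);

        for (Entry e : dedup(Arrays.asList(oldWal, newWal), lastFlushed)) {
            System.out.print(e.seqId + " "); // prints 3 4 5 6 7
        }
        System.out.println();
    }
}
```

Keeping a set of seen ids per region costs memory proportional to the number of 
unflushed edits; if entries of a region always arrive in increasing sequence 
order, tracking only the highest id seen would be enough.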


> Skip duplicate entries when replay WAL.
> ---------------------------------------
>
>                 Key: HBASE-14949
>                 URL: https://issues.apache.org/jira/browse/HBASE-14949
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Heng Chen
>         Attachments: HBASE-14949.patch
>
>
> As designed in HBASE-14004, there will be duplicate entries in different WALs.  
> This happens when an hflush fails: we close the old WAL at the 'acked hflushed' 
> length, then open a new WAL and write the unacked hflushed entries into it.
> So there may be some overlap between the old WAL and the new WAL.
> We should skip the duplicate entries during replay.  I think this does no harm 
> to the current logic, so maybe we can do it first. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
