[ 
https://issues.apache.org/jira/browse/HBASE-14142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15319245#comment-15319245
 ] 

Enis Soztutar commented on HBASE-14142:
---------------------------------------

Duplicate KVs with the same seqId are not an issue. We have the same behavior
in replication (where replication can end up writing the same edit more than
once). Increment and Append are user-level operations. The WAL and HFiles never
contain non-idempotent operations like Increment. Both do a get + put, and the
resulting Put is written to the WAL, not an "Increment" operation.
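
To illustrate the point, here is a minimal sketch in plain Java. It is a toy
model, not HBase code: ToyRegion, PutEdit and replay are hypothetical names
made up for this example, and the store keeps only a single value per row
rather than versioned cells. It assumes only what is stated above, that an
increment is resolved to a get + put and that the resulting Put is what gets
logged.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model (not HBase code): replaying a log of resolved Puts is idempotent. */
public class IdempotentReplaySketch {

    /** A log entry as modeled here: always a resolved Put, never an "Increment". */
    static final class PutEdit {
        final String row;
        final long value;
        final long seqId;

        PutEdit(String row, long value, long seqId) {
            this.row = row;
            this.value = value;
            this.seqId = seqId;
        }
    }

    /** Hypothetical stand-in for a region with an in-memory store and a log. */
    static final class ToyRegion {
        final Map<String, Long> store = new HashMap<>();
        final List<PutEdit> wal = new ArrayList<>();
        long nextSeqId = 1;

        /** User-level increment: get the current value, add, then log and apply a Put. */
        long increment(String row, long delta) {
            long current = store.getOrDefault(row, 0L);
            PutEdit edit = new PutEdit(row, current + delta, nextSeqId++);
            wal.add(edit);
            store.put(edit.row, edit.value);
            return edit.value;
        }
    }

    /** Replays edits in log order; each Put overwrites, so repeats change nothing. */
    static void replay(List<PutEdit> wal, Map<String, Long> target) {
        for (PutEdit edit : wal) {
            target.put(edit.row, edit.value);
        }
    }

    public static void main(String[] args) {
        ToyRegion region = new ToyRegion();
        region.increment("r1", 5);
        region.increment("r1", 3);

        Map<String, Long> restored = new HashMap<>();
        replay(region.wal, restored);
        replay(region.wal, restored); // duplicate replay, as in restore or replication
        System.out.println(restored.get("r1")); // prints 8
    }
}
```

If the log stored the deltas (+5, +3) instead of the resolved values, the
second replay would produce 16 rather than 8, which is the case the comment
says cannot occur.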

> HBase Backup/Restore Phase 3: Edits deduplication during backup
> ---------------------------------------------------------------
>
>                 Key: HBASE-14142
>                 URL: https://issues.apache.org/jira/browse/HBASE-14142
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Vladimir Rodionov
>            Assignee: Vladimir Rodionov
>
> Since we do not record the last backed-up sequence ids (MVCC) and do not
> restore up to that sequence id (which is kind of tricky), there will be some
> duplicate KVs in store files after the first incremental restore following a
> full backup. These duplicates result from how we do the full backup and the
> first incremental backup after it. During a full backup we perform a
> distributed log roll and record, for every RS, the last WAL timestamp; then we
> take a snapshot. The next WAL after the recorded one will make it into the
> next incremental backup set, but it will contain some edits (puts, deletes)
> that were already captured by the preceding snapshot. During restore we first
> restore the snapshot and then replay the WALs, and this operation can create
> duplicate KVs in different store files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
