[ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15449794#comment-15449794 ]
Enis Soztutar commented on HBASE-14417:
---------------------------------------

We should use a design similar to "HFile replication" (HBASE-13153) for this issue. Replication of hfiles is conceptually very similar to incremental bulk load, and we already have the tools. Please read the design doc and corresponding code. For this issue, we can mainly do the following:
- Similar to replication, every time a bulk load (BL) happens, we create a reference per incremental backup as part of the BL process. These references can be saved in the hbase:backup table.
- A custom hfile cleaner (like the ReplicationHFileCleaner) will monitor the bulk-loaded hfiles and make sure they are not deleted, even after compactions etc., while there is still an incremental backup to be performed against them.
- In the next round, incremental backup will also copy the bulk-loaded files by referring to the references saved in the hbase:backup table.

> Incremental backup and bulk loading
> -----------------------------------
>
>                 Key: HBASE-14417
>                 URL: https://issues.apache.org/jira/browse/HBASE-14417
>             Project: HBase
>          Issue Type: New Feature
>    Affects Versions: 2.0.0
>            Reporter: Vladimir Rodionov
>            Assignee: Vladimir Rodionov
>            Priority: Critical
>              Labels: backup
>             Fix For: 2.0.0
>
>
> Currently, incremental backup is based on WAL files. Bulk data loading
> bypasses WALs for obvious reasons, breaking incremental backups. The only way
> to continue backups after bulk loading is to create a new full backup of the
> table. This may not be feasible for customers who do bulk loading regularly
> (say, every day).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
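The reference-tracking and cleaner interplay described in the comment above can be sketched as follows. This is a minimal illustrative sketch, not the actual HBase implementation: the class and method names (`BulkLoadBackupTracker`, `addReference`, `clearReference`, `isFileDeletable`) are hypothetical, and the in-memory map stands in for the references that would really live in the hbase:backup table. A real cleaner would extend HBase's BaseHFileCleanerDelegate and consult that table.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the proposed design: track which incremental
// backups still need each bulk-loaded hfile, and only allow deletion
// once no backup references it. In the real design the references are
// persisted in the hbase:backup table, not held in memory.
public class BulkLoadBackupTracker {

  // hfile path -> ids of incremental backups that still need to copy it
  private final Map<String, Set<String>> pendingRefs = new HashMap<>();

  // Called as part of the bulk-load process: record a reference so the
  // next incremental backup knows it must copy this hfile.
  public void addReference(String hfilePath, String backupId) {
    pendingRefs.computeIfAbsent(hfilePath, k -> new HashSet<>()).add(backupId);
  }

  // Called once an incremental backup has copied the hfile.
  public void clearReference(String hfilePath, String backupId) {
    Set<String> refs = pendingRefs.get(hfilePath);
    if (refs != null) {
      refs.remove(backupId);
      if (refs.isEmpty()) {
        pendingRefs.remove(hfilePath);
      }
    }
  }

  // The cleaner-side check (analogous to ReplicationHFileCleaner's role):
  // an archived hfile is deletable only when no backup still references it.
  public boolean isFileDeletable(String hfilePath) {
    return !pendingRefs.containsKey(hfilePath);
  }

  public static void main(String[] args) {
    BulkLoadBackupTracker tracker = new BulkLoadBackupTracker();
    tracker.addReference("/archive/t1/cf/hfile-0001", "backup_001");
    // Still referenced by a pending incremental backup: must be retained
    // even if a compaction has moved it to the archive.
    System.out.println(tracker.isFileDeletable("/archive/t1/cf/hfile-0001")); // false
    tracker.clearReference("/archive/t1/cf/hfile-0001", "backup_001");
    // Backup has copied it; the cleaner may now let it be deleted.
    System.out.println(tracker.isFileDeletable("/archive/t1/cf/hfile-0001")); // true
  }
}
```

The key design point this models is the second bullet: deletion is gated on outstanding backup references, so compaction archiving a bulk-loaded hfile does not break the next incremental backup.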