[ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15449794#comment-15449794 ]

Enis Soztutar commented on HBASE-14417:
---------------------------------------

We should use a design similar to the "HFile replication" feature added in 
HBASE-13153 for this issue. Replication of hfiles is conceptually very similar 
to incremental backup of bulk-loaded data, and we already have the tools. Please 
read the design doc and the corresponding code. 

For this issue, we can mainly do the following: 
 - Similar to replication, every time a bulk load (BL) happens, we create a 
reference per incremental backup as part of the BL process. These references 
can be saved in the hbase:backup table. 
 - A custom hfile cleaner (like the ReplicationHFileCleaner) will monitor the 
bulk-loaded hfiles and make sure that they are not deleted (even after 
compactions, etc.) while there is still an incremental backup to be performed; 
see the sketch after this list. 
 - In the next round, incremental backup will also copy the files that came 
from BL, by referring to the references saved in the hbase:backup table. 
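
Roughly, the cleaner would look like the sketch below. This is only a minimal 
sketch: BaseHFileCleanerDelegate, isFileDeletable() and the 
hbase.master.hfilecleaner.plugins hook are the existing cleaner plugin pieces, 
while BackupReferenceStore / loadPendingBulkLoadRefs() are hypothetical 
placeholders for however we end up reading the bulk-load references back out of 
the hbase:backup table. 

{code:java}
// Minimal sketch of a backup-aware hfile cleaner. BackupReferenceStore is a
// hypothetical helper standing in for the code that reads bulk-load
// references out of the hbase:backup table.
import java.util.Collections;
import java.util.Set;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.hbase.master.cleaner.BaseHFileCleanerDelegate;

public class BackupAwareHFileCleaner extends BaseHFileCleanerDelegate {

  private Set<String> referencedHFiles = Collections.emptySet();

  @Override
  public void setConf(Configuration conf) {
    super.setConf(conf);
    // Hypothetical: load the hfile names that were recorded in hbase:backup
    // at bulk-load time and still have a pending incremental backup.
    this.referencedHFiles = BackupReferenceStore.loadPendingBulkLoadRefs(conf);
  }

  @Override
  public boolean isFileDeletable(FileStatus fStat) {
    // Keep a bulk-loaded hfile (even after it is archived by a compaction)
    // as long as an incremental backup still needs to copy it.
    return !referencedHFiles.contains(fStat.getPath().getName());
  }
}
{code}

Registering the cleaner via hbase.master.hfilecleaner.plugins (the same way 
ReplicationHFileCleaner is registered) should be enough to keep the archived 
bulk-loaded hfiles around until the next incremental backup has copied them. 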

> Incremental backup and bulk loading
> -----------------------------------
>
>                 Key: HBASE-14417
>                 URL: https://issues.apache.org/jira/browse/HBASE-14417
>             Project: HBase
>          Issue Type: New Feature
>    Affects Versions: 2.0.0
>            Reporter: Vladimir Rodionov
>            Assignee: Vladimir Rodionov
>            Priority: Critical
>              Labels: backup
>             Fix For: 2.0.0
>
>
> Currently, incremental backup is based on WAL files. Bulk data loading 
> bypasses WALs for obvious reasons, breaking incremental backups. The only way 
> to continue backups after bulk loading is to create a new full backup of the 
> table. This may not be feasible for customers who do bulk loading regularly 
> (say, every day).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)