[ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15563463#comment-15563463 ]

Ted Yu commented on HBASE-14417:
--------------------------------

While working on BackupHFileCleaner, the counterpart to 
ReplicationHFileCleaner, I noticed a potential impact on the server hosting 
hbase:backup: the cleaner needs up-to-date information on which hfiles are 
still referenced by incremental backups.
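
To make the concern concrete, here is a minimal sketch of the cleaner, modeled 
on ReplicationHFileCleaner. The loadHFileRefs() helper is hypothetical; it 
stands in for whatever lookup (scan of hbase:backup, or the ZooKeeper read 
described below) supplies the current reference set, and that lookup is where 
the load on the hosting server comes from.

{code:java}
import java.io.IOException;
import java.util.Collections;
import java.util.Set;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.hbase.master.cleaner.BaseHFileCleanerDelegate;

public class BackupHFileCleaner extends BaseHFileCleanerDelegate {

  @Override
  public boolean isFileDeletable(FileStatus fStat) {
    Set<String> hfileRefs;
    try {
      // Refresh the set of hfiles still referenced by incremental backups.
      // This is the per-cleaner-run lookup whose cost concerns me.
      hfileRefs = loadHFileRefs();
    } catch (IOException e) {
      // If we cannot determine the references, keep the file to be safe.
      return false;
    }
    return !hfileRefs.contains(fStat.getPath().getName());
  }

  // Hypothetical helper: would scan hbase:backup (or read znodes) for the
  // hfile names referenced by pending incremental backups.
  private Set<String> loadHFileRefs() throws IOException {
    return Collections.emptySet(); // placeholder
  }
}
{code}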

One potential approach is to store the hfile reference information in ZooKeeper.
This would also alleviate the issue mentioned above by reducing the number of 
BulkLoadDescriptors written to the hbase:backup table.
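
As a rough illustration of the ZooKeeper idea, something along these lines 
could work; the znode layout (/hbase/backup/hfile-refs/<hfile-name>) and the 
class are only a strawman, not a settled design:

{code:java}
import java.util.List;
import org.apache.hadoop.hbase.zookeeper.ZKUtil;
import org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher;
import org.apache.zookeeper.KeeperException;

public final class BackupHFileRefsZK {
  // Strawman layout: one child znode per referenced hfile.
  private static final String REFS_ZNODE = "/hbase/backup/hfile-refs";

  // Record that an hfile is still referenced by a pending incremental backup.
  public static void addRef(ZooKeeperWatcher zkw, String hfileName)
      throws KeeperException {
    ZKUtil.createWithParents(zkw, REFS_ZNODE + "/" + hfileName);
  }

  // Drop the reference once the backup has copied the hfile.
  public static void removeRef(ZooKeeperWatcher zkw, String hfileName)
      throws KeeperException {
    ZKUtil.deleteNodeRecursively(zkw, REFS_ZNODE + "/" + hfileName);
  }

  // The cleaner would read this list instead of scanning hbase:backup.
  public static List<String> listRefs(ZooKeeperWatcher zkw)
      throws KeeperException {
    return ZKUtil.listChildrenNoWatch(zkw, REFS_ZNODE);
  }
}
{code}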

Any suggestions, [~mbertozzi] [~vrodionov] ?

> Incremental backup and bulk loading
> -----------------------------------
>
>                 Key: HBASE-14417
>                 URL: https://issues.apache.org/jira/browse/HBASE-14417
>             Project: HBase
>          Issue Type: New Feature
>    Affects Versions: 2.0.0
>            Reporter: Vladimir Rodionov
>            Assignee: Ted Yu
>            Priority: Critical
>              Labels: backup
>             Fix For: 2.0.0
>
>         Attachments: 14417.v1.txt, 14417.v2.txt, 14417.v6.txt
>
>
> Currently, incremental backup is based on WAL files. Bulk data loading 
> bypasses WALs for obvious reasons, breaking incremental backups. The only way 
> to continue backups after bulk loading is to create a new full backup of the 
> table. This may not be feasible for customers who do bulk loading regularly 
> (say, every day).
> Google doc for design:
> https://docs.google.com/document/d/1ACCLsecHDvzVSasORgqqRNrloGx4mNYIbvAU7lq5lJE



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
