[
https://issues.apache.org/jira/browse/HBASE-7912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
stack updated HBASE-7912:
-------------------------
Issue Type: Sub-task (was: New Feature)
Parent: HBASE-10856
> HBase Backup/Restore Based on HBase Snapshot and FileLink
> ---------------------------------------------------------
>
> Key: HBASE-7912
> URL: https://issues.apache.org/jira/browse/HBASE-7912
> Project: HBase
> Issue Type: Sub-task
> Reporter: Richard Ding
> Assignee: Richard Ding
>
> There have been attempts in the past to come up with a viable HBase
> backup/restore solution (e.g., HBASE-4618). HBase has recently gained
> several new features, for example, FileLink, Snapshot, and Distributed
> Barrier Procedure. This is a proposal for a backup/restore solution that
> uses these new features to achieve better performance and consistency.
>
> A common practice for backup and restore in databases is to first take a
> full baseline backup, and then periodically take incremental backups that
> capture the changes since the full baseline backup. An HBase cluster can
> store a massive amount of data, so combining full backups with incremental
> backups has tremendous benefit for HBase as well. The following is a
> typical scenario for full and incremental backup.
> # The user takes a full backup of a table or a set of tables in HBase.
> # The user schedules periodic incremental backups to capture the changes
> since the full backup, or since the last incremental backup.
> # The user needs to restore table data to a past point in time.
> # The full backup is restored to the table(s) or to different table name(s).
> Then the incremental backups taken up to the desired point in time are
> applied on top of the full backup.
> We would support the following key features and capabilities.
> * Full backup uses HBase snapshot to capture HFiles.
> * Incremental backup uses HBase WALs to capture changes, while incremental
> restore uses bulk load of HFiles for speed.
> * Support single table or a set of tables, and column family level backup and
> restore.
> * Restore to different table names.
> * Support adding tables or column families (CFs) to the backup set without
> interrupting the incremental backup schedule.
> * Support rolling up (combining) incremental backups into larger incremental
> backups that cover longer periods.
> * Unified command line interface for all the above.
> The solution will support HBase backup to a FileSystem, either on the same
> cluster or across clusters. The design is flexible enough to support backup
> to other devices and servers in the future.
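
The restore scenario in the description (restore a full backup, then apply
the incremental backups up to the desired point in time) can be sketched as a
small selection routine. This is only an illustration under stated
assumptions, not the design proposed in the issue: the Backup record, its
field names, and the restore_chain function are hypothetical, and the sketch
assumes each backup is tagged with the timestamp up to which it covers
changes.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Backup:
    """Hypothetical backup record; not part of any HBase API."""
    backup_id: str
    kind: str      # "full" or "incremental"
    end_ts: int    # changes up to this timestamp are covered


def restore_chain(backups: List[Backup], target_ts: int) -> List[Backup]:
    """Pick the backups needed to restore to target_ts.

    Returns the latest full backup taken at or before target_ts, followed by
    every incremental backup taken after that full backup and at or before
    target_ts, in the order they should be applied.
    """
    # Only backups that finished at or before the target can be used.
    eligible = sorted(
        (b for b in backups if b.end_ts <= target_ts),
        key=lambda b: b.end_ts,
    )
    fulls = [b for b in eligible if b.kind == "full"]
    if not fulls:
        raise ValueError("no full backup at or before the requested time")

    base = fulls[-1]  # most recent usable full backup
    incrementals = [
        b for b in eligible
        if b.kind == "incremental" and b.end_ts > base.end_ts
    ]
    return [base] + incrementals
```

For a history of full-1 (t=100), inc-1 (t=200), inc-2 (t=300), full-2
(t=400), and inc-3 (t=500), a restore to t=350 would select full-1, inc-1,
and inc-2, while a restore to t=450 would need only full-2.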