[ 
https://issues.apache.org/jira/browse/HBASE-50?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12853473#action_12853473
 ] 

Todd Lipcon commented on HBASE-50:
----------------------------------

I think the general design sounds good, but there are a few open questions (the 
things that make this issue both very hard and very fun!):

- How do we make snapshot creation very low impact on the cluster?
- What happens if the snapshot is initiated during a transition, e.g. while a 
region is in the middle of a split or recovery?
- How do we do the reference counting in an efficient way?
- If old files are moved aside after a compaction, how do we deal with 
concurrent readers of the snapshot?

I think there are a couple of tasks that could come before snapshots as 
concrete steps along the way:
- Ensure that enough data is present in HFiles themselves so that if all 
metadata is lost, and there are extra non-GCed store files around, we can still 
reconstruct the correct table state.
- Change all data deletions to be garbage collected by the master process 
instead of performed by the region server during transitions.
- Add reference counting, perhaps via ZK, to the on-disk files, so they aren't 
GCed while they're still in use by some snapshot. This would also enable safer 
distcp backups.
- The actual snapshot trigger feature, management tools, etc.
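To make the reference-counting step concrete, here is a minimal in-memory sketch of how pinned store files could be tracked so the master's GC pass skips anything a snapshot still references. In the proposal these counts would live in ZK rather than in process memory; the class and method names below are hypothetical, not actual HBase APIs.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative sketch only: an in-memory reference counter for store files.
// In the real design the counts would be kept in ZooKeeper so that the
// master's GC and external distcp backups can both observe which HFiles
// a snapshot still pins.
class StoreFileRefCounter {
    private final Map<String, Integer> refs = new ConcurrentHashMap<>();

    /** A snapshot pins a file; a nonzero count keeps GC from deleting it. */
    void retain(String hfilePath) {
        refs.merge(hfilePath, 1, Integer::sum);
    }

    /** Dropping the last reference makes the file eligible for GC again. */
    void release(String hfilePath) {
        refs.computeIfPresent(hfilePath, (path, count) -> count > 1 ? count - 1 : null);
    }

    /** The master's GC pass deletes only files no snapshot references. */
    Set<String> collectible(Set<String> candidates) {
        Set<String> out = new HashSet<>();
        for (String path : candidates) {
            if (!refs.containsKey(path)) {
                out.add(path);
            }
        }
        return out;
    }
}
```

With this split, a compaction never deletes the old files itself; it only nominates them as candidates, and the master collects whatever isn't pinned.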

> Snapshot of table
> -----------------
>
>                 Key: HBASE-50
>                 URL: https://issues.apache.org/jira/browse/HBASE-50
>             Project: Hadoop HBase
>          Issue Type: New Feature
>            Reporter: Billy Pearson
>            Priority: Minor
>
> Having an option to take a snapshot of a table would be very useful in 
> production.
> What I would like to see this option do is merge all the data into one or 
> more files stored in the same folder on the DFS. This way we could save 
> data in case of a software bug in Hadoop or user code. 
> The other advantage would be the ability to export a table to multiple 
> locations. Say I had a read-only table that must be online. I could take a 
> snapshot of it when needed, export it to a separate data center, and have 
> it loaded there; then I would have it online at multiple data centers for 
> load balancing and failover.
> I understand that Hadoop removes the need for backups to protect against 
> failed servers, but this does not protect us from software bugs that might 
> delete or alter data in ways we did not plan. We should have a way to roll 
> back a dataset.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.