[ https://issues.apache.org/jira/browse/HBASE-50?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778724#action_12778724 ]
Alex Newman commented on HBASE-50:
----------------------------------

Say you flushed the logs, ran a compaction, and waited for the cluster to chill out. Unless you had extremely high churn rates, I would suggest a MapReduce job with one region per task, where a task fails and retries in the case of flushes or compactions. In the case of a split you can fail the job, disable splitting, or have some way of picking up the daughter regions later. Even though the data would be spread across a window of time, you would at least have a timestamp of when the backup was made, i.e. consistency at the region level, which is all that a lot of us really want. A sketch of such a job follows the quoted issue below.

> Snapshot of table
> -----------------
>
>                 Key: HBASE-50
>                 URL: https://issues.apache.org/jira/browse/HBASE-50
>             Project: Hadoop HBase
>          Issue Type: New Feature
>            Reporter: Billy Pearson
>            Assignee: Alex Newman
>            Priority: Minor
>
> Having an option to take a snapshot of a table would be very useful in
> production.
> What I would like this option to do is merge all of the data into one or
> more files stored in the same folder on the DFS. That way we could save
> the data in case of a software bug in Hadoop or user code.
> The other advantage would be the ability to export a table to multiple
> locations. Say I had a read-only table that must be online. I could take
> a snapshot of it when needed, export it to a separate data center, and
> have it loaded there; then I would have it online at multiple data
> centers for load balancing and failover.
> I understand that Hadoop removes the need for backups to protect against
> failed servers, but that does not protect us from software bugs that
> might delete or alter data in ways we did not plan. We should have a way
> to roll back a dataset.
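A minimal sketch of the per-region, map-only job described in the comment, assuming HBase's TableMapper / TableMapReduceUtil MapReduce API (TableInputFormat yields one input split per region, so each map task scans exactly one region, and a task that races a flush or compaction simply fails and is retried by the framework). The class names RegionBackup and BackupMapper and the two command-line arguments (table name, output directory) are illustrative only; split handling is not shown, so this assumes splitting has been disabled for the duration of the run, as suggested above.

{code:java}
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;

public class RegionBackup {

  // Identity mapper: writes each row back out unchanged. Because
  // TableInputFormat creates one split per region, each task copies
  // exactly one region, giving region-level consistency.
  static class BackupMapper
      extends TableMapper<ImmutableBytesWritable, Result> {
    @Override
    protected void map(ImmutableBytesWritable rowKey, Result row,
        Context context) throws IOException, InterruptedException {
      context.write(rowKey, row);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    // Embed a timestamp in the job name so we know when the backup ran.
    Job job = Job.getInstance(conf,
        "backup-" + args[0] + "-" + System.currentTimeMillis());
    job.setJarByClass(RegionBackup.class);

    Scan scan = new Scan();
    scan.setCacheBlocks(false); // a full scan shouldn't pollute the block cache

    TableMapReduceUtil.initTableMapperJob(args[0], scan, BackupMapper.class,
        ImmutableBytesWritable.class, Result.class, job);

    job.setNumReduceTasks(0); // map-only: no cross-region shuffle needed
    job.setOutputFormatClass(SequenceFileOutputFormat.class);
    FileOutputFormat.setOutputPath(job, new Path(args[1]));

    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
{code}

The fail-and-retry behavior falls out of MapReduce's normal task-attempt handling (mapreduce.map.maxattempts), so only the affected region is rescanned rather than the whole table.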