[ https://issues.apache.org/jira/browse/HBASE-6230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13410682#comment-13410682 ]

Jesse Yates commented on HBASE-6230:
------------------------------------

{quote}
 Restore Table

Given a "snapshot name", restore overwrites the original table with the snapshot 
content.
Before restoring, a new snapshot of the table is taken, just to avoid bad 
situations.
(If the table is not disabled we can keep serving reads.)

This allows a full and quick rollback to a previous snapshot.
{quote}

+1 on the general design.

How does this correspond to restoring a table from a snapshot when the table 
doesn't exist? I feel like that should be a semantically different use case, 
though the underlying implementation will probably only differ in not taking a 
snapshot of the existing table, since no table exists yet. I'd propose renaming 
Restore -> Rollback; Restore would then mean just taking a snapshot and creating 
a table from it. On an external cluster, an exported snapshot would then be 
'restored' on the remote cluster.
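A sketch of how that naming split could look on an admin interface (all of the
method and class names below are hypothetical, not the actual HBaseAdmin API;
the toy implementation just records operations to show the difference in
preconditions):

```java
import java.util.ArrayList;
import java.util.List;

public class SnapshotSemantics {
    // Hypothetical interface sketching the proposed naming; none of these
    // methods exist in HBaseAdmin as-is.
    interface SnapshotAdmin {
        // Rollback: overwrite an EXISTING table with the snapshot content,
        // taking a safety snapshot of the current state first.
        void rollbackTable(String tableName, String snapshotName);

        // Restore: create a NEW table from a snapshot; no safety snapshot is
        // needed because no table exists yet. This is also what an exported
        // snapshot would do on a remote cluster.
        void restoreSnapshot(String snapshotName, String newTableName);
    }

    // Toy implementation that records the operations performed.
    static class RecordingAdmin implements SnapshotAdmin {
        final List<String> ops = new ArrayList<>();

        public void rollbackTable(String table, String snapshot) {
            ops.add("snapshot-before-rollback:" + table); // safety snapshot
            ops.add("rollback:" + table + "<-" + snapshot);
        }

        public void restoreSnapshot(String snapshot, String newTable) {
            // No safety snapshot: the target table does not exist yet.
            ops.add("create:" + newTable + "<-" + snapshot);
        }
    }
}
```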

{quote}
Clone Snapshot
{quote}

This could be very, very tricky in terms of multiple tables reading the same 
files. You would have to make sure that no other tables are using the current 
HFiles when a compaction comes around. Otherwise, when you archive the files, 
you will break the other table using those files. Maybe there is some niceness 
in HDFS that will blow up on you when trying to move a file someone else is 
currently reading, but that would take some investigation. I also have a feeling 
there is a bunch of code that assumes a certain layout for the files, which will 
make this hard. I'm not saying it's not doable, but it's not going to be 
trivial.
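One way to make that safe is some form of reference counting on shared HFiles: a
clone adds a reference to each file it shares, and a compaction may only archive
a file once every referencing table has dropped it. A toy in-memory model of
that bookkeeping (all names here are made up, not HBase code):

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model of shared-HFile bookkeeping between a table and its clones.
public class HFileRefs {
    // hfile path -> set of tables currently referencing it
    private final Map<String, Set<String>> refs = new HashMap<>();

    public void addReference(String hfile, String table) {
        refs.computeIfAbsent(hfile, k -> new HashSet<>()).add(table);
    }

    // Called when a table compacts the file away; returns true only when
    // the last reference is gone and the file may actually be archived.
    public boolean dropReferenceAndMaybeArchive(String hfile, String table) {
        Set<String> tables = refs.get(hfile);
        if (tables == null) return true;     // never shared: archive freely
        tables.remove(table);
        if (tables.isEmpty()) {
            refs.remove(hfile);
            return true;                     // safe to archive now
        }
        return false;                        // another table still reads it
    }
}
```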

{quote}
* To Restore only "individual items" (only some small range of data was lost 
from "current")
** MR job that scan the cloned table and update the data in the original one. 
(Partial restore of the data)
{quote}

This seems like a slightly more difficult proposal. I'm not averse to doing 
this, but it isn't a trivial operation and should probably be handled by a 
Map/Reduce job that exports to a 'small' (depending on data size), temporary 
table, so we can easily filter out the right ranges without having to stand up a 
special region or do a ton of compactions. This makes it an inherently slower 
operation, but it should be performant enough for recovering data, and it makes 
a lot of sense in terms of overall throughput when recovering a very large chunk 
(though at that point you probably want to just restore a clone).
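The shape of that job, reduced to a single-process sketch: scan only the lost 
row-key range out of the cloned table and write it back into the original. Here 
sorted maps stand in for the two tables, and the range follows the 
inclusive-start / exclusive-stop convention of an HBase Scan; in the real thing 
this would be an MR job doing Scan + Put:

```java
import java.util.SortedMap;

// Single-process sketch of the partial restore between two "tables".
public class PartialRestore {
    // startRow inclusive, stopRow exclusive, as in an HBase Scan.
    public static void restoreRange(SortedMap<String, String> clone,
                                    SortedMap<String, String> original,
                                    String startRow, String stopRow) {
        // Copy only the lost range from the clone over the original;
        // rows outside the range are left untouched.
        original.putAll(clone.subMap(startRow, stopRow));
    }
}
```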

This brings up another potential nicety: a combined snapshot-and-clone 
operation. It takes a snapshot of the existing table and then stands up a clone 
of that data. It's a small addition to the interface, and to me it's what a real 
'clone' operation should do.

{quote}
Export Snapshot
{quote}
+1 Let the remote cluster restore the snapshot if they want to do it - don't 
force a table to be stood up immediately.
                
> [brainstorm] "Restore" snapshots for HBase 0.96
> -----------------------------------------------
>
>                 Key: HBASE-6230
>                 URL: https://issues.apache.org/jira/browse/HBASE-6230
>             Project: HBase
>          Issue Type: Brainstorming
>            Reporter: Jesse Yates
>            Assignee: Matteo Bertozzi
>
> Discussion ticket around the definitions/expectations of different parts of 
> snapshot restoration.  This is complementary, but separate from the _how_ of 
> taking a snapshot of a table.
