[jira] Commented: (HBASE-50) Snapshot of table

HBase Review Board (JIRA) Tue, 10 Aug 2010 22:13:48 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-50?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12897143#action_12897143
 ]

HBase Review Board commented on HBASE-50:
-----------------------------------------

Message from: "Chongxin Li" <[email protected]>

bq.  On 2010-08-10 10:49:06, stack wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/HSnapshotDescriptor.java, line 36
bq.  > <http://review.cloudera.org/r/467/diff/3/?file=6001#file6001line36>
bq.  >
bq.  >     Drop the H.  Call it SnapshotDescriptor

Alright

bq.  On 2010-08-10 10:49:06, stack wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/HSnapshotDescriptor.java, line 41
bq.  > <http://review.cloudera.org/r/467/diff/3/?file=6001#file6001line41>
bq.  >
bq.  >     If it is under the snapshot directory, maybe just call this file 
snapshotinfo? Drop the '.' prefix.  The '.' prefix is usually used to mark 
'special' files we don't want to consider as part of normal operation.  In this 
case, we are under a snapshot directory, already outside of 'normal' operation.

This name follows the convention of .regioninfo

bq.  On 2010-08-10 10:49:06, stack wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/HRegionInfo.java, line 373
bq.  > <http://review.cloudera.org/r/467/diff/3/?file=6000#file6000line373>
bq.  >
bq.  >     How often is this called?  If it happens a lot, it could add up -- be 
expensive.

Not very often, actually. This method is only called in BaseScanner, when the 
reference rows in META are checked and synchronized with the reference files, 
and right now at most five rows are checked in one scan of META.
No region info is saved in a reference row, so the row, which is a combination 
of SNAPSHOT_PREFIX and the region name, is parsed to obtain the region name. 
That is why we need this method.
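
To illustrate the parsing described above, here is a minimal sketch, not the 
code from the patch. It assumes the reference row is simply SNAPSHOT_PREFIX 
followed by the region name; the prefix value and the class and method names 
below are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Illustrative only: the real constant and method live in the patch under review.
final class ReferenceRowSketch {
    // Hypothetical prefix value; the actual SNAPSHOT_PREFIX is not shown in this thread.
    static final byte[] SNAPSHOT_PREFIX = ".SNAPSHOT.".getBytes(StandardCharsets.UTF_8);

    // Build a reference row as prefix + region name (assumed layout).
    static byte[] toReferenceRow(byte[] regionName) {
        byte[] row = Arrays.copyOf(SNAPSHOT_PREFIX, SNAPSHOT_PREFIX.length + regionName.length);
        System.arraycopy(regionName, 0, row, SNAPSHOT_PREFIX.length, regionName.length);
        return row;
    }

    // True if the row starts with the snapshot prefix.
    static boolean isReferenceRow(byte[] row) {
        if (row == null || row.length < SNAPSHOT_PREFIX.length) {
            return false;
        }
        for (int i = 0; i < SNAPSHOT_PREFIX.length; i++) {
            if (row[i] != SNAPSHOT_PREFIX[i]) {
                return false;
            }
        }
        return true;
    }

    // Strip the prefix to recover the region name; a single array copy per row.
    static byte[] regionNameFromReferenceRow(byte[] row) {
        if (!isReferenceRow(row)) {
            throw new IllegalArgumentException("not a snapshot reference row");
        }
        return Arrays.copyOfRange(row, SNAPSHOT_PREFIX.length, row.length);
    }
}
```

Under these assumptions the cost per reference row is one prefix comparison 
and one array copy, so with at most five rows per META scan it should stay 
negligible.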

- Chongxin

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/467/#review800
-----------------------------------------------------------

> Snapshot of table
> -----------------
>
>                 Key: HBASE-50
>                 URL: https://issues.apache.org/jira/browse/HBASE-50
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Billy Pearson
>            Assignee: Li Chongxin
>            Priority: Minor
>         Attachments: HBase Snapshot Design Report V2.pdf, HBase Snapshot 
> Design Report V3.pdf, HBase Snapshot Implementation Plan.pdf, Snapshot Class 
> Diagram.png
>
>
> Having an option to take a snapshot of a table would be very useful in 
> production.
> What I would like to see this option do is do a merge of all the data into 
> one or more files stored in the same folder on the dfs. This way we could 
> save data in case of a software bug in hadoop or user code. 
> The other advantage would be to be able to export a table to multiple 
> locations. Say I had a read_only table that must be online. I could take a 
> snapshot of it when needed and export it to a separate data center and have 
> it loaded there, and then I would have it online at multiple data centers 
> for load balancing and failover.
> I understand that hadoop takes away the need for having backups to protect 
> from failed servers, but this does not protect us from software bugs that 
> might delete or alter data in ways we did not plan. We should have a way we 
> can roll back a dataset.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
