[
https://issues.apache.org/jira/browse/HBASE-6055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13288238#comment-13288238
]
Jonathan Hsieh commented on HBASE-6055:
---------------------------------------
Jesse,
Thanks for answering the questions. A strong +1 for doing the simplest hbase
timestamp-based approach first, and then looking into the more complicated
version as an option afterwards. Maybe start a sub issue with the
point-in-time approach to move discussion there? (I still have questions there,
might be better to ask there)
The main use case I care about is the ability to quickly "snapshot" without
downtime and quickly recover it (ideally with no downtime, but possibly with a
short downtime window). Although it is conceptually a "sloppy snapshot", it is
pretty simple to define and I think the caveats are fairly well understood. I
don't expect something with stronger consistency guarantees than what hbase
currently offers but do expect something better (cheaper/faster) than the
current closest thing, which is CopyTable.
I have a bunch of new questions - some just asking for precision and some for
clarification. It might be helpful to define terms in the beginning of the doc
so it stays consistent?
- Hm.. how do you restore a snapshot from reference files if it hasn't been
scanned/copied yet? Require scan/copy "materialization" of the snapshot first?
(which means slower restore, but would probably be simplest for a first
cut)
- Snapshot restore needs to be "transactional" like snapshotting right?
- what is "export"? is this taking a snapshot or the materialization or the
snapshot restore or something else?
- If we restore snapshots to the same hbase instance, in dir structure, you
probably need .regioninfo files as well (they contain the region
startkey/endkey info necessary to reconstitute META later).
- Is restoring to a separate instance in scope? If so, bulk loads can be
expensive -- if regions don't line up there will be a bunch of splitting that
happens. Again, keeping the regioninfos and the snapshot's splits may be
worthwhile.
- Where do the materialized versions of the snapshot reference files end up?
in the snapshot dirs? elsewhere?
-- This potentially gets a little trickier with markers as opposed to log rolls.
-- The HLog will have edits from regions not relevant to the table's regions.
Not a huge problem, but maybe an optimization would be for the materialization
step to do an "offline hlogsplit/flush" to keep just the data relevant to
this table/region?
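To make the last point concrete, here is a minimal sketch of what an "offline
hlogsplit" filter could look like. It is not HBase code and uses no HBase API:
the LogEdit record, its fields, and filter_edits_for_regions are hypothetical
names made up for illustration. The only assumption is that each log edit
carries the name of the region it belongs to.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogEdit:
    """Hypothetical stand-in for one write-ahead-log entry (not an HBase type)."""
    region: str    # name of the region the edit belongs to
    seq_id: int    # position of the edit in the log
    payload: bytes


def filter_edits_for_regions(edits, snapshot_regions):
    """Keep only edits for the snapshot's regions, grouped per region.

    Edits from unrelated regions are dropped; the original log order is
    preserved within each region's list.
    """
    wanted = set(snapshot_regions)
    per_region = {}
    for edit in edits:
        if edit.region in wanted:
            per_region.setdefault(edit.region, []).append(edit)
    return per_region


# A shared log containing edits for two snapshot regions and one unrelated one.
log = [
    LogEdit("r1", 1, b"a"),
    LogEdit("other", 2, b"b"),
    LogEdit("r2", 3, b"c"),
    LogEdit("r1", 4, b"d"),
]
kept = filter_edits_for_regions(log, ["r1", "r2"])
```

With a filter like this, the materialized snapshot would carry only the edits
for its own regions rather than the whole shared log.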
> Snapshots in HBase 0.96
> -----------------------
>
> Key: HBASE-6055
> URL: https://issues.apache.org/jira/browse/HBASE-6055
> Project: HBase
> Issue Type: New Feature
> Components: client, master, regionserver, zookeeper
> Reporter: Jesse Yates
> Assignee: Jesse Yates
> Fix For: 0.96.0
>
> Attachments: Snapshots in HBase.docx
>
>
> Continuation of HBASE-50 for the current trunk. Since the implementation has
> drastically changed, opening as a new ticket.