We are working on a backup/restore solution in https://issues.apache.org/jira/browse/HBASE-7912, which will use snapshots and ExportSnapshot for full backups and WALPlayer for incremental backups. The patches are coming.
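To make the full-plus-incremental idea concrete, here is a rough sketch of how the two existing tools could be driven from a script today. It is not the HBASE-7912 implementation, just an illustration built on the stock ExportSnapshot and WALPlayer command lines. The snapshot name, table name, destination URI, WAL directory and mapper count are all made-up placeholders, and the snapshot is assumed to have been taken beforehand (for example with `snapshot 'myTable', 'mySnapshot'` in the hbase shell).

```python
from typing import List


def export_snapshot_cmd(snapshot: str, copy_to: str, mappers: int = 4) -> List[str]:
    """Build the argv for a full backup: copy an existing snapshot
    to another filesystem or cluster with ExportSnapshot."""
    return [
        "hbase", "org.apache.hadoop.hbase.snapshot.ExportSnapshot",
        "-snapshot", snapshot,
        "-copy-to", copy_to,
        "-mappers", str(mappers),
    ]


def wal_player_cmd(wal_dir: str, tables: List[str]) -> List[str]:
    """Build the argv for an incremental step: replay the WAL files
    found in wal_dir for the given tables with WALPlayer."""
    return [
        "hbase", "org.apache.hadoop.hbase.mapreduce.WALPlayer",
        wal_dir,
        ",".join(tables),  # WALPlayer takes a comma-separated table list
    ]


if __name__ == "__main__":
    # Hypothetical names and paths, for illustration only.
    full = export_snapshot_cmd("mySnapshot", "hdfs://backup-cluster:8020/hbase")
    incremental = wal_player_cmd("/backup/wals/2014-04-02", ["myTable"])
    # Print the commands instead of running them; on a real cluster they
    # could be handed to subprocess.run(cmd, check=True).
    print(" ".join(full))
    print(" ".join(incremental))
```

The commands are only built and printed here, so the sketch can be tried without a cluster. How the WAL files to replay get selected and archived between runs is left out on purpose.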
For critical data, real-time replication is the way to go: https://hbase.apache.org/replication.html. The drawback of replication is that when the master cluster gets hot with a lot of put/update/delete traffic, replication will also consume resources. Backup/restore, by contrast, can be scheduled at a 'quiet' time. Well, there is always a trade-off.

Demai

On Wed, Apr 2, 2014 at 7:28 AM, Ted Yu <[email protected]> wrote:

> Modification to CLONE_TEST wouldn't affect original snapshot.
>
> Cheers
>
> On Wed, Apr 2, 2014 at 6:56 AM, R W <[email protected]> wrote:
>
> > Hi Ted
> >
> > OK, i guess i know how it works, so when i execute the clone operation,
> > data for the new table will be copied from the snapshot, so if my new table
> > is called "CLONE_TEST", i think on hdfs it will have a path like this
> > /hbase/CLONE_TEST which has the copied the data, then further modification
> > to CLONE_TEST table has nothing to do with the original snapshot, am i
> > correct? Thanks for your quick response :)
> >
> > Cheers
> > aij
> >
> > On Wed, Apr 2, 2014 at 9:47 PM, Ted Yu <[email protected]> wrote:
> >
> > > For first question about clone from snapshot, there is no copy of snapshot
> > > involved.
> > > The clone is made from the snapshot itself.
> > >
> > > Cheers
> > >
> > > On Apr 2, 2014, at 4:23 AM, R W <[email protected]> wrote:
> > >
> > > > Hi Esteban
> > > >
> > > > I checked the snapshot feature and tried myself, it's very good, one of the
> > > > introduction
> > > >
> > > > http://blog.cloudera.com/blog/2013/03/introduction-to-apache-hbase-snapshots/
> > > > mentioned about:
> > > >
> > > >> Clone a snapshot: This operation creates a new table using the same schema
> > > >> and with the same data present in the specified snapshot. The result of
> > > >> this operation is a new fully functional table that can be modified
> > > >> with no impact on the original table or the snapshot.
> > > >
> > > > I think this clone operation will make a copy of the snapshot, then create
> > > > the new table from the copy of the snapshot, am i correct? Otherwise,
> > > > modification to the new table will change the snapshot, right?
> > > >
> > > > Another question, if we want to backup hbase data somewhere else, it seems
> > > > we cannot go with snapshot feature, we want the data to be backup even
> > > > after the whole Hadoop cluster down, any idea?
> > > >
> > > > Thanks
> > > > aij
> > > >
> > > > On Wed, Apr 2, 2014 at 2:12 PM, Esteban Gutierrez <[email protected]> wrote:
> > > >
> > > >> Hello Aij,
> > > >>
> > > >> Snapshots are the suggested method since HBase 0.94.6, they provide better
> > > >> consistency for backing up data in HBase. You can find more information in
> > > >> the HBase Book here:
> > > >>
> > > >> https://hbase.apache.org/book.html#ops.snapshots
> > > >>
> > > >> Depending on your use case and resources you might want to consider
> > > >> replication as well:
> > > >>
> > > >> http://hbase.apache.org/replication.html
> > > >>
> > > >> cheers,
> > > >> esteban.
> > > >>
> > > >> --
> > > >> Cloudera, Inc.
> > > >>
> > > >> On Tue, Apr 1, 2014 at 10:56 PM, R W <[email protected]> wrote:
> > > >>
> > > >>> Hi Guys
> > > >>>
> > > >>> I'm using hbase org.apache.hadoop.hbase.mapreduce.Export
> > > >>> / org.apache.hadoop.hbase.mapreduce.Import to backup and restore HBase
> > > >>> data, at least it's good to me, i would like to know if there are any
> > > >>> better solutions or practices on how to backup HBase data, that will be
> > > >>> really helpful for us, thanks.
> > > >>>
> > > >>> Cheers
> > > >>> aij
