Hi, could you describe in a bit more detail what you need from a backup, what you expect from it, and what your data flow is?

At the moment a snapshot guarantees only row-level consistency. This means that if there are in-flight writes while the snapshot is taken, some of them may be present in the snapshot and some not. So the best-fitting workload is one that imports data from somewhere else: after a restore you can check which keys are present and reimport the missing ones. The other case is the one where you don't care if some rows are missing.

Snapshots don't create a copy of the data, so if you want to properly save your data somewhere else you have to use ExportSnapshot to copy the data to another cluster. The main difference from CopyTable is that you don't impact the RegionServers during the export, since the operation happens purely at the filesystem level.

I think snapshots are really good for testing stuff: you can take a snapshot of a table, clone a table from the snapshot, and try changing the compression or the schema, or just play with the data, without impacting the main table and without having to copy petabytes of data.

Here is a snapshot-related blog post that tries to explain how the feature works and what some of the use cases are:
https://blog.cloudera.com/blog/2013/03/introduction-to-apache-hbase-snapshots/

Matteo

On Thu, May 9, 2013 at 9:11 AM, Samir Ahmic <[email protected]> wrote:
> Hi all,
>
> We are using hbase-0.94.6.1 and at the moment I'm evaluating Snapshots as a
> backup solution for moving data between clusters. I'm wondering if someone
> has similar experience, and what the pros and cons are. Also, is the Snapshot
> feature stable enough for this sort of operation?
>
> Thanks,
> Samir
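
P.S. To make the "check which keys are present and reimport the missing ones" step a bit more concrete, here is a minimal Python sketch. It does not talk to HBase at all: the function name `missing_keys` and the sample row keys are made up for illustration, and I am assuming you can get one list of keys from your source system and another from a scan of the restored table.

```python
from typing import Iterable, List, Set


def missing_keys(source_keys: Iterable[str], restored_keys: Iterable[str]) -> List[str]:
    """Return the source keys that are absent from the restored table, sorted."""
    restored: Set[str] = set(restored_keys)
    # De-duplicate the source keys, then keep only those not found after restore.
    return sorted(k for k in set(source_keys) if k not in restored)


# Hypothetical data: keys the import job was supposed to write ...
source = ["row-001", "row-002", "row-003", "row-004"]
# ... and keys actually found by scanning the table after the restore.
restored = ["row-001", "row-003"]

to_reimport = missing_keys(source, restored)
print(to_reimport)  # ['row-002', 'row-004']
```

The keys in `to_reimport` are the ones you would feed back into your import job. For very large tables you would not want to hold both key sets in memory like this, so treat it as an outline of the idea only.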
