On Thu, Mar 10, 2011 at 11:33 AM, Otis Gospodnetic <[email protected]> wrote:
> 1) make an export/backup of 1 table at a time using
> org.apache.hadoop.hbase.mapreduce.Export from HBASE-1684

This is actually checked in. See: ./bin/hadoop jar hbase-0.X.X.jar

> 2) copy 1 table at a time using
> http://hbase.apache.org/docs/current/api/org/apache/hadoop/hbase/mapreduce/CopyTable.html
>
> 3) use distcp to copy the whole /hbase part of HDFS
>
> 4) replicate the whole cluster - http://hbase.apache.org/replication.html
>
> 5) count on HDFS replication and live without the standard backup
>
> What I'm not sure about is the following:
>
> 1) Is any one of the above options "hot", meaning that it can be used
> while the source cluster is running and that it produces a consistent
> backup (a snapshot or checkpoint of the source cluster's data)?
> I imagine only replication of the whole cluster (point 4) above) is
> really "hot"?

Options 1) and 2) will give you a snapshot of a table at a particular
instant in time. You'll get the state of each row at the moment the
MapReduce job crosses that row.

St.Ack
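For reference, options 1) through 3) above are invoked from the command line roughly as sketched below. The table name, backup paths, cluster hostnames, and ZooKeeper address are placeholders, and the exact arguments vary by HBase version, so treat this as a sketch rather than copy-paste commands:

    # Option 1: export a single table to a directory on HDFS
    # (the "export" driver is registered in the hbase jar's main class)
    ./bin/hadoop jar hbase-0.X.X.jar export mytable /backup/mytable

    # Option 2: copy a single table into another running cluster;
    # --peer.adr is zk-quorum:zk-port:zk-root of the destination
    ./bin/hadoop jar hbase-0.X.X.jar copytable \
        --peer.adr=dstzk1:2181:/hbase mytable

    # Option 3: distcp the whole /hbase directory between clusters
    # (not "hot" -- the data is only consistent if HBase is shut down)
    ./bin/hadoop distcp hdfs://srcnn:8020/hbase hdfs://dstnn:8020/hbase

Both Export and CopyTable run as MapReduce jobs against the live cluster, which is why the resulting copy reflects each row as of the moment the job scanned it, not a single cluster-wide point in time.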
