We're running HBase replication successfully on a 500 TB (compressed - raw is about 2 PB) cluster over a 60 ms link across the country. I'd give it a thumbs up for surviving the loss of a cluster and for running applications in two places, as long as they can tolerate the inconsistency that comes from replication being asynchronous. ( http://hbase.apache.org/replication.html )
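For reference, turning on replication is mostly a shell exercise once `hbase.replication` is set to `true` in hbase-site.xml on both clusters. A rough sketch for the 0.94 line (the peer id, ZooKeeper quorum hosts, table, and column family names below are placeholders, not ours):

```shell
# In the hbase shell on the SOURCE cluster:
# 1. Register the DR cluster as a replication peer, identified by its
#    ZooKeeper quorum, client port, and znode parent.
add_peer '1', 'dr-zk1,dr-zk2,dr-zk3:2181:/hbase'

# 2. Replication is opt-in per column family: set REPLICATION_SCOPE => 1
#    on each family you want mirrored to the peer.
disable 'mytable'
alter 'mytable', {NAME => 'cf', REPLICATION_SCOPE => 1}
enable 'mytable'
```

The target cluster needs the same table and column families created ahead of time; edits ship from the source's WALs, so anything written before the scope was flipped on won't be replicated and has to be copied over separately.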
You'll still want some sort of snapshot / export to be able to recover from bugs / corruption, which gets replicated too. We're intending to try out HBase snapshots ( http://hbase.apache.org/book/ops.snapshots.html ) once we've deployed a release with support. I'd also recommend using a recent 0.94 release if possible.

Dave

On Thu, Oct 17, 2013 at 12:52 PM, hdev ml <[email protected]> wrote:
> Hello all,
>
> We are looking at a solution for HBase backup, recovery, and replication
> for DR.
>
> We did take a look at HBase replication, but we are not sure whether it
> is being used at scale.
>
> Our data size in HBase is around 4 TB.
>
> We were thinking of the DB approach of exporting a full dump weekly and
> then doing incremental exports at regular intervals, say around 2-3 times
> a day. But we soon realized that transferring 4 TB to our DR site, with
> our current bandwidth, would take around 100+ hours.
>
> Are there better solutions out there? What do large installations do?
>
> Any documentation?
>
> Please let me know.
>
> Thanks,
> Harshad
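To make the snapshot idea concrete: once you're on a release with snapshot support, taking one and shipping it offsite looks roughly like this (table name, snapshot name, and the DR NameNode address are placeholders):

```shell
# In the hbase shell: snapshots are metadata-only at creation time,
# so this is fast and doesn't copy data.
snapshot 'mytable', 'mytable-snap-20131017'

# From the command line: ExportSnapshot runs a MapReduce job that
# copies the snapshot's HFiles to another cluster's HBase root dir.
hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot \
  -snapshot 'mytable-snap-20131017' \
  -copy-to hdfs://dr-namenode:8020/hbase
```

Because the export copies HFiles directly (no scanning through the region servers), it's considerably cheaper than the old Export/Import MapReduce path, and on the DR side you can `clone_snapshot` or `restore_snapshot` from the shell to get a usable table back.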
