Thanks Dave for your insights. Will wait for the patch from CDH. Meanwhile, will try the replication on our clusters.

On Thu, Oct 17, 2013 at 3:06 PM, Dave Latham <[email protected]> wrote:

> Primarily the link above, but also got comfortable with the source code
> after some bumps. You may want to consider moving to a more recent version
> depending how much volume you're expecting to deal with. Here are a few
> issues we bumped into and fixed since your release: HBASE-8096, HBASE-8806,
> HBASE-9377
>
> Dave
>
>
> On Thu, Oct 17, 2013 at 2:52 PM, hdev ml <[email protected]> wrote:
>
> > Thanks Dave. Yes we are planning to take exports every now and then for
> > safeguarding the data and also snapshots for local corruptions, data loss
> > etc.
> >
> > Did you refer to any documentation other than the "hbase book"?
> >
> > We are on CDH4.4 - HBase 0.94.6, so I think we are good there.
> >
> > Thanks for your time Dave.
> >
> > Harshad
> >
> >
> > On Thu, Oct 17, 2013 at 2:39 PM, Dave Latham <[email protected]> wrote:
> >
> > > We're running HBase replication successfully on a 500 TB (compressed -
> > > raw is about 2PB) cluster over a 60ms link across the country. I'd give
> > > it a thumbs up for dealing with loss of a cluster and being able to run
> > > applications in two places that can tolerate inconsistency from the
> > > asynchronous nature. ( http://hbase.apache.org/replication.html )
> > >
> > > You'll still want some sort of snapshot / export to be able to recover
> > > from bugs / corruption which gets replicated. We're intending to try out
> > > hbase snapshots ( http://hbase.apache.org/book/ops.snapshots.html ) once
> > > we've deployed a release with support.
> > >
> > > I'd also recommend using a recent 0.94 release if possible.
> > >
> > > Dave
> > >
> > >
> > > On Thu, Oct 17, 2013 at 12:52 PM, hdev ml <[email protected]> wrote:
> > >
> > > > Hello all,
> > > >
> > > > We are looking at a solution for HBase backup, recovery and replication
> > > > for DR
> > > >
> > > > We did take a look at the HBase replication, but we are not sure whether
> > > > it is being used at large.
> > > >
> > > > Our data size in HBase is around 4TB.
> > > >
> > > > We were thinking of DB approach of Exporting Full Dump weekly and then
> > > > doing incremental exports on regular intervals, say around 2-3 times a
> > > > day. But soon realized that the data transfer of 4 TB to our DR site,
> > > > with our current bandwidth, will take around 100+ hours.
> > > >
> > > > Are there better solutions out there? What do large installations do?
> > > >
> > > > Any documentation?
> > > >
> > > > Please let me know
> > > >
> > > > Thanks
> > > > Harshad
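
For reference, the "4 TB in around 100+ hours" figure from the original question can be turned into an implied link speed with some quick arithmetic. This is only a rough sketch: it assumes decimal terabytes and a fully used link, the function names are made up for illustration, and the 50 GB incremental size is a hypothetical number that does not come from the thread.

```python
def implied_bandwidth_mbps(size_tb: float, hours: float) -> float:
    """Average throughput (Mbit/s) needed to move size_tb terabytes in `hours`."""
    bits = size_tb * 1e12 * 8          # assumes decimal terabytes (10^12 bytes)
    return bits / (hours * 3600) / 1e6


def transfer_hours(size_tb: float, mbps: float) -> float:
    """Hours needed to move size_tb terabytes at a sustained `mbps` Mbit/s."""
    bits = size_tb * 1e12 * 8
    return bits / (mbps * 1e6) / 3600


# Full dump from the thread: 4 TB taking roughly 100 hours.
full_dump_mbps = implied_bandwidth_mbps(4, 100)
print(round(full_dump_mbps, 1))        # 88.9

# Hypothetical 50 GB incremental export over the same link.
incremental_hours = transfer_hours(0.05, full_dump_mbps)
print(round(incremental_hours, 2))
```

Since the thread says "100+" hours, about 89 Mbit/s is an upper bound on the effective throughput rather than an exact figure. The same link that makes a weekly full dump impractical may still be enough for a continuous stream of edits, which is the case replication is meant for.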
