Thanks Stack. We are going to test this on a test table in QA, but I'd still like a fallback plan if something goes wrong when we eventually do it in prod.
One idea I had was to snapshot the table, clone from the snapshot, and perform the merge on the clone. I imagine I'd first want to major compact the clone, so that all of the linked files are rewritten into new files. I also see at the end of this blog post (http://blog.cloudera.com/blog/2013/03/introduction-to-apache-hbase-snapshots/) that merging regions on a snapshotted table can cause data loss.

Does my approach sound reasonable? Disable table, snapshot table, create clone from snapshot, major compact clone, run merge on clone, enable clone, test; if it fails, fall back to the original table.

On Wed, Aug 14, 2013 at 1:32 AM, Stack <[email protected]> wrote:

> On Tue, Aug 13, 2013 at 5:17 PM, Bryan Beaudreault <[email protected]> wrote:
>
> > I'm running cdh4.2 hbase 0.94.2, and am looking to merge some regions in a
> > table. Looking at Merge.java, it seems to require that the entire cluster
> > be offline. However, I also notice an HMerge.java which doesn't appear to
> > do the same validation.
> >
> > Two questions:
> >
> > 1) Why does Merge.java validate the entire cluster is down, as opposed to
> > just the single table being disabled?
>
> It is dumb/simple/old.
>
> > 2) Could I write my own tool that uses HMerge, so as to merge regions in
> > the disabled table without bringing the whole cluster down?
>
> Yes. You can't do much harm if table is offline.
>
> St.Ack
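
For reference, the fallback plan proposed at the top of this message (disable, snapshot, clone, major compact, merge, enable) can be written out as an ordered list of HBase shell commands. The following is only a rough sketch under some assumptions: the table, snapshot, and clone names are hypothetical; the merge step is a placeholder comment, since the HMerge-based tool does not exist yet; and the extra "disable" of the clone before the merge is my assumption, because clone_snapshot leaves the clone enabled and the merge discussed in this thread wants the table offline. Note also that major_compact returns immediately, so the compaction would need to finish before the clone is disabled.

```java
import java.util.ArrayList;
import java.util.List;

public class FallbackPlan {

    // Builds the HBase shell commands for the plan, in the order they would run.
    // All three names are hypothetical placeholders supplied by the caller.
    static List<String> commands(String table, String snapshot, String clone) {
        List<String> cmds = new ArrayList<>();
        cmds.add("disable '" + table + "'");                      // take the original offline
        cmds.add("snapshot '" + table + "', '" + snapshot + "'"); // snapshot the disabled table
        cmds.add("clone_snapshot '" + snapshot + "', '" + clone + "'");
        cmds.add("major_compact '" + clone + "'");                // rewrite linked files; asynchronous
        cmds.add("disable '" + clone + "'");                      // assumption: merge needs the clone offline
        cmds.add("# run the HMerge-based tool against '" + clone + "' here");
        cmds.add("enable '" + clone + "'");                       // bring the merged clone up for testing
        return cmds;
    }

    public static void main(String[] args) {
        // Print the commands so they can be reviewed before being run by hand.
        for (String cmd : commands("mytable", "mytable_snap", "mytable_clone")) {
            System.out.println(cmd);
        }
    }
}
```

The original table stays disabled and untouched throughout, so falling back would just mean enabling it again and dropping the clone.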
