Hi Kevin,

Thanks for answering. What are your thoughts on copyTable vs. export/import for my use case? Is one tool less likely than the other to copy inconsistent data?
I want to do an incremental copy of a live cluster to minimize downtime.

On Tue, Oct 16, 2012 at 8:47 AM, Kevin O'dell <[email protected]> wrote:

> Shrijeet,
>
> I think a better approach would be a pre-split table and then do the
> export/import. This will save you from having to script the merges, which
> can end badly for META if done wrong.
>
> On Mon, Oct 15, 2012 at 5:31 PM, Shrijeet Paliwal
> <[email protected]> wrote:
>
> > We moved to 0.92.2 some time ago and, with that, increased the max file
> > size setting to 4GB (from 2GB). Also, an application-triggered cleanup
> > operation deleted lots of unwanted rows.
> > These two combined have gotten us to a state where lots of regions are
> > smaller than the desired size.
> >
> > Merging regions two at a time seems time-consuming and will be hard to
> > automate. https://issues.apache.org/jira/browse/HBASE-1621 automates
> > merging, but it is not stable.
> >
> > I am interested in knowing about other possible approaches folks have
> > tried. What do you guys think about a copyTable-based approach? (old
> > ---copyTable---> new and then rename new to old)
> >
> > -Shrijeet
>
> --
> Kevin O'Dell
> Customer Operations Engineer, Cloudera
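To make the incremental-copy idea above concrete, here is a rough sketch of how the passes could be planned with CopyTable's `--starttime`, `--endtime`, and `--new.name` options. It only builds the command lines and does not run anything. The table names and checkpoint timestamps are made-up placeholders, and the helper functions are hypothetical, not part of HBase. It assumes the time range is start-inclusive and end-exclusive, so adjacent windows neither overlap nor leave gaps. It does not address rows deleted on the source between passes, which a time-range copy may not carry over.

```python
from typing import List, Tuple

# MapReduce job class shipped with HBase; invoked through the `hbase` launcher.
COPY_TABLE_CLASS = "org.apache.hadoop.hbase.mapreduce.CopyTable"


def time_windows(checkpoints: List[int]) -> List[Tuple[int, int]]:
    """Turn ascending checkpoints [t0, t1, t2] into windows [(t0, t1), (t1, t2)]."""
    if any(b <= a for a, b in zip(checkpoints, checkpoints[1:])):
        raise ValueError("checkpoints must be strictly ascending")
    return list(zip(checkpoints, checkpoints[1:]))


def copy_table_command(source: str, target: str, start_ms: int, end_ms: int) -> List[str]:
    """Build one CopyTable invocation covering cells with start_ms <= timestamp < end_ms."""
    return [
        "hbase", COPY_TABLE_CLASS,
        f"--starttime={start_ms}",
        f"--endtime={end_ms}",
        f"--new.name={target}",
        source,
    ]


def incremental_plan(source: str, target: str, checkpoints: List[int]) -> List[List[str]]:
    """One CopyTable command per consecutive pair of checkpoints."""
    return [copy_table_command(source, target, s, e) for s, e in time_windows(checkpoints)]


if __name__ == "__main__":
    # Placeholder values: a bulk pass from time 0, then one catch-up pass.
    for cmd in incremental_plan("old_table", "new_table",
                                [0, 1350000000000, 1350400000000]):
        print(" ".join(cmd))
```

The first window does the bulk copy while the cluster stays live; each later window only picks up cells written since the previous checkpoint, so the final pass (taken after writes are stopped) should be short. The target table would have to exist already, pre-split as Kevin suggests.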
