Hey, sorry to resurrect this old thread, but while working on the book update I ran into the same issue today, i.e. we have both Merge and HMerge. I tried Merge and it works fine now. It is also the only one of the two flagged as being a tool. Should HMerge be removed? Or at least deprecated?
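For reference, the invocation pattern that worked for me is sketched below. The region names are only placeholders taken from my earlier test run further down this thread, so treat them as assumptions; take the real ones from a scan of .META. for your table, and remember the HBase daemons have to be stopped before the tool will run.

```shell
# Sketch only: the region names are placeholders from my earlier test run;
# substitute the ones a scan of '.META.' prints for your own table.
TABLE='testtable'
REGION1='testtable,row-20,1309614509041.e7c16267eb30e147e5d988c63d40f982.'
REGION2='testtable,row-30,1309614509041.a9cde1cbc7d1a21b1aca2ac7fda30ad8.'

# Merge requires the HBase daemons to be down first:
#   ./bin/stop-hbase.sh
# ...then run the command below (built as a string here so it can be
# inspected before actually executing it):
CMD="hbase org.apache.hadoop.hbase.util.Merge $TABLE $REGION1 $REGION2"
echo "$CMD"
```

Once the merge completes, restart HBase and re-scan .META. to confirm the two regions were collapsed into one.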
Cheers,
Lars

On Thu, Jul 7, 2011 at 2:03 AM, Ted Yu <[email protected]> wrote:
>>> there is already an issue to do this but not revamp of these Merge
>>> classes
> I guess the issue is HBASE-1621
>
> On Wed, Jul 6, 2011 at 2:28 PM, Stack <[email protected]> wrote:
>
>> Yeah, can you file an issue Lars.  This stuff is ancient and needs to
>> be redone AND redone so we can do merging while table is online (there
>> is already an issue to do this but not revamp of these Merge classes).
>> The unit tests for Merge are also all junit3 and do whacky stuff to
>> put up multiple regions.  This should be redone too (they are often
>> first thing broke when major change and putting them back together is
>> a headache since they do not follow the usual pattern).
>>
>> St.Ack
>>
>> On Sun, Jul 3, 2011 at 12:38 AM, Lars George <[email protected]> wrote:
>> > Hi Ted,
>> >
>> > The log is from an earlier attempt, I tried this a few times. This is all
>> > local, after rm'ing the /hbase. So the files are all pretty empty, but since
>> > I put data in I was assuming it should work. Once you gotten into this
>> > state, you also get funny error messages in the shell:
>> >
>> > hbase(main):001:0> list
>> > TABLE
>> > 11/07/03 09:36:21 INFO ipc.HBaseRPC: Using
>> > org.apache.hadoop.hbase.ipc.WritableRpcEngine for
>> > org.apache.hadoop.hbase.ipc.HMasterInterface
>> >
>> > ERROR: undefined method `map' for nil:NilClass
>> >
>> > Here is some help for this command:
>> > List all tables in hbase. Optional regular expression parameter could
>> > be used to filter the output. Examples:
>> >
>> > hbase> list
>> > hbase> list 'abc.*'
>> >
>> > hbase(main):002:0>
>> >
>> > I am assuming this is collateral, but why? The UI works but the table is
>> > gone too.
>> >
>> > Lars
>> >
>> > On Jul 2, 2011, at 10:55 PM, Ted Yu wrote:
>> >
>> >> There is TestMergeTool which tests Merge.
>> >>
>> >> From the log you provided, I got a little confused as why
>> >> 'testtable,row-20,1309613053987.23a35ac696bdf4a8023dcc4c5b8419e0.' didn't
>> >> appear in your command line or the output from .META. scanning.
>> >>
>> >> On Sat, Jul 2, 2011 at 10:36 AM, Lars George <[email protected]> wrote:
>> >>
>> >>> Hi,
>> >>>
>> >>> These two seem both in a bit of a weird state: HMerge is scoped package
>> >>> local, therefore no one but the package can call the merge()
>> >>> functions... and no one does that but the unit test. But it would be
>> >>> good to have this on the CLI and shell as a command (and in the shell
>> >>> maybe with a confirmation message?), but it is not available AFAIK.
>> >>>
>> >>> HMerge can merge regions of tables that are disabled. It also merges
>> >>> all that qualify, i.e. where the merged region is less than or equal of
>> >>> half the configured max file size.
>> >>>
>> >>> Merge on the other hand does have a main(), so can be invoked:
>> >>>
>> >>> $ hbase org.apache.hadoop.hbase.util.Merge
>> >>> Usage: bin/hbase merge <table-name> <region-1> <region-2>
>> >>>
>> >>> Note how the help insinuates that you can use it as a tool, but that is
>> >>> not correct. Also, it only merges two given regions, and the cluster
>> >>> must be shut down (only the HBase daemons). So that is a step back.
>> >>>
>> >>> What is worse is that I cannot get it to work. I tried in the shell:
>> >>>
>> >>> hbase(main):001:0> create 'testtable', 'colfam1', {SPLITS =>
>> >>> ['row-10','row-20','row-30','row-40','row-50']}
>> >>> 0 row(s) in 0.2640 seconds
>> >>>
>> >>> hbase(main):002:0> for i in '0'..'9' do for j in '0'..'9' do put
>> >>> 'testtable', "row-#{i}#{j}", "colfam1:#{j}", "#{j}" end end
>> >>> 0 row(s) in 1.0450 seconds
>> >>>
>> >>> hbase(main):003:0> flush 'testtable'
>> >>> 0 row(s) in 0.2000 seconds
>> >>>
>> >>> hbase(main):004:0> scan '.META.', { COLUMNS => ['info:regioninfo']}
>> >>> ROW                                  COLUMN+CELL
>> >>> testtable,,1309614509037.612d1e0112  column=info:regioninfo, timestamp=130...
>> >>>  406e6c2bb482eeaec57322.             STARTKEY => '', ENDKEY => 'row-10'
>> >>> testtable,row-10,1309614509040.2fba  column=info:regioninfo, timestamp=130...
>> >>>  fcc9bc6afac94c465ce5dcabc5d1.       STARTKEY => 'row-10', ENDKEY => 'row-20'
>> >>> testtable,row-20,1309614509041.e7c1  column=info:regioninfo, timestamp=130...
>> >>>  6267eb30e147e5d988c63d40f982.       STARTKEY => 'row-20', ENDKEY => 'row-30'
>> >>> testtable,row-30,1309614509041.a9cd  column=info:regioninfo, timestamp=130...
>> >>>  e1cbc7d1a21b1aca2ac7fda30ad8.       STARTKEY => 'row-30', ENDKEY => 'row-40'
>> >>> testtable,row-40,1309614509041.d458  column=info:regioninfo, timestamp=130...
>> >>>  236feae097efcf33477e7acc51d4.       STARTKEY => 'row-40', ENDKEY => 'row-50'
>> >>> testtable,row-50,1309614509041.74a5  column=info:regioninfo, timestamp=130...
>> >>>  7dc7e3e9602d9229b15d4c0357d1.       STARTKEY => 'row-50', ENDKEY => ''
>> >>> 6 row(s) in 0.0440 seconds
>> >>>
>> >>> hbase(main):005:0> exit
>> >>>
>> >>> $ ./bin/stop-hbase.sh
>> >>>
>> >>> $ hbase org.apache.hadoop.hbase.util.Merge testtable \
>> >>>     testtable,row-20,1309614509041.e7c16267eb30e147e5d988c63d40f982. \
>> >>>     testtable,row-30,1309614509041.a9cde1cbc7d1a21b1aca2ac7fda30ad8.
>> >>>
>> >>> But I get consistently errors:
>> >>>
>> >>> 11/07/02 07:20:49 INFO util.Merge: Merging regions
>> >>> testtable,row-20,1309613053987.23a35ac696bdf4a8023dcc4c5b8419e0. and
>> >>> testtable,row-30,1309613053987.3664920956c30ac5ff2a7726e4e6 in table
>> >>> testtable
>> >>> 11/07/02 07:20:49 INFO wal.HLog: HLog configuration: blocksize=32 MB,
>> >>> rollsize=30.4 MB, enabled=true, optionallogflushinternal=1000ms
>> >>> 11/07/02 07:20:49 INFO wal.HLog: New hlog
>> >>> /Volumes/Macintosh-HD/Users/larsgeorge/.logs_1309616449171/hlog.1309616449181
>> >>> 11/07/02 07:20:49 INFO wal.HLog: getNumCurrentReplicas--HDFS-826 not
>> >>> available; hdfs_out=org.apache.hadoop.fs.FSDataOutputStream@25961581,
>> >>> exception=org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.getNumCurrentReplicas()
>> >>> 11/07/02 07:20:49 INFO regionserver.HRegion: Setting up tabledescriptor
>> >>> config now ...
>> >>> 11/07/02 07:20:49 INFO regionserver.HRegion: Onlined -ROOT-,,0.70236052;
>> >>> next sequenceid=1
>> >>> info: null
>> >>> region1: [B@48fd918a
>> >>> region2: [B@7f5e2075
>> >>> 11/07/02 07:20:49 FATAL util.Merge: Merge failed
>> >>> java.io.IOException: Could not find meta region for
>> >>> testtable,row-20,1309613053987.23a35ac696bdf4a8023dcc4c5b8419e0.
>> >>>         at org.apache.hadoop.hbase.util.Merge.mergeTwoRegions(Merge.java:211)
>> >>>         at org.apache.hadoop.hbase.util.Merge.run(Merge.java:111)
>> >>>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>> >>>         at org.apache.hadoop.hbase.util.Merge.main(Merge.java:386)
>> >>> 11/07/02 07:20:49 INFO regionserver.HRegion: Setting up tabledescriptor
>> >>> config now ...
>> >>> 11/07/02 07:20:49 INFO regionserver.HRegion: Onlined .META.,,1.1028785192;
>> >>> next sequenceid=1
>> >>> 11/07/02 07:20:49 INFO regionserver.HRegion: Closed -ROOT-,,0.70236052
>> >>> 11/07/02 07:20:49 INFO wal.HLog: main.logSyncer exiting
>> >>> 11/07/02 07:20:49 ERROR util.Merge: exiting due to error
>> >>> java.lang.NullPointerException
>> >>>         at org.apache.hadoop.hbase.util.Merge$1.processRow(Merge.java:119)
>> >>>         at org.apache.hadoop.hbase.util.MetaUtils.scanMetaRegion(MetaUtils.java:229)
>> >>>         at org.apache.hadoop.hbase.util.MetaUtils.scanMetaRegion(MetaUtils.java:258)
>> >>>         at org.apache.hadoop.hbase.util.Merge.run(Merge.java:116)
>> >>>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>> >>>         at org.apache.hadoop.hbase.util.Merge.main(Merge.java:386)
>> >>>
>> >>> After which I most of the times have shot .META. with an error
>> >>>
>> >>> 2011-07-02 06:42:10,763 WARN org.apache.hadoop.hbase.master.HMaster: Failed
>> >>> getting all descriptors
>> >>> java.io.FileNotFoundException: No status for
>> >>> hdfs://localhost:8020/hbase/.corrupt
>> >>>         at org.apache.hadoop.hbase.util.FSUtils.getTableInfoModtime(FSUtils.java:888)
>> >>>         at org.apache.hadoop.hbase.util.FSTableDescriptors.get(FSTableDescriptors.java:122)
>> >>>         at org.apache.hadoop.hbase.util.FSTableDescriptors.getAll(FSTableDescriptors.java:149)
>> >>>         at org.apache.hadoop.hbase.master.HMaster.getHTableDescriptors(HMaster.java:1429)
>> >>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> >>>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>> >>>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>> >>>         at java.lang.reflect.Method.invoke(Method.java:597)
>> >>>         at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:312)
>> >>>         at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1065)
>> >>>
>> >>> Lars
>> >
>> >
>>
