I have created HBASE-7990 for the backport.

On Mon, Mar 4, 2013 at 4:28 AM, Ted <[email protected]> wrote:
> We should back port.
>
> Thanks
>
> On Mar 4, 2013, at 4:16 AM, "Kevin O'dell" <[email protected]> wrote:
>
> > Ted,
> >
> > This is the second or third time I have seen this. I think it should
> > apply fairly cleanly. What do you think?
> >
> > On Sat, Mar 2, 2013 at 10:47 AM, Ted Yu <[email protected]> wrote:
> >
> >> HBASE-5837 is only in 0.95 and later.
> >>
> >> Do you want HBASE-5837 to be backported?
> >>
> >> Thanks
> >>
> >> On Thu, Feb 7, 2013 at 3:17 PM, Brandon Peskin <[email protected]> wrote:
> >>
> >>> Thanks Kevin.
> >>>
> >>> Before I tried your advice, I tried this:
> >>>
> >>>   scan '.META.', { FILTER =>
> >>>     org.apache.hadoop.hbase.filter.SingleColumnValueFilter.new(
> >>>       org.apache.hadoop.hbase.util.Bytes.toBytes('info'),
> >>>       org.apache.hadoop.hbase.util.Bytes.toBytes('regioninfo'),
> >>>       org.apache.hadoop.hbase.filter.CompareFilter::CompareOp.valueOf('NOT_EQUAL'),
> >>>       org.apache.hadoop.hbase.filter.SubstringComparator.new('algol')) }
> >>>
> >>>   deleteall '.META.', '<row_key>'
> >>>
> >>> The problem is that at some point I fat-fingered a row key and believe I
> >>> hit HBASE-5837:
> >>>
> >>> https://issues.apache.org/jira/browse/HBASE-5837
> >>>
> >>> I'm getting:
> >>>
> >>>   java.io.IOException: java.io.IOException:
> >>>   java.lang.IllegalArgumentException: No 44 in
> >>>   <2e40c841-af5b-4a5e-be0f-e06a953f05cc,1359958540596>, length=13, offset=37
> >>>
> >>> This caused my master to die, and I can't restart it.
> >>>
> >>> Is there any way around this, or have I completely hosed my HBase
> >>> installation?
> >>>
> >>> On Jan 31, 2013, at 6:23 AM, Kevin O'dell <[email protected]> wrote:
> >>>
> >>>> I am going to disagree with ignoring the error. You will encounter
> >>>> failures when doing other operations such as imports/exports. The first
> >>>> thing I would do is, like JM said, focus on the region that is not in
> >>>> META (we at least want 0 inconsistencies).
> >>>> Can you please run hbck -repair
> >>>> and then run another -details and let us know if you are still seeing
> >>>> errors? After that, if you are still getting the NULL errors for
> >>>> hregion:info in META, can you please run
> >>>>
> >>>>   echo "scan '.META.'" | hbase shell > meta.out
> >>>>
> >>>> and attach the meta.out file? I would like to take a look at some of these.
> >>>>
> >>>> To be able to run the -repair we will want to use a different jar and
> >>>> some instructions:
> >>>>
> >>>> 1. Move the new uber jar onto the system:
> >>>>    hbase-0.90.4-cdh3u3-patch30+3.jar
> >>>> 2. Copy the hbase dir (/usr/lib/hbase) into a /tmp/hbase dir.
> >>>> 3. In the tmp copy, move the hbase jar (hbase-0.90.4-cdh3u2.jar) to a .old
> >>>>    and replace it with the uber-hbck jar (hbase-0.90.4-cdh3u3-patch30+3.jar).
> >>>> 4. Break the symlinks in that directory.
> >>>> 5. Add the value of fs.default.name from core-site.xml to
> >>>>    hbase-site.xml.
> >>>> 6. export HBASE_HOME=/tmp/hbase/ and run
> >>>>    ./bin/hbase hbck -details 2>&1 | tee details.out
> >>>> 7. Check details.out and make sure you are still seeing inconsistencies.
> >>>> 8. ./bin/hbase hbck -repair 2>&1 | tee repair.out
> >>>> 9. Run -details again and make sure we have 0 inconsistencies.
> >>>>
> >>>> https://www.dropbox.com/s/fxotosglrrl1tq2/hbase-0.90.4-cdh3u3-patch30%2B3.jar
> >>>> <--- new jar
> >>>>
> >>>> On Thu, Jan 31, 2013 at 6:48 AM, Jean-Marc Spaggiari
> >>>> <[email protected]> wrote:
> >>>>>
> >>>>> Hi Brandon,
> >>>>>
> >>>>> I faced the same issue with "HRegionInfo was null or empty" on January
> >>>>> 24th, and Ted replied:
> >>>>>
> >>>>> "Encountered problems when prefetch META table:
> >>>>>
> >>>>> You can ignore the warning."
> >>>>>
> >>>>> So I think you should focus on the last one: "not listed in META or
> >>>>> deployed on any region server".
> >>>>>
> >>>>> Have you tried hbck to see if it can fix it?
> >>>>>
> >>>>> JM
> >>>>>
> >>>>> 2013/1/31, Brandon Peskin <[email protected]>:
> >>>>>> hadoop 0.20.2-cdh3u2
> >>>>>> hbase 0.90.4-cdh3u2
> >>>>>>
> >>>>>> On January 8th I had a network event where I lost three region servers.
> >>>>>>
> >>>>>> When they came back I had unassigned regions / "regions not being
> >>>>>> served" errors, which I fixed with hbck -fix.
> >>>>>>
> >>>>>> Since then, however, I have been getting an increasing number of these
> >>>>>> when I have clients trying to write to specific tables:
> >>>>>>
> >>>>>> java.io.IOException: HRegionInfo was null or empty in Meta for
> >>>>>> algol_profile_training_record,
> >>>>>> row=algol_profile_training_record,clientcode:49128:abce6d9f-1ee2-434a-8a82-a151b7dc183f,99999999999999
> >>>>>>   at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:142)
> >>>>>>   ~[hbase-0.90.4-cdh3u2.jar:na]
> >>>>>>   at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:95)
> >>>>>>   ~[hbase-0.90.4-cdh3u2.jar:na]
> >>>>>>   at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.prefetchRegionCache(HConnectionManager.java:649)
> >>>>>>   [hbase-0.90.4-cdh3u2.jar:na]
> >>>>>>
> >>>>>> ...to the point now where I seemingly can't even write to that table.
> >>>>>>
> >>>>>> This also coincides with the following, seen in the HBase master log:
> >>>>>>
> >>>>>> 13/01/31 02:49:55 WARN master.CatalogJanitor: REGIONINFO_QUALIFIER is empty in
> >>>>>> keyvalues={algol_tmp_client_39691_20121101053400_6,,1351783324210.b703b203d6189cad224764853cbd88f7./info:server/1351785275123/Put/vlen=32,
> >>>>>> algol_tmp_clientcode_39691_20121101053400_6,,1351783324210.b703b203d6189cad224764853cbd88f7./info:serverstartcode/1351785275123/Put/vlen=8}
> >>>>>> 13/01/31 02:49:55 WARN master.CatalogJanitor: REGIONINFO_QUALIFIER is empty in
> >>>>>> keyvalues={algol_tmp_client_39691_20121101053400_6,\x7F\xD6\xA5Pm\xF2E\x9D\x81\xB1b\xD1'\xD3\xE5\xA4,1351783324210.507bdf342aaae45644d597c272c52e02./info:server/1351785275128/Put/vlen=32,
> >>>>>> algol_tmp_clientcode_39691_20121101053400_6,\x7F\xD6\xA5Pm\xF2E\x9D\x81\xB1b\xD1'\xD3\xE5\xA4,1351783324210.507bdf342aaae45644d597c272c52e02./info:serverstartcode/1351785275128/Put/vlen=8}
> >>>>>> 13/01/31 02:49:55 WARN master.CatalogJanitor: REGIONINFO_QUALIFIER is empty in
> >>>>>> keyvalues={algol_tmp_client_34785_20121004102759_0,,1349383219155.a45774fa4c8af765df87d3373d30cfdc./info:server/1349461385375/Put/vlen=32,
> >>>>>> algol_tmp_overstock_34785_20121004102759_0,,1349383219155.a45774fa4c8af765df87d3373d30cfdc./info:serverstartcode/1349461385375/Put/vlen=8}
> >>>>>>
> >>>>>> ...and there are 58 of these I cannot fix (presumably I could with the
> >>>>>> hbck provided with CDH4):
> >>>>>>
> >>>>>> ERROR: Region
> >>>>>> hdfs://namenode:9000/hbase/algol_profile_training_record/fdaa1024d1b725b4997d2283640f0fa4
> >>>>>> on HDFS, but not listed in META or deployed on any region server
> >>>>>>
> >>>>>> I'm absolutely stumped.
> >>>>>> I've done some poking around and I can't find any sort of data
> >>>>>> surrounding this issue, with the exception of similar symptoms in an
> >>>>>> exchange on this list in March of this year (though the inquirer had
> >>>>>> different questions). Any help would be appreciated, though I suspect
> >>>>>> I will be told 'Upgrade to CDH4' or 'Drop and re-create the table'.
> >>>>>>
> >>>>>> Thanks in advance.
> >>>>>>
> >>>>>> --
> >>>>>> Brandon Peskin
> >>>>>> Senior Systems Administrator
> >>>>>> Adobe Systems
> >>>>>> [email protected]
> >>>>
> >>>> --
> >>>> Kevin O'Dell
> >>>> Customer Operations Engineer, Cloudera
> >
> > --
> > Kevin O'Dell
> > Customer Operations Engineer, Cloudera
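Steps 1-4 of the staging procedure Kevin describes amount to copying the live install aside and swapping in the patched jar. A minimal sketch of those steps as a shell function — `stage_hbck` is a hypothetical helper, not part of any HBase tooling, and the paths and jar names in the comments are the ones from the thread:

```shell
# Stage a scratch copy of an HBase install with a patched uber-hbck jar.
# Usage: stage_hbck <src_dir> <stage_dir> <stock_jar> <patched_jar>
# e.g.   stage_hbck /usr/lib/hbase /tmp/hbase \
#          hbase-0.90.4-cdh3u2.jar hbase-0.90.4-cdh3u3-patch30+3.jar
stage_hbck() {
  src=$1    # live HBase install (left untouched)
  stage=$2  # scratch copy we will run hbck from
  old_jar=$3
  new_jar=$4  # the patched uber-hbck jar, assumed present in the cwd

  # Steps 2 + 4: copy the install; -L dereferences (breaks) the symlinks.
  cp -RL "$src" "$stage"

  # Step 3: set the stock jar aside and drop in the patched one.
  mv "$stage/$old_jar" "$stage/$old_jar.old"
  cp "$new_jar" "$stage/"
}

# Steps 5-9 are then run by hand:
#   add the fs.default.name value from core-site.xml to hbase-site.xml
#   export HBASE_HOME=/tmp/hbase/
#   ./bin/hbase hbck -details 2>&1 | tee details.out   # confirm the inconsistencies
#   ./bin/hbase hbck -repair  2>&1 | tee repair.out
#   ./bin/hbase hbck -details 2>&1 | tee details2.out  # expect 0 inconsistencies
```

Running hbck out of the staged copy keeps the live /usr/lib/hbase untouched, so the stock jar can be restored by simply deleting /tmp/hbase.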

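Since Kevin asks for a meta.out dump (echo "scan '.META.'" | hbase shell > meta.out), a small triage script can flag the .META. rows that have info:server cells but no info:regioninfo — the rows behind the CatalogJanitor warnings — before any deleteall. This is a hypothetical sketch; the line format it parses ("row column=fam:qual, timestamp=..., value=...") is an assumption about the 0.90-era shell output, so verify it against your actual file:

```python
import re
import sys
from collections import defaultdict

# Assumed 0.90-era shell scan line: " <row> column=<fam:qual>, timestamp=..., value=..."
LINE = re.compile(r"^\s*(\S+)\s+column=(\S+?),")

def rows_missing_regioninfo(scan_output):
    """Return .META. row keys that have cells but lack info:regioninfo."""
    cols = defaultdict(set)
    for line in scan_output.splitlines():
        m = LINE.match(line)
        if m:
            row, col = m.group(1), m.group(2)
            cols[row].add(col)
    return sorted(r for r, c in cols.items() if "info:regioninfo" not in c)

if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as f:
        for row in rows_missing_regioninfo(f.read()):
            print(row)
```

The resulting row keys are candidates for deleteall in the shell, which avoids fat-fingering a key by hand — the mistake that triggered HBASE-5837 here.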