hadoop 0.20.2-cdh3u2
hbase 0.90.4-cdh3u2
On January 8th I had a network event where I lost three region servers.
When they came back, I had errors about unassigned regions and regions not
being served, which I fixed with hbck -fix.
Since then, however, I have been getting an increasing number of these when
clients try to write to specific tables:
java.io.IOException: HRegionInfo was null or empty in Meta for algol_profile_training_record, row=algol_profile_training_record,clientcode:49128:abce6d9f-1ee2-434a-8a82-a151b7dc183f,99999999999999
    at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:142) ~[hbase-0.90.4-cdh3u2.jar:na]
    at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:95) ~[hbase-0.90.4-cdh3u2.jar:na]
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.prefetchRegionCache(HConnectionManager.java:649) [hbase-0.90.4-cdh3u2.jar:na]
...to the point where I now seemingly can't write to that table at all.
This also coincides with the following, seen in the HBase master log:
13/01/31 02:49:55 WARN master.CatalogJanitor: REGIONINFO_QUALIFIER is empty in
keyvalues={algol_tmp_client_39691_20121101053400_6,,1351783324210.b703b203d6189cad224764853cbd88f7./info:server/1351785275123/Put/vlen=32,
algol_tmp_clientcode_39691_20121101053400_6,,1351783324210.b703b203d6189cad224764853cbd88f7./info:serverstartcode/1351785275123/Put/vlen=8}
13/01/31 02:49:55 WARN master.CatalogJanitor: REGIONINFO_QUALIFIER is empty in
keyvalues={algol_tmp_client_39691_20121101053400_6,\x7F\xD6\xA5Pm\xF2E\x9D\x81\xB1b\xD1'\xD3\xE5\xA4,1351783324210.507bdf342aaae45644d597c272c52e02./info:server/1351785275128/Put/vlen=32,
algol_tmp_clientcode_39691_20121101053400_6,\x7F\xD6\xA5Pm\xF2E\x9D\x81\xB1b\xD1'\xD3\xE5\xA4,1351783324210.507bdf342aaae45644d597c272c52e02./info:serverstartcode/1351785275128/Put/vlen=8}
13/01/31 02:49:55 WARN master.CatalogJanitor: REGIONINFO_QUALIFIER is empty in
keyvalues={algol_tmp_client_34785_20121004102759_0,,1349383219155.a45774fa4c8af765df87d3373d30cfdc./info:server/1349461385375/Put/vlen=32,
algol_tmp_overstock_34785_20121004102759_0,,1349383219155.a45774fa4c8af765df87d3373d30cfdc./info:serverstartcode/1349461385375/Put/vlen=8}
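
To see which tables and regions the warnings above touch, the master log can be summarized with a short script. This is only a minimal sketch: it assumes each warning's first KeyValue starts with a .META. row key of the form <table>,<start key>,<timestamp>.<encoded name>. as in the lines quoted above, and the function names affected_rows and count_by_table are hypothetical helpers, not part of HBase.

```python
import re
from collections import Counter

# Captures the .META. row key of the first KeyValue in each CatalogJanitor
# warning. Assumed layout: <table>,<start key>,<timestamp>.<encoded name>.
# \s+ tolerates the warning being wrapped onto a second line, as in email.
WARNING_RE = re.compile(
    r"REGIONINFO_QUALIFIER is empty in\s+keyvalues=\{"
    r"(?P<row>(?P<table>[^,]+),.*?,(?P<ts>\d+)\.(?P<encoded>[0-9a-f]{32})\.)/info:"
)


def affected_rows(log_text):
    """Return (table, encoded region name, full row key) for each warning."""
    return [
        (m.group("table"), m.group("encoded"), m.group("row"))
        for m in WARNING_RE.finditer(log_text)
    ]


def count_by_table(log_text):
    """Count the warnings per table name."""
    return Counter(table for table, _, _ in affected_rows(log_text))
```

Feeding the full text of the master log to count_by_table would show whether the empty-regioninfo rows are confined to the temporary tables or also include the tables that clients fail to write to.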
...and there are 58 of these that I cannot fix (presumably I could with the
hbck provided with CDH4):
ERROR: Region hdfs://namenode:9000/hbase/algol_profile_training_record/fdaa1024d1b725b4997d2283640f0fa4 on HDFS, but not listed in META or deployed on any region server
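
For triage, errors of this kind can be grouped by table so it is clear how many orphaned region directories each table has. The following is a rough sketch, assuming the saved tool output contains errors worded exactly like the one above and that the last two path components are the table name and the encoded region name. ORPHAN_RE and orphan_regions are hypothetical names introduced here.

```python
import re
from collections import defaultdict

# Matches "ERROR: Region <path> on HDFS, but not listed in META ...".
# \s+ also accepts the line breaks that email wrapping introduces.
ORPHAN_RE = re.compile(
    r"ERROR: Region\s+(?P<path>\S+)\s+on HDFS, but not listed in META"
)


def orphan_regions(output_text):
    """Map table name -> encoded names of regions reported only on HDFS."""
    by_table = defaultdict(list)
    for m in ORPHAN_RE.finditer(output_text):
        # Assumed path layout: .../<table>/<encoded region name>
        table, encoded = m.group("path").rstrip("/").split("/")[-2:]
        by_table[table].append(encoded)
    return dict(by_table)
```

Summing the list lengths in the returned dictionary should reproduce the total number of such errors, and the per-table breakdown shows whether they are concentrated in the tables that reject writes.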
I'm absolutely stumped. I've done some poking around and can't find any
information on this issue, apart from similar symptoms described in an
exchange on this list in March of this year (though the inquirer had different
questions). Any help would be appreciated, though I suspect I will be told
'Upgrade to CDH4' or 'Drop and re-create the table'.
Thanks in advance.
--
Brandon Peskin
Senior Systems Administrator
Adobe Systems