Joost and I fixed this on the IRC channel today using Stack's tool from
HBASE-1867. The cause was a file going missing in the META table, and we
are still investigating why it happened.

So Joost, could you send us your NN's log so we can grep for the file names?

Thx,

J-D

On Thu, Nov 5, 2009 at 11:08 AM, Joost Ouwerkerk <[email protected]> wrote:
> Is there a way to rebuild the META?  I'm really hoping there's no data loss
> here, and it's just a question of META being out of sync with data...
> jo
>
> On Wed, Nov 4, 2009 at 7:07 PM, Joost Ouwerkerk <[email protected]> wrote:
>
>> I investigated following your guidance, Stack.  Unfortunately I am not
>> seeing evidence of double assignment. It looks more like a case of missing
>> assignment.  There appear to be key ranges that are not represented in the
>> .META. table.  So, I have a region that handles keys AAA to BBB, and the
>> next region handles DDD to EEE.  Now when I try to access key CCC, I get
>> routed to the region that handles AAA to BBB, presumably because my key is
>> after AAA and before DDD.  Then HRegion.checkRow fails because the requested
>> key is outside the region's range.
>>
>> Consider this error:
>>
>> org.apache.hadoop.hbase.regionserver.WrongRegionException:
>> org.apache.hadoop.hbase.regionserver.WrongRegionException: Requested row out of range for HRegion
>> crawled_pages,r:http:\x2F\x2Fcom.xxx.yyy\x2Frestaurants\x2Fbasil-in-the-grove,
>> startKey='r:http:\x2F\x2Fcom.xxx.yyy\x2Frestaurants\x2Fbasil-in-the-grove',
>> getEndKey()='r:http:\x2F\x2Fcom.xxx.yyy\x2Frestaurants\x2Feast-broward',
>> row='r:http:\x2F\x2Fcom.xxx.yyy\x2Frestaurants\x2Fhavana-hideout'
>>
>> As the error points out, the requested row is outside the range for the
>> region.  In the .META. table, the next region starts at
>> 'r:http:\x2F\x2Fcom.xxx.yyy\x2Frestaurants\x2Fpashas-3'.  The requested row
>> falls after one region's end key and before the next region's start key.
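
The hole described above can be illustrated with a rough sketch. This is not HBase code: the function names find_gaps and locate are made up for illustration, the keys are shortened versions of the ones in the error, the end key 'zzz' for the last region is invented, and the lookup rule (pick the region with the greatest start key not after the row) is a simplifying assumption about how a row gets routed. It also ignores the empty start/end keys that the first and last regions of a real table carry.

```python
from bisect import bisect_right


def find_gaps(regions):
    """Return (end_key, next_start_key) pairs where consecutive regions leave a hole.

    `regions` is a list of (start_key, end_key) tuples, end key exclusive.
    """
    ordered = sorted(regions)
    gaps = []
    for (_, end), (next_start, _) in zip(ordered, ordered[1:]):
        if end < next_start:  # nothing covers [end, next_start)
            gaps.append((end, next_start))
    return gaps


def locate(regions, row):
    """Route `row` to the region with the greatest start key <= row.

    Returns (region, in_range). in_range is False when the chosen region
    does not actually contain the row, which mirrors the failing range check.
    """
    ordered = sorted(regions)
    starts = [start for start, _ in ordered]
    i = bisect_right(starts, row) - 1
    if i < 0:
        return None, False
    start, end = ordered[i]
    return ordered[i], start <= row < end


# Shortened, illustrative keys; 'zzz' is a hypothetical end key.
regions = [
    ("basil-in-the-grove", "east-broward"),
    ("pashas-3", "zzz"),
]

print(find_gaps(regions))
print(locate(regions, "havana-hideout"))
```

With these inputs the row 'havana-hideout' is routed to the basil-in-the-grove region even though that region ends at 'east-broward', which is the same shape as the WrongRegionException above.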
>>
>> jo
>>
>>
>> On Wed, Nov 4, 2009 at 4:56 PM, stack <[email protected]> wrote:
>>
>>> Meta is giving out the wrong address for a region?  Do a scan of .META.  It
>>> might be easier dumping the scan into a file so you can grep around:
>>>
>>> echo "scan '.META.'" | ./bin/hbase shell --format-width=300 &> /tmp/meta.txt
>>>
>>> Grep in here for the region that contains the row you are looking for.  What
>>> does it have for info:server?  Go to that regionserver (UI or log).  Is it
>>> carrying the region?  If not, that's what the WRE is about.
>>>
>>> For the same region, grep its name in the master log (hopefully you have DEBUG
>>> enabled).
>>>
>>> What's its history?  Could it have been assigned to one server and then
>>> another?
>>>
>>> If so, close the region in both places.  Type 'tools' in the shell to see
>>> the doc on the "close_region" command.  You can pass it the server to send the
>>> close message to.  Close in both places.
>>>
>>> If it's a double-assignment issue, our name for the above phenomenon, I suggest
>>> you upgrade to 0.20.1.  It has at least one pointed fix for this scenario
>>> (HBASE-1878).
>>>
>>> St.Ack
>>>
>>>
>>> On Wed, Nov 4, 2009 at 12:35 PM, Joost Ouwerkerk <[email protected]> wrote:
>>>
>>> > HBase has started throwing WrongRegionExceptions at me when trying to access
>>> > certain regions.  I'm guessing that the META table has somehow gone out of
>>> > sync with reality.  I've tried compacting and I've tried restarting, but the
>>> > problem does not go away.  The errors are always on the same regions.  Has
>>> > anyone else seen this and succeeded at getting their table back into working
>>> > order?
>>> >
>>> > *Example get:*
>>> >
>>> > org.apache.hadoop.hbase.regionserver.WrongRegionException:
>>> > org.apache.hadoop.hbase.regionserver.WrongRegionException: Requested row out of range for HRegion
>>> > crawled_pages,r:http:\x2F\x2Fcom.xxxx.yyyy\x2Frestaurants\x2Fall-areas\x2Fbeverly-hills\x2Fall-cuisines\x2Ftags\x2Flunch\x2F2\x2F,1256932686084,
>>> > startKey='r:http:\x2F\x2Fcom.xxxx.yyyy\x2Frestaurants\x2Fall-areas\x2Fbeverly-hills\x2Fall-cuisines\x2Ftags\x2Flunch\x2F2\x2F',
>>> > getEndKey()='r:http:\x2F\x2Fcom.xxxx.yyyy\x2Frestaurants\x2Fall-areas\x2Fhermosa-beach\x2Fall-cuisines\x2Ftags\x2Foutdoor-dining\x2F',
>>> > row='r:http:\x2F\x2Fcom.xxxx.yyyy\x2Frestaurants\x2Fall-areas\x2Finglewood\x2Fall-cuisines\x2F'
>>> >    at org.apache.hadoop.hbase.regionserver.HRegion.checkRow(HRegion.java:1522)
>>> >    at org.apache.hadoop.hbase.regionserver.HRegion.obtainRowLock(HRegion.java:1554)
>>> >    at org.apache.hadoop.hbase.regionserver.HRegion.getLock(HRegion.java:1622)
>>> >    at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:2278)
>>> >    at org.apache.hadoop.hbase.regionserver.HRegionServer.get(HRegionServer.java:1785)
>>> >    at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>>> >    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>> >    at java.lang.reflect.Method.invoke(Method.java:597)
>>> >    at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:648)
>>> >    at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
>>> >
>>> > *Example put:*
>>> >
>>> > put 'crawled_pages','r:http://com.xxxx.yyyy/restaurants/all-areas/inglewood/all-cuisines/', 'curi:test','test'
>>> > NativeException: org.apache.hadoop.hbase.client.RetriesExhaustedException:
>>> > Trying to contact region server Some server, retryOnlyOne=true, index=0,
>>> > islastrow=true, tries=4, numtries=5, i=0, listsize=1,
>>> > region=crawled_pages,r:http:\x2F\x2Fcom.xxxx.yyyy\x2Frestaurants\x2Fall-areas\x2Fbeverly-hills\x2Fall-cuisines\x2Ftags\x2Flunch\x2F2\x2F,1256932686084
>>> > for region
>>> > crawled_pages,r:http:\x2F\x2Fcom.xxxx.yyyy\x2Frestaurants\x2Fall-areas\x2Fbeverly-hills\x2Fall-cuisines\x2Ftags\x2Flunch\x2F2\x2F,1256932686084,
>>> > row
>>> > 'r:http:\x2F\x2Fcom.xxxx.yyyy\x2Frestaurants\x2Fall-areas\x2Finglewood\x2Fall-cuisines\x2F',
>>> > but failed after 5 attempts.
>>> > Exceptions:
>>> >
>>> >    from org/apache/hadoop/hbase/client/HConnectionManager.java:1119:in `process'
>>> >    from org/apache/hadoop/hbase/client/HConnectionManager.java:1200:in `processBatchOfRows'
>>> >    from org/apache/hadoop/hbase/client/HTable.java:605:in `flushCommits'
>>> >    from org/apache/hadoop/hbase/client/HTable.java:470:in `put'
>>> >    from org/apache/hadoop/hbase/client/HTable.java:1761:in `commit'
>>> >    from org/apache/hadoop/hbase/client/HTable.java:1742:in `commit'
>>> >    from sun/reflect/NativeMethodAccessorImpl.java:-2:in `invoke0'
>>> >    from sun/reflect/NativeMethodAccessorImpl.java:39:in `invoke'
>>> >    from sun/reflect/DelegatingMethodAccessorImpl.java:25:in `invoke'
>>> >    from java/lang/reflect/Method.java:597:in `invoke'
>>> >    from org/jruby/javasupport/JavaMethod.java:298:in `invokeWithExceptionHandling'
>>> >    from org/jruby/javasupport/JavaMethod.java:259:in `invoke'
>>> >    from org/jruby/java/invokers/InstanceMethodInvoker.java:44:in `call'
>>> >    from org/jruby/runtime/callsite/CachingCallSite.java:110:in `call'
>>> >    from org/jruby/ast/CallOneArgNode.java:57:in `interpret'
>>> >    from org/jruby/ast/NewlineNode.java:104:in `interpret'
>>> >
>>>
>>
>>
>
