[ 
https://issues.apache.org/jira/browse/HBASE-2121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12931769#action_12931769
 ] 

Benoit Sigoure commented on HBASE-2121:
---------------------------------------

Hey Gary, if you have a multi-threaded HBase app, I recommend you take a look 
at asynchbase (https://github.com/stumbleupon/asynchbase).  It's an alternative 
HBase client that was designed to be thread-safe and non-blocking from the 
ground up.

> HBase client doesn't retry the right number of times when a region is 
> unavailable
> ---------------------------------------------------------------------------------
>
>                 Key: HBASE-2121
>                 URL: https://issues.apache.org/jira/browse/HBASE-2121
>             Project: HBase
>          Issue Type: Bug
>          Components: client
>    Affects Versions: 0.20.2, 0.90.0
>            Reporter: Benoit Sigoure
>
> org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionServerWithRetries
>  retries 10 times (by default).   It ends up calling 
> HConnectionManager$TableServers.locateRegionInMeta, which retries 10 times on 
> its own.  So the HBase client is effectively retrying 100 times before giving 
> up, instead of 10 (10 is the default hbase.client.retries.number).
> I'm using hbase trunk HEAD.  I verified this bug is also in 0.20.2.
> Sample call stack:
>  org.apache.hadoop.hbase.client.RegionOfflineException: region offline: mytable,,1263421423787
>       at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegionInMeta(HConnectionManager.java:709)
>       at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:640)
>       at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.relocateRegion(HConnectionManager.java:609)
>       at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionLocation(HConnectionManager.java:430)
>       at org.apache.hadoop.hbase.client.ServerCallable.instantiateServer(ServerCallable.java:57)
>       at org.apache.hadoop.hbase.client.ScannerCallable.instantiateServer(ScannerCallable.java:62)
>       at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionServerWithRetries(HConnectionManager.java:1047)
>       at org.apache.hadoop.hbase.client.HTable$ClientScanner.nextScanner(HTable.java:836)
>       at org.apache.hadoop.hbase.client.HTable$ClientScanner.initialize(HTable.java:756)
>       at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:354)
>       at <my application>
> How to reproduce:
> With a trivial HBase client (mine was just trying to scan the table), start 
> the client, take the table the client uses offline, then tell the client to 
> start the scan.  The client will not give up after 10 attempts, as it is 
> supposed to.
> If locateRegionInMeta is only ever called from getRegionServerWithRetries, 
> then the fix is trivial: just remove the retry logic in there.  If it has 
> some other callers who possibly relied on the retry logic in 
> locateRegionInMeta, then the fix is going to be a bit more involved.
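
The multiplication described in the report can be reproduced with a minimal, standalone sketch. It does not use any HBase classes: NestedRetryDemo, locateWithRetries and callWithRetries are hypothetical names standing in for the nested retry loops in getRegionServerWithRetries and locateRegionInMeta, and the lookup is assumed to fail every time (as with an offline table).

```java
import java.util.concurrent.atomic.AtomicInteger;

class NestedRetryDemo {

    /**
     * Hypothetical stand-in for the inner lookup: it tries innerRetries
     * times, fails every time, then gives up by throwing.
     */
    static void locateWithRetries(int innerRetries, AtomicInteger attempts) {
        for (int i = 0; i < innerRetries; i++) {
            attempts.incrementAndGet(); // one failed lookup attempt
        }
        throw new IllegalStateException("region offline");
    }

    /**
     * Hypothetical stand-in for the outer caller: it retries the whole
     * inner lookup outerRetries times and returns the total attempt count.
     */
    static int callWithRetries(int outerRetries, int innerRetries) {
        AtomicInteger attempts = new AtomicInteger();
        for (int i = 0; i < outerRetries; i++) {
            try {
                locateWithRetries(innerRetries, attempts);
            } catch (IllegalStateException e) {
                // Inner loop exhausted its retries; the outer loop tries again.
            }
        }
        return attempts.get();
    }

    public static void main(String[] args) {
        System.out.println(callWithRetries(10, 10)); // nested retries: 100
        System.out.println(callWithRetries(10, 1));  // inner retry removed: 10
    }
}
```

With both loops set to 10, the sketch makes 100 attempts before giving up. Reducing the inner loop to a single attempt, which mirrors the "remove the retry logic in there" fix proposed above, brings the total back to 10.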

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
