[
https://issues.apache.org/jira/browse/HBASE-2121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Benoit Sigoure updated HBASE-2121:
----------------------------------
Summary: HBase client doesn't retry the right number of times when a region
is unavailable (was: HBase client doesn't retry the right number of times when
a region server is unavailable)
Actually, this issue doesn't require a region *server* to be unavailable, just
a region itself.
> HBase client doesn't retry the right number of times when a region is
> unavailable
> ---------------------------------------------------------------------------------
>
> Key: HBASE-2121
> URL: https://issues.apache.org/jira/browse/HBASE-2121
> Project: Hadoop HBase
> Issue Type: Bug
> Components: client
> Affects Versions: 0.20.2, 0.21.0
> Reporter: Benoit Sigoure
>
> org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionServerWithRetries
> retries 10 times (by default). It ends up calling
> HConnectionManager$TableServers.locateRegionInMeta, which retries 10 times on
> its own. So the HBase client is effectively retrying 100 times before giving
> up, instead of 10 (10 is the default hbase.client.retries.number).
> I'm using hbase trunk HEAD. I verified this bug is also in 0.20.2.
> Sample call stack:
> org.apache.hadoop.hbase.client.RegionOfflineException: region offline:
> mytable,,1263421423787
> at
> org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegionInMeta(HConnectionManager.java:709)
> at
> org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:640)
> at
> org.apache.hadoop.hbase.client.HConnectionManager$TableServers.relocateRegion(HConnectionManager.java:609)
> at
> org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionLocation(HConnectionManager.java:430)
> at
> org.apache.hadoop.hbase.client.ServerCallable.instantiateServer(ServerCallable.java:57)
> at
> org.apache.hadoop.hbase.client.ScannerCallable.instantiateServer(ScannerCallable.java:62)
> at
> org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionServerWithRetries(HConnectionManager.java:1047)
> at
> org.apache.hadoop.hbase.client.HTable$ClientScanner.nextScanner(HTable.java:836)
> at
> org.apache.hadoop.hbase.client.HTable$ClientScanner.initialize(HTable.java:756)
> at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:354)
> at <my application>
> How to reproduce:
> with a trivial HBase client (mine was just trying to scan the table), start
> the client, take offline the table the client uses, tell the client to start
> the scan. The client will not give up after 10 attempts, unlike what it's
> supposed to do.
> If locateRegionInMeta is only ever called from getRegionServerWithRetries,
> then the fix is trivial: just remove the retry logic in there. If it has
> some other callers who possibly relied on the retry logic in
> locateRegionInMeta, then the fix is going to be a bit more involved.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.