[ 
https://issues.apache.org/jira/browse/HBASE-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194357#comment-13194357
 ] 

Lars Hofhansl commented on HBASE-5153:
--------------------------------------

So here's the problem. This is hanging while validating that HBase is not 
running via HBaseAdmin.checkHBaseAvailable, which just attempts to create a new 
HBaseAdmin after it sets hbase.client.retries.number to 1. However, 
HConnectionImplementation caches hbase.client.retries.number in numRetries, and 
hence if ZK is not running, resetZooKeeperTrackersWithRetries will retry for a 
while.
The simplest fix would be for resetZooKeeperTrackersWithRetries to ignore the 
cached setting and to retrieve the value again from the configuration. While I 
am at it, I'll also add another option to configure a different number of 
retries here.
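
To illustrate the caching problem described above, here is a minimal, 
self-contained sketch. It does not use the HBase classes: Conf and 
CachingConnection are hypothetical stand-ins, and the default of 10 retries is 
an assumption made for the example. It only shows why a value copied into a 
field at construction time does not see a later change to the setting, while 
re-reading the setting does.

```java
import java.util.HashMap;
import java.util.Map;

public class RetrySettingSketch {
    static final String RETRIES_KEY = "hbase.client.retries.number";
    static final int ASSUMED_DEFAULT = 10; // assumption for this sketch only

    /** Hypothetical stand-in for a mutable client configuration. */
    static class Conf {
        private final Map<String, Integer> values = new HashMap<>();

        void setInt(String key, int value) {
            values.put(key, value);
        }

        int getInt(String key, int defaultValue) {
            return values.getOrDefault(key, defaultValue);
        }
    }

    /** Hypothetical connection that copies the retry count once, at construction. */
    static class CachingConnection {
        private final Conf conf;
        private final int numRetries; // cached copy, never refreshed

        CachingConnection(Conf conf) {
            this.conf = conf;
            this.numRetries = conf.getInt(RETRIES_KEY, ASSUMED_DEFAULT);
        }

        /** What a retry loop sees if it trusts the cached field. */
        int retriesFromCache() {
            return numRetries;
        }

        /** What a retry loop sees if it re-reads the setting each time. */
        int retriesFromConf() {
            return conf.getInt(RETRIES_KEY, ASSUMED_DEFAULT);
        }
    }

    public static void main(String[] args) {
        Conf conf = new Conf();
        conf.setInt(RETRIES_KEY, 10);
        CachingConnection conn = new CachingConnection(conf);

        // A later caller lowers the setting, expecting a quick failure.
        conf.setInt(RETRIES_KEY, 1);

        System.out.println("cached:  " + conn.retriesFromCache());
        System.out.println("re-read: " + conn.retriesFromConf());
    }
}
```

In this sketch the cached field still holds the value read at construction, 
which mirrors the behavior described above; whether the real fix re-reads the 
setting on every call or only inside the retry method is up to the patch.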
                
> Add retry logic in HConnectionImplementation#resetZooKeeperTrackers
> -------------------------------------------------------------------
>
>                 Key: HBASE-5153
>                 URL: https://issues.apache.org/jira/browse/HBASE-5153
>             Project: HBase
>          Issue Type: Bug
>          Components: client
>    Affects Versions: 0.90.4
>            Reporter: Jieshan Bean
>            Assignee: Jieshan Bean
>             Fix For: 0.94.0, 0.90.6, 0.92.1
>
>         Attachments: 5153-92.txt, 5153-trunk.txt, 5153-trunk.txt, 
> HBASE-5153-V2.patch, HBASE-5153-V3.patch, HBASE-5153-V4-90.patch, 
> HBASE-5153-V5-90.patch, HBASE-5153-V6-90-minorchange.patch, 
> HBASE-5153-V6-90.txt, HBASE-5153-trunk-v2.patch, HBASE-5153-trunk.patch, 
> HBASE-5153.patch, TestResults-hbase5153.out
>
>
> HBASE-4893 is related to this issue. In that issue, we learned that if 
> multiple threads share the same connection, once this connection is aborted 
> in one thread, the other threads will get a 
> "HConnectionManager$HConnectionImplementation@18fb1f7 closed" exception.
> That fix solves the problem of "stale connection can't be removed". But the 
> original HTable instance can no longer be used. The connection in HTable 
> should be recreated.
> Actually, there are two approaches to solve this:
> 1. In user code, once an IOE is caught, close the connection and re-create 
> the HTable instance. We can use this as a workaround.
> 2. On the HBase client side, catch this exception and re-create the connection.
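
The first approach in the description above (close and re-create in user code 
after an IOException) could be sketched roughly as follows. This is not HBase 
code: Table, TableFactory and RecreatingTable are hypothetical names introduced 
for illustration, and the sketch retries only once after re-creating the 
handle.

```java
import java.io.Closeable;
import java.io.IOException;

public class RecreateOnFailureSketch {
    /** Hypothetical stand-in for a table handle backed by a shared connection. */
    interface Table extends Closeable {
        String get(String row) throws IOException;
    }

    /** Hypothetical factory that builds a fresh handle (and connection). */
    interface TableFactory {
        Table create() throws IOException;
    }

    /** Wrapper that re-creates the handle once when a call fails. */
    static class RecreatingTable {
        private final TableFactory factory;
        private Table table;

        RecreatingTable(TableFactory factory) throws IOException {
            this.factory = factory;
            this.table = factory.create();
        }

        String get(String row) throws IOException {
            try {
                return table.get(row);
            } catch (IOException first) {
                // Discard the broken handle; ignore errors while closing it.
                try {
                    table.close();
                } catch (IOException ignored) {
                    // nothing useful to do here in this sketch
                }
                // Build a new handle and retry exactly once.
                table = factory.create();
                return table.get(row);
            }
        }
    }
}
```

A real client would probably want to distinguish a closed connection from 
other I/O failures before re-creating anything; the sketch treats every 
IOException the same way for brevity.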

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
