[
https://issues.apache.org/jira/browse/HBASE-4087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065278#comment-13065278
]
M. C. Srivas commented on HBASE-4087:
-------------------------------------
@Ted:
{quote}
First, waiting for reference count to reach zero wasn't part of the original
semantics of obtaining connection. Suppose there were two clients A and B.
Client A is waiting in the new while loop for reference count to reach zero.
What if client B had a bug and crashed ?
{quote}
If many threads share a connection, the reference count protects the connection
from being destroyed until the last reference is dropped. We need to preserve
that guarantee. With the proposed change, if any thread calls
HConnectionManager#deleteStaleConnection, the connection gets destroyed from
underneath the other threads. The sequence of steps that causes this is:
1. Threads T1 and T2 both get connections using the same conf, and so are
returned the same connection.
2. T1 gets a failure, so it deletes the stale connection and creates a new one
(thus HBASE_INSTANCES.get() will return the new good connection, since the
HConnectionKey is identical).
3. T2 still has a handle on the old one, and gets a failure, and calls
deleteStaleConnection.
4. deleteStaleConnection destroys the good connection created in step 2, from
underneath T1.
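The four steps above can be replayed in a single thread with a toy cache. This
is only a sketch: Conn, Registry and deleteStaleByKey are hypothetical
stand-ins invented for illustration, not the HConnectionManager code. The only
assumption carried over is that deletion looks the connection up by key rather
than by the instance the caller holds.

```java
import java.util.HashMap;
import java.util.Map;

public class StaleConnectionRace {
    /** Hypothetical stand-in for a shared connection; not the HBase class. */
    static final class Conn {
        final int id;
        boolean closed = false;
        Conn(int id) { this.id = id; }
    }

    /** Toy cache keyed by configuration, loosely modelled on HBASE_INSTANCES. */
    static final class Registry {
        private final Map<String, Conn> instances = new HashMap<>();
        private int nextId = 1;

        synchronized Conn getConnection(String key) {
            return instances.computeIfAbsent(key, k -> new Conn(nextId++));
        }

        /** Naive delete: closes whatever is currently cached under the key. */
        synchronized void deleteStaleByKey(String key) {
            Conn c = instances.remove(key);
            if (c != null) {
                c.closed = true;
            }
        }
    }

    /** Replays steps 1-4; returns true if T1's new connection got closed. */
    static boolean replay() {
        Registry registry = new Registry();

        // Step 1: same conf, so T1 and T2 share one connection.
        Conn t1Old = registry.getConnection("conf");
        Conn t2Old = registry.getConnection("conf");
        if (t1Old != t2Old) {
            throw new IllegalStateException("expected a shared connection");
        }

        // Step 2: T1 sees a failure, deletes the stale one, gets a new one.
        registry.deleteStaleByKey("conf");
        Conn t1New = registry.getConnection("conf");

        // Step 3: T2 still holds the old handle, fails, and deletes by key.
        registry.deleteStaleByKey("conf");

        // Step 4: the key now maps to T1's good connection, so that one dies.
        return t1New.closed;
    }

    public static void main(String[] args) {
        System.out.println("good connection destroyed: " + replay());
    }
}
```

Running main prints "good connection destroyed: true", because the second
delete finds T1's replacement under the identical key.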
So either we wait for the reference count to drop to zero, or, if you'd rather
not wait, we mark the connection with some sort of state flag
(connection#state == INVALID) so that it is never handed out again. I think the
second choice (not waiting) is better.
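A minimal sketch of the non-waiting option follows. Everything in it is a
hypothetical illustration, not the actual patch: the State enum, Conn,
Registry, markStale and release are names assumed for the example. The idea is
that a stale connection is flagged INVALID and removed from the cache only if
the cached entry is that same instance, while the actual close is deferred
until the last reference is released.

```java
import java.util.HashMap;
import java.util.Map;

public class InvalidStateSketch {
    enum State { VALID, INVALID }

    /** Hypothetical connection carrying a state flag and a reference count. */
    static final class Conn {
        State state = State.VALID;
        int refCount = 0;
        boolean closed = false;
    }

    /** Toy cache; all methods are synchronized to keep the sketch simple. */
    static final class Registry {
        private final Map<String, Conn> instances = new HashMap<>();

        /** Returns a VALID connection for the key and takes a reference. */
        synchronized Conn acquire(String key) {
            Conn c = instances.get(key);
            if (c == null || c.state == State.INVALID) {
                c = new Conn();
                instances.put(key, c);
            }
            c.refCount++;
            return c;
        }

        /** Flags the caller's instance; never touches a newer cached one. */
        synchronized void markStale(String key, Conn stale) {
            stale.state = State.INVALID;
            if (instances.get(key) == stale) {
                instances.remove(key);
            }
            closeIfUnused(stale);
        }

        /** Drops one reference; an INVALID connection closes at zero. */
        synchronized void release(Conn c) {
            c.refCount--;
            closeIfUnused(c);
        }

        private void closeIfUnused(Conn c) {
            if (c.state == State.INVALID && c.refCount == 0) {
                c.closed = true;
            }
        }
    }

    /** Outcome of replaying the T1/T2 sequence against this registry. */
    static final class Outcome {
        final boolean freshClosed;
        final boolean oldClosed;
        Outcome(boolean freshClosed, boolean oldClosed) {
            this.freshClosed = freshClosed;
            this.oldClosed = oldClosed;
        }
    }

    static Outcome replay() {
        Registry registry = new Registry();
        Conn t1Old = registry.acquire("conf");
        Conn t2Old = registry.acquire("conf"); // same instance, refCount == 2

        // T1 fails: flags its handle, releases it, and acquires a fresh one.
        registry.markStale("conf", t1Old);
        registry.release(t1Old);
        Conn fresh = registry.acquire("conf");

        // T2 fails later on the old handle; the fresh one is left untouched.
        registry.markStale("conf", t2Old);
        registry.release(t2Old);

        return new Outcome(fresh.closed, t1Old.closed);
    }

    public static void main(String[] args) {
        Outcome o = replay();
        System.out.println("fresh closed: " + o.freshClosed
            + ", old closed: " + o.oldClosed);
    }
}
```

In this sketch no thread blocks: T2 keeps using its old handle until it
releases it, and only then is the INVALID connection closed.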
I didn't follow your second comment about client A and client B. Both are in
the same JVM, so if one crashes, the other would too, would it not?
> HBaseAdmin should perform validation of connection it holds
> -----------------------------------------------------------
>
> Key: HBASE-4087
> URL: https://issues.apache.org/jira/browse/HBASE-4087
> Project: HBase
> Issue Type: Bug
> Reporter: Ted Yu
> Assignee: Ted Yu
> Priority: Critical
> Fix For: 0.92.0
>
> Attachments: 4087-v2.txt, 4087-v3.txt, 4087.txt
>
>
> Through HBASE-3777, HConnectionManager reuses the connection to HBase servers.
> One challenge, discovered while troubleshooting HBASE-4052, is how we
> invalidate connection(s) to a server that gets restarted.
> There are at least two ways.
> 1. HConnectionManager utilizes background thread(s) to periodically perform
> validation of connections in HBASE_INSTANCES and remove stale connection(s).
> 2. Allow HBaseClient (including HBaseAdmin) to provide feedback to
> HConnectionManager.
> The solution can be a combination of both of the above.