[ https://issues.apache.org/jira/browse/HBASE-7460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13557702#comment-13557702 ]

Gary Helmling commented on HBASE-7460:
--------------------------------------

bq. can I currently create two Connections and connect them to different 
clusters in a single JVM?

Prior to this patch, you'll have problems with that, due to the issue detailed 
in HBASE-7442: we essentially have an HBaseClient singleton whose 
Configuration (including cluster ID) is fixed the first time it's used, so it 
can't properly account for different IDs per cluster.  This is important at 
least for token-based authentication, i.e. security + MapReduce.  Outside of 
token auth, the current code may work for multiple cluster connections, though 
the client may pick up the wrong values for configs like retries, tcpnodelay, 
etc.  HBASE-7442 works around this by caching a separate HBaseClient per 
cluster ID, which makes token auth work.
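
To make the scenario in the question concrete, here is a minimal sketch (not 
from the patch; the ZooKeeper quorum hosts are made up) of connecting to two 
clusters from one JVM with HCM#createConnection, with comments describing the 
pre-patch behavior discussed above:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HConnection;
import org.apache.hadoop.hbase.client.HConnectionManager;

public class TwoClusterSketch {
  public static void main(String[] args) throws Exception {
    Configuration confA = HBaseConfiguration.create();
    confA.set("hbase.zookeeper.quorum", "zk-a.example.com");  // cluster A

    Configuration confB = HBaseConfiguration.create();
    confB.set("hbase.zookeeper.quorum", "zk-b.example.com");  // cluster B

    // One HConnection per cluster-specific Configuration.
    HConnection connA = HConnectionManager.createConnection(confA);
    HConnection connB = HConnectionManager.createConnection(confB);
    try {
      // Prior to the patch, RPCs from both connections funneled through a
      // shared HBaseClient whose Configuration (and cluster ID) was frozen
      // on first use, so token selection and settings like retries or
      // tcpnodelay could come from the wrong cluster.
      // ... issue requests against each cluster here ...
    } finally {
      connA.close();
      connB.close();
    }
  }
}
{code}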

With this change, HBaseClient is fixed to a given HConnection, which is cached 
based on cluster-specific config, so it should be no problem connecting to 
multiple clusters within the same JVM.  There is a change in behavior here: if 
you're manually creating multiple HConnections as the same user to the same 
cluster (using HCM#createConnection(Configuration)), you now get a separate 
HBaseClient per HConnection, each with its own set of RPC connections.  
Previously, due to the separate HBaseClient caching, these would share the 
same HBaseClient with the same RPC connections.  The new behavior seems 
simpler and more predictable to me.  Are there any cases where we anticipate 
problems from this?
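
For the same-cluster case, a rough sketch of what that behavior change means 
in practice (the shared Configuration here is illustrative only):

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HConnection;
import org.apache.hadoop.hbase.client.HConnectionManager;

public class SameClusterSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();

    // Two explicitly created connections to the same cluster, as the same user.
    HConnection c1 = HConnectionManager.createConnection(conf);
    HConnection c2 = HConnectionManager.createConnection(conf);

    // Before this change: c1 and c2 shared one cached HBaseClient (and its
    // RPC sockets), keyed by cluster ID.
    // After this change: each HConnection owns its own HBaseClient, so c1
    // and c2 maintain independent RPC connections to the region servers.

    c1.close();
    c2.close();
  }
}
{code}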
                
> Cleanup client connection layers
> --------------------------------
>
>                 Key: HBASE-7460
>                 URL: https://issues.apache.org/jira/browse/HBASE-7460
>             Project: HBase
>          Issue Type: Improvement
>          Components: Client, IPC/RPC
>            Reporter: Gary Helmling
>            Assignee: Gary Helmling
>             Fix For: 0.96.0
>
>         Attachments: HBASE-7460_2.patch
>
>
> This issue originated from a discussion over in HBASE-7442.  We currently 
> have a broken abstraction with {{HBaseClient}}, where it is bound to a single 
> {{Configuration}} instance at time of construction, but then reused for all 
> connections to all clusters.  This is combined with multiple, overlapping 
> layers of connection caching.
> Going through this code, it seems like we have a lot of mismatch between the 
> higher layers and the lower layers, with too much abstraction in between. At 
> the lower layers, most of the {{ClientCache}} stuff seems completely unused. 
> We currently effectively have an {{HBaseClient}} singleton (for 
> {{SecureClient}} as well in 0.92/0.94) in the client code, as I don't see 
> anything that calls the constructor or {{RpcEngine.getProxy()}} versions with 
> a non-default socket factory. So a lot of the code around this seems like 
> built-up waste.
> The fact that a single Configuration is fixed in the {{HBaseClient}} seems 
> like a broken abstraction as it currently stands. In addition to cluster ID, 
> other configuration parameters (max retries, retry sleep) are fixed at time 
> of construction. The more I look at the code, the more it looks like the 
> {{ClientCache}} and the sharing of the {{HBaseClient}} instance are an unnecessary 
> complication. Why cache the {{HBaseClient}} instances at all? In 
> {{HConnectionManager}}, we already have a mapping from {{Configuration}} to 
> {{HConnection}}. It seems to me like each {{HConnection(Implementation)}} 
> instance should have its own {{HBaseClient}} instance, doing away with the 
> {{ClientCache}} mapping. This would keep each {{HBaseClient}} associated with 
> a single cluster/configuration and fix the current breakage from reusing the 
> same {{HBaseClient}} against different clusters.
> We need a refactoring of some of the interactions of 
> {{HConnection(Implementation)}}, {{HBaseRPC/RpcEngine}}, and {{HBaseClient}}. 
> Offhand, we might want to expose a separate {{RpcEngine.getClient()}} method 
> that returns a new {{RpcClient}} interface (implemented by {{HBaseClient}}) 
> and move the {{RpcEngine.getProxy()}}/{{stopProxy()}} implementations into 
> the client, so that all proxy invocations can go through the same client without 
> requiring the static client cache. I haven't fully thought this through, so I 
> could be missing other important aspects. But that approach at least seems 
> like a step in the right direction for fixing the client abstractions.
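
The quoted description proposes exposing an {{RpcEngine.getClient()}} method 
that returns a new {{RpcClient}} interface.  As a rough illustration only (the 
interface shape and method signatures below are hypothetical, not the 
committed API), that direction might look something like:

{code:java}
import java.io.IOException;
import java.net.InetSocketAddress;
import org.apache.hadoop.conf.Configuration;

/** Hypothetical per-connection RPC client, implemented by HBaseClient. */
interface RpcClient {
  /** Create a protocol proxy whose invocations all flow through this client. */
  <T> T getProxy(Class<T> protocol, long clientVersion, InetSocketAddress addr,
      int rpcTimeout) throws IOException;

  /** Tear down a proxy created by this client. */
  void stopProxy(Object proxy);

  /** Close this client's RPC connections when the owning HConnection closes. */
  void stop();
}

interface RpcEngine {
  /**
   * Hand each HConnection(Implementation) its own RpcClient, built from that
   * connection's Configuration, replacing the static ClientCache lookup.
   */
  RpcClient getClient(Configuration conf);
}
{code}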

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
