[jira] [Commented] (HBASE-7460) Cleanup client connection layers

stack (JIRA) Sat, 29 Dec 2012 14:52:12 -0800

    [ 
https://issues.apache.org/jira/browse/HBASE-7460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13541010#comment-13541010
 ]


stack commented on HBASE-7460:
------------------------------

I am in this area at the moment, at a level just above HBaseClient trying to 
make use of it.  I'm playing with trying to use protobuf Service and hooking it 
up on either end to use our RPC.  There are pros but a bunch of cons with the 
main one being mostly the amount of refactoring that would have to do in this 
area if we were to go this route.

My first impression submerging below the level of HBaseClientRPC is that there 
is a bunch of cruft in here, stuff that has been accumulating over time and 
that we've probably been afraid to apply the compressed air can too.

I want to make use of clients.  Was going to copy what is going on in Invoker 
not knowing any better.  I want to use something else than "protocol" as the 
key getting the client.

In my investigations, the first thing to jettison would be the proxy stuff.  In 
my case it is in the way (I'd use the protobuf Service.Stub instead).  Getting 
a proxy has a bunch of overrides.  A bunch look unused, as you say.  Also, 
protocol 'versioning' and protocol 'fingerprinting' -- VersionedProtocol and 
ProtocolSignature -- are in the former case not hooked up, and in the latter, a 
facility that is incomplete and unused so all this code needs finishing or we 
need to just throw it out.

bq. It seems to me like each HConnection(Implementation) instance should have 
it's own HBaseClient instance, doing away with the ClientCache mapping

Sounds imminently sensible.

I'd be up for sketching something out if you had a few minutes to hang G.

Still to do, though not directly related here but it is in this realm only at a 
lower level, is the back and forth over RPC, what we put on the wire.  As is 
where we create a pb from an already made request pb -- with the former making 
a copy of the latter -- needs fixing and we should take the opportunity to 
address some of the criticisms' Benoît/Tsuna raised in Unofficial Hadoop / 
HBase RPC protocol documentation 
(http://www.google.com/url?q=https%3A%2F%2Fgithub.com%2FOpenTSDB%2Fasynchbase%2Fblob%2Fmaster%2Fsrc%2FHBaseRpc.java%23L164&sa=D&sntz=1&usg=AFQjCNEy00ZQVclIR7BaBJYBdRV-i7QGTg)
                
> Cleanup client connection layers
> --------------------------------
>
>                 Key: HBASE-7460
>                 URL: https://issues.apache.org/jira/browse/HBASE-7460
>             Project: HBase
>          Issue Type: Improvement
>          Components: Client, IPC/RPC
>            Reporter: Gary Helmling
>
> This issue originated from a discussion over in HBASE-7442.  We currently 
> have a broken abstraction with {{HBaseClient}}, where it is bound to a single 
> {{Configuration}} instance at time of construction, but then reused for all 
> connections to all clusters.  This is combined with multiple, overlapping 
> layers of connection caching.
> Going through this code, it seems like we have a lot of mismatch between the 
> higher layers and the lower layers, with too much abstraction in between. At 
> the lower layers, most of the {{ClientCache}} stuff seems completely unused. 
> We currently effectively have an {{HBaseClient}} singleton (for 
> {{SecureClient}} as well in 0.92/0.94) in the client code, as I don't see 
> anything that calls the constructor or {{RpcEngine.getProxy()}} versions with 
> a non-default socket factory. So a lot of the code around this seems like 
> built up waste.
> The fact that a single Configuration is fixed in the {{HBaseClient}} seems 
> like a broken abstraction as it currently stands. In addition to cluster ID, 
> other configuration parameters (max retries, retry sleep) are fixed at time 
> of construction. The more I look at the code, the more it looks like the 
> {{ClientCache}} and sharing the {{HBaseClient}} instance is an unnecessary 
> complication. Why cache the {{HBaseClient}} instances at all? In 
> {{HConnectionManager}}, we already have a mapping from {{Configuration}} to 
> {{HConnection}}. It seems to me like each {{HConnection(Implementation)}} 
> instance should have it's own {{HBaseClient}} instance, doing away with the 
> {{ClientCache}} mapping. This would keep each {{HBaseClient}} associated with 
> a single cluster/configuration and fix the current breakage from reusing the 
> same {{HBaseClient}} against different clusters.
> We need a refactoring of some of the interactions of 
> {{HConnection(Implementation)}}, {{HBaseRPC/RpcEngine}}, and {{HBaseClient}}. 
> Off hand, we might want to expose a separate {{RpcEngine.getClient()}} method 
> that returns a new {{RpcClient}} interface (implemented by {{HBaseClient}}) 
> and move the {{RpcEngine.getProxy()}}/{{stopProxy()}} implementations into 
> the client. So all proxy invocations can go through the same client, without 
> requiring the static client cache. I haven't fully thought this through, so I 
> could be missing other important aspects. But that approach at least seems 
> like a step in the right direction for fixing the client abstractions.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-7460) Cleanup client connection layers

Reply via email to