[
https://issues.apache.org/jira/browse/HBASE-2939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13000661#comment-13000661
]
ryan rawson commented on HBASE-2939:
------------------------------------
I ran this on a small test cluster and my results were mixed, largely due to
my setup and benchmarking method.
- using YCSB with a small working set, I loaded data onto regionserver R1
- ran YCSB from master M, issuing small 300-row scans with 100 columns each
  (roughly the shape sketched below)
- ran both with this patch and without.
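For reference, a minimal sketch of roughly that scan shape, written against
the 0.90-era client API (the table name and start row are placeholders, not
my exact harness):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.util.Bytes;

    public class ScanShape {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "usertable"); // YCSB's default table
        Scan scan = new Scan(Bytes.toBytes("user100")); // placeholder start row
        scan.setCaching(300); // fetch all 300 rows in a single round trip
        ResultScanner scanner = table.getScanner(scan);
        try {
          int rows = 0;
          for (Result r : scanner) {
            if (++rows == 300) break; // 300-row scan, ~100 columns per row
          }
        } finally {
          scanner.close();
          table.close();
        }
      }
    }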
The runs with this patch showed a slight but not significant improvement. I
think that is because I was saturating the network between M and R1, which
left the extra sockets little room to help. I didn't capture the exact
numbers, but it was something like 390 ms without the patch and 340 ms with
it.
It was a fairly interesting case, since I was deliberately testing contention
on the single socket, and adding more sockets to an already saturated network
did not improve things as I had hoped.
Could you paste in your test scenario? I could see that for some workloads
involving small to medium gets, this could improve latency.
> Allow Client-Side Connection Pooling
> ------------------------------------
>
> Key: HBASE-2939
> URL: https://issues.apache.org/jira/browse/HBASE-2939
> Project: HBase
> Issue Type: Improvement
> Components: client
> Affects Versions: 0.89.20100621
> Reporter: Karthick Sankarachary
> Assignee: ryan rawson
> Priority: Critical
> Fix For: 0.92.0
>
> Attachments: HBASE-2939-0.20.6.patch, HBASE-2939.patch,
> HBASE-2939.patch
>
>
> By design, the HBase RPC client multiplexes calls to a given region server
> (or the master for that matter) over a single socket, access to which is
> managed by a connection thread defined in the HBaseClient class. While this
> approach may suffice for most cases, it tends to break down in the context of
> a real-time, multi-threaded server, where latencies need to be lower and
> throughputs higher.
> In brief, the problem is that we dedicate one thread to handle all
> client-side reads and writes for a given server, which in turn forces them to
> share the same socket. As load increases, this is bound to serialize calls on
> the client-side. In particular, when the rate at which calls are submitted to
> the connection thread is greater than that at which the server responds, then
> some of those calls will inevitably end up sitting idle, just waiting their
> turn to go over the wire.
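> To make the bottleneck concrete, here is a rough sketch (names and sizes are
> illustrative) of the kind of workload that hits it: many client threads, each
> with its own HTable, whose RPCs to the same regionserver all funnel through
> the one shared socket:
>
>     import java.util.concurrent.ExecutorService;
>     import java.util.concurrent.Executors;
>     import org.apache.hadoop.conf.Configuration;
>     import org.apache.hadoop.hbase.HBaseConfiguration;
>     import org.apache.hadoop.hbase.client.Get;
>     import org.apache.hadoop.hbase.client.HTable;
>     import org.apache.hadoop.hbase.util.Bytes;
>
>     public class GetStorm {
>       public static void main(String[] args) throws Exception {
>         final Configuration conf = HBaseConfiguration.create();
>         ExecutorService exec = Executors.newFixedThreadPool(16);
>         for (int i = 0; i < 16; i++) {
>           final int id = i;
>           exec.submit(new Runnable() {
>             public void run() {
>               try {
>                 // One HTable per thread (HTable is not thread-safe), yet
>                 // all RPCs to a given server share one socket internally.
>                 HTable table = new HTable(conf, "mytable");
>                 for (int n = 0; n < 1000; n++) {
>                   table.get(new Get(Bytes.toBytes("row-" + id + "-" + n)));
>                 }
>                 table.close();
>               } catch (Exception e) {
>                 e.printStackTrace();
>               }
>             }
>           });
>         }
>         exec.shutdown();
>       }
>     }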
> In general, sharing sockets across multiple client threads is a good idea,
> but limiting the number of such sockets to one may be overly restrictive for
> certain cases. Here, we propose a way of defining multiple sockets per server
> endpoint, access to which may be managed through either a load-balancing or
> thread-local pool. To that end, we define the notion of a SharedMap, which
> maps a key to a resource pool, and supports both of those pool types.
> Specifically, we will apply that map in the HBaseClient, to associate
> multiple connection threads with each server endpoint (denoted by a
> connection id).
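> A rough sketch of the shape such a map might take (the names and signatures
> here are illustrative, not necessarily the patch's exact API):
>
>     // Maps a key (e.g. a connection id) to a pool of resources (e.g.
>     // connection threads); get() draws from that key's pool according to
>     // the configured pool type.
>     public interface SharedMap<K, R> {
>       R get(K key);          // fetch a resource for this key
>       R put(K key, R value); // add a resource to this key's pool
>       int size(K key);       // number of resources pooled for this key
>     }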
> Currently, the SharedMap supports the following types of pools (each is
> sketched in code after this list):
> * A ThreadLocalPool, which represents a pool that builds on the
> ThreadLocal class. It essentially binds the resource to the thread from which
> it is accessed.
> * A ReusablePool, which represents a pool that builds on the LinkedList
> class. It essentially allows resources to be checked out, at which point they
> are (temporarily) removed from the pool. When a resource is no longer
> required, it should be returned to the pool so that it can be reused.
> * A RoundRobinPool, which represents a pool that stores its resources in
> an ArrayList. It load-balances access to its resources by returning a
> different resource every time a given key is looked up.
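> Simplified sketches of the three pool flavors (illustrative only; the
> patch's actual classes may differ):
>
>     import java.util.ArrayList;
>     import java.util.LinkedList;
>
>     // Binds one resource to each calling thread.
>     class ThreadLocalPool<R> {
>       private final ThreadLocal<R> local = new ThreadLocal<R>();
>       R get() { return local.get(); }
>       void put(R r) { local.set(r); }
>     }
>
>     // Checked-out resources leave the pool and must be returned for reuse.
>     class ReusablePool<R> {
>       private final LinkedList<R> pool = new LinkedList<R>();
>       R get() { return pool.poll(); } // null when none are available
>       void put(R r) { pool.add(r); }
>     }
>
>     // Hands back a different resource on each lookup.
>     class RoundRobinPool<R> {
>       private final ArrayList<R> pool = new ArrayList<R>();
>       private int next = 0;
>       R get() {
>         if (pool.isEmpty()) return null;
>         next = (next + 1) % pool.size();
>         return pool.get(next);
>       }
>       void put(R r) { pool.add(r); }
>     }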
> To control the type and size of the connection pools, we give the user two
> parameters (viz. "hbase.client.ipc.pool.type" and
> "hbase.client.ipc.pool.size"). If the pool size is set to a positive number,
> that value caps the number of resources a pool may contain for any given key.
> A size of Integer#MAX_VALUE is interpreted to mean an unbounded pool.
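> For instance, with the property names from this issue (the accepted values
> for the pool type are my assumption, e.g. "round-robin"):
>
>     Configuration conf = HBaseConfiguration.create();
>     // The pool-type value name is assumed for illustration.
>     conf.set("hbase.client.ipc.pool.type", "round-robin");
>     // Cap of 10 connections per server endpoint; Integer.MAX_VALUE
>     // would mean an unbounded pool.
>     conf.setInt("hbase.client.ipc.pool.size", 10);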