[ https://issues.apache.org/jira/browse/HBASE-2939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904051#action_12904051 ]

ryan rawson commented on HBASE-2939:
------------------------------------

I can see how using a single thread to read, write, and serialize all data
to a single regionserver could make things slow. I'm +1 on the general
approach here. Thanks for doing this!

On Aug 29, 2010 3:52 PM, "Karthick Sankarachary (JIRA)" <[email protected]> wrote:
[ https://issues.apache.org/jira/browse/HBASE-2939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904047#action_12904047 ]
throughputs) when running our load test, especially with low think times.
While it is true that we can have multiple outstanding calls, we do
synchronize the send and receive legs of the call (see snippets below), and
I believe that serializes the calls to some extent.
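
For context, the kind of synchronization being described follows the usual
Hadoop IPC pattern, where every call sharing a connection funnels its write
through one lock on the shared output stream. This is only an illustrative
sketch (not the snippets referenced above; the class and method names are
made up):

    import java.io.DataOutputStream;
    import java.io.IOException;

    // Illustrative only: with a single connection per server, all callers
    // take turns on the one socket, which serializes their sends.
    class SingleSocketSender {
      private final DataOutputStream out;   // shared stream for one socket

      SingleSocketSender(DataOutputStream out) {
        this.out = out;
      }

      void sendCall(int callId, byte[] request) throws IOException {
        synchronized (out) {                // one writer at a time per socket
          out.writeInt(callId);             // frame: call id
          out.writeInt(request.length);     // frame: payload length
          out.write(request);               // serialized request
          out.flush();
        }
      }
    }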


> Allow Client-Side Connection Pooling
> ------------------------------------
>
>                 Key: HBASE-2939
>                 URL: https://issues.apache.org/jira/browse/HBASE-2939
>             Project: HBase
>          Issue Type: Improvement
>          Components: client
>    Affects Versions: 0.89.20100621
>            Reporter: Karthick Sankarachary
>         Attachments: HBASE-2939.patch
>
>
> By design, the HBase RPC client multiplexes calls to a given region server 
> (or the master for that matter) over a single socket, access to which is 
> managed by a connection thread defined in the HBaseClient class. While this 
> approach may suffice for most cases, it tends to break down in the context of 
> a real-time, multi-threaded server, where latencies need to be lower and 
> throughputs higher. 
> In brief, the problem is that we dedicate one thread to handle all 
> client-side reads and writes for a given server, which in turn forces them to 
> share the same socket. As load increases, this is bound to serialize calls on 
> the client-side. In particular, when the rate at which calls are submitted to 
> the connection thread is greater than that at which the server responds, then 
> some of those calls will inevitably end up sitting idle, just waiting their 
> turn to go over the wire.
> In general, sharing sockets across multiple client threads is a good idea, 
> but limiting the number of such sockets to one may be overly restrictive for 
> certain cases. Here, we propose a way of defining multiple sockets per server 
> endpoint, access to which may be managed through either a load-balancing or 
> thread-local pool. To that end, we define the notion of a SharedMap, which 
> maps a key to a resource pool, and supports both of those pool types. 
> Specifically, we will apply that map in the HBaseClient, to associate 
> multiple connection threads with each server endpoint (denoted by a 
> connection id). 
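
To make the SharedMap idea concrete, here is a minimal sketch of a map from
a key to a resource pool; it only mirrors the description above (the nested
Pool and PoolFactory names are placeholders, not necessarily what the
attached patch uses):

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Sketch of the proposed SharedMap: each key (e.g. the connection id of
    // a server endpoint) is backed by its own pool of resources (e.g.
    // connection threads, each owning one socket).
    class SharedMap<K, V> {
      interface Pool<R> {
        R get();           // hand out a resource; which one depends on the pool type
        void put(R r);     // add or return a resource to the pool
      }

      interface PoolFactory<R> {
        Pool<R> newPool(); // creates a pool of the configured type and size
      }

      private final Map<K, Pool<V>> pools = new ConcurrentHashMap<>();
      private final PoolFactory<V> factory;

      SharedMap(PoolFactory<V> factory) {
        this.factory = factory;
      }

      // Look up (lazily creating) the pool for this key and take a resource
      // from it; HBaseClient would key this by connection id.
      V get(K key) {
        return pools.computeIfAbsent(key, k -> factory.newPool()).get();
      }

      void put(K key, V resource) {
        pools.computeIfAbsent(key, k -> factory.newPool()).put(resource);
      }
    }
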
>  Currently, the SharedMap supports the following types of pools:
>     * A ThreadLocalPool, which represents a pool that builds on the 
> ThreadLocal class. It essentially binds the resource to the thread from which 
> it is accessed.
>     * A ReusablePool, which represents a pool that builds on the LinkedList 
> class. It essentially allows a resource to be checked out, at which point it 
> is (temporarily) removed from the pool. When the resource is no longer 
> required, it should be returned to the pool so that it can be reused.
>     * A RoundRobinPool, which represents a pool that stores its resources in 
> an ArrayList. It load-balances access to its resources by returning a 
> different resource every time a given key is looked up.
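
Rough sketches of the three pool behaviours just listed (purely
illustrative; the classes in the attached patch may differ in naming and
detail):

    import java.util.ArrayList;
    import java.util.LinkedList;
    import java.util.List;

    class PoolSketches {
      // ThreadLocalPool: each thread is bound to (at most) its own resource.
      static class ThreadLocalPool<R> {
        private final ThreadLocal<R> local = new ThreadLocal<>();
        R get()        { return local.get(); }
        void put(R r)  { local.set(r); }
      }

      // ReusablePool: a resource is checked out (removed) and later returned.
      static class ReusablePool<R> {
        private final LinkedList<R> free = new LinkedList<>();
        synchronized R get()       { return free.poll(); }  // checked out
        synchronized void put(R r) { free.add(r); }         // back for reuse
      }

      // RoundRobinPool: keeps every resource and hands back a different one
      // on each lookup, spreading callers across the pool.
      static class RoundRobinPool<R> {
        private final List<R> resources = new ArrayList<>();
        private int next = 0;
        synchronized void put(R r) { resources.add(r); }
        synchronized R get() {
          if (resources.isEmpty()) return null;
          R r = resources.get(next);
          next = (next + 1) % resources.size();
          return r;
        }
      }
    }
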
> To control the type and size of the connection pools, we give the user a 
> couple of parameters (viz. "hbase.client.ipc.pool.type" and 
> "hbase.client.ipc.pool.size"). In case the size of the pool is set to a 
> non-zero positive number, that is used to cap the number of resources that a 
> pool may contain for any given key. A size of Integer#MAX_VALUE is 
> interpreted to mean an unbounded pool.
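
For example, a client application might select the pool behaviour through
its Configuration. The property names come from the description above,
while the value string ("round-robin" here) and the surrounding class are
assumptions for illustration only:

    import org.apache.hadoop.conf.Configuration;

    // Hypothetical usage of the two proposed knobs on a client Configuration.
    public class PoolConfigExample {
      public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Pool implementation used per server endpoint (value string assumed).
        conf.set("hbase.client.ipc.pool.type", "round-robin");
        // Cap on the number of connections (sockets) kept per server endpoint.
        conf.setInt("hbase.client.ipc.pool.size", 5);
        // An HBaseClient built from this conf would then draw its connections
        // from a pool of up to five per server, instead of a single socket.
      }
    }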

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
