See https://issues.apache.org/jira/browse/HBASE-6580

Here's the proposed new API:
* HConnectionManager:
    public static HConnection createConnection(Configuration conf)
    public static HConnection createConnection(Configuration conf, ExecutorService pool)

* HConnection:
    public HTableInterface getTable(byte[] tableName) throws IOException
    public HTableInterface getTable(byte[] tableName, ExecutorService pool) throws IOException
    public HTableInterface getTable(String tableName) throws IOException

By default, HConnectionImplementation will create an ExecutorService when 
needed. Optionally, an ExecutorService can be passed in instead.
HTableInterfaces are retrieved from the HConnection. By default the 
HConnection's ExecutorService is used, but that can optionally be overridden 
for each HTable.
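
To make the pool-selection rule above concrete, here is a minimal sketch. 
SketchConnection and SketchTable are hypothetical stand-ins, not the HBase 
classes; they only model which ExecutorService a table ends up with. That the 
connection shuts down only a pool it created itself is an assumption of this 
sketch.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class PoolSelectionSketch {

    /** Hypothetical stand-in for HConnection; models pool selection only. */
    static class SketchConnection implements AutoCloseable {
        private ExecutorService pool;    // created lazily unless supplied
        private final boolean ownsPool;  // assumption: only shut down our own pool

        SketchConnection() {
            this.ownsPool = true;
        }

        SketchConnection(ExecutorService pool) {
            this.pool = pool;
            this.ownsPool = false;
        }

        // Creates the shared pool on first use, mirroring "when needed".
        private synchronized ExecutorService sharedPool() {
            if (pool == null) {
                pool = Executors.newCachedThreadPool();
            }
            return pool;
        }

        // Default: the table uses the connection's pool.
        SketchTable getTable(String name) {
            return new SketchTable(name, sharedPool());
        }

        // Override: the table uses the pool the caller hands in.
        SketchTable getTable(String name, ExecutorService override) {
            return new SketchTable(name, override);
        }

        @Override
        public synchronized void close() {
            if (ownsPool && pool != null) {
                pool.shutdown();
            }
        }
    }

    /** Hypothetical stand-in for HTableInterface; just remembers its pool. */
    static class SketchTable {
        final String name;
        final ExecutorService pool;

        SketchTable(String name, ExecutorService pool) {
            this.name = name;
            this.pool = pool;
        }
    }

    public static void main(String[] args) {
        ExecutorService custom = Executors.newFixedThreadPool(2);
        try (SketchConnection conn = new SketchConnection()) {
            SketchTable a = conn.getTable("MyTable");
            SketchTable b = conn.getTable("OtherTable");
            SketchTable c = conn.getTable("MyTable", custom);
            System.out.println("a and b share a pool: " + (a.pool == b.pool));
            System.out.println("c uses the override: " + (c.pool == custom));
        } finally {
            custom.shutdown(); // the caller owns the override pool
        }
    }
}
```

In this model the caller remains responsible for shutting down any pool it 
passes in.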

In 0.98/trunk:

1. HTablePool will be removed. It is no longer needed.
2. All constructors in HTable will be removed or changed to be protected. All 
code should use HTableInterface only.
3. HConnectionManager.getConnection() will be removed.
4. All HConnection caching (deleteConnection, etc.) will be removed, as it 
is no longer needed.


The new flow of setting up a client would look like this:

----- Snip -----
// connection to the cluster
HConnection conn = HConnectionManager.createConnection(conf);
...
// Once the cluster connection is established, get an HTableInterface for
// each operation or thread.
// HConnection.getTable(...) is lightweight. The table is really just a
// convenient place to call table methods and a temporary batch cache.
// It in fact has less overhead than HTablePool had when retrieving a cached
// HTable.
// As before, the returned HTableInterface is not thread safe.
// It's fine to get thousands of these.
// Don't cache them longer than the lifetime of the HConnection.
HTableInterface table = conn.getTable("MyTable");
...
// Just flushes outstanding commits; no further cleanup is needed, so this
// call can be omitted.
// HConnection holds no references to the returned HTable objects; they can
// be GC'd as soon as they leave scope.
table.close();
...
conn.close(); // done with the cluster, release resources
----- Snip -----
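
If both the connection and the table can be used as closeable resources 
(an assumption here, since the API listing above does not show close()), the 
same flow could be written with try-with-resources. The sketch below uses 
hypothetical stand-in classes, not the HBase ones, and only demonstrates the 
close order: the table first, then the connection.

```java
import java.util.ArrayList;
import java.util.List;

public class LifecycleSketch {

    /** Records the order in which the stand-in resources are closed. */
    static final List<String> closed = new ArrayList<>();

    /** Hypothetical stand-in for HConnection. */
    static class SketchConnection implements AutoCloseable {
        SketchTable getTable(String name) {
            return new SketchTable(name);
        }

        @Override
        public void close() {
            closed.add("connection");
        }
    }

    /** Hypothetical stand-in for HTableInterface. */
    static class SketchTable implements AutoCloseable {
        final String name;

        SketchTable(String name) {
            this.name = name;
        }

        @Override
        public void close() {
            closed.add("table:" + name);
        }
    }

    /** Runs the setup/teardown flow and returns the observed close order. */
    static List<String> run() {
        closed.clear();
        // Resources are closed in reverse order of declaration.
        try (SketchConnection conn = new SketchConnection();
             SketchTable table = conn.getTable("MyTable")) {
            // table operations would go here
        }
        return new ArrayList<>(closed);
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Running main prints [table:MyTable, connection], matching the order of the 
close() calls in the snippet above.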

The HConnection will maintain and share its own ThreadPool for all batch 
operations executed by the HTables.
This can be overridden per HConnection and/or per individual HTable object.

I will commit the new API to all branches early next week.

Questions? Comments? Concerns? Praise?

-- Lars
