[ https://issues.apache.org/jira/browse/HBASE-17009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
stack updated HBASE-17009:
--------------------------
Fix Version/s: 2.0.0
> Revisiting the removement of managed connection and connection caching
> ----------------------------------------------------------------------
>
> Key: HBASE-17009
> URL: https://issues.apache.org/jira/browse/HBASE-17009
> Project: HBase
> Issue Type: Task
> Components: Operability
> Reporter: Yu Li
> Assignee: Yu Li
> Priority: Critical
> Fix For: 2.0.0
>
>
> In HBASE-13197 we did a lot of good cleanup of the Connection API, but as
> part of that work HBASE-13252 dropped managed connections and connection
> caching. This JIRA proposes revisiting that decision for the reasons below.
> Assume a long-running process with multiple threads accessing HBase (a
> common case for streaming applications), and compare what happened
> previously with what happens now.
> Previously:
> Users could create an HTable instance whenever they wanted without worrying
> about the underlying connections, because the HBase client managed them
> automatically: no matter how many threads there were, only one Connection
> instance was used:
> {code}
> @Deprecated
> public HTable(Configuration conf, final TableName tableName)
>     throws IOException {
>   ...
>   this.connection = ConnectionManager.getConnectionInternal(conf);
>   ...
> }
>
> static ClusterConnection getConnectionInternal(final Configuration conf)
>     throws IOException {
>   HConnectionKey connectionKey = new HConnectionKey(conf);
>   synchronized (CONNECTION_INSTANCES) {
>     HConnectionImplementation connection =
>         CONNECTION_INSTANCES.get(connectionKey);
>     if (connection == null) {
>       connection = (HConnectionImplementation) createConnection(conf, true);
>       CONNECTION_INSTANCES.put(connectionKey, connection);
>     } else if (connection.isClosed()) {
>       ConnectionManager.deleteConnection(connectionKey, true);
>       connection = (HConnectionImplementation) createConnection(conf, true);
>       CONNECTION_INSTANCES.put(connectionKey, connection);
>     }
>     connection.incCount();
>     return connection;
>   }
> }
> {code}
> Now:
> Users have to create the connection themselves, using code like the
> following, as indicated in our recommendations:
> {code}
> Connection connection = ConnectionFactory.createConnection(conf);
> Table table = connection.getTable(tableName);
> {code}
> They must also make sure that *only one* connection is created per
> *process*, instead of creating HTable instances freely; otherwise many
> connections to ZooKeeper/RS may be set up across multiple threads. Users
> might also ask "when should I close the connection I created?", and the
> answer is "make sure you don't close it until the *process* shuts down".
> So now there are many more things for users to "make sure" of, and habits
> are hard to change. Users are used to creating a table instance in each
> thread (depending on which table each request accesses), so they will
> probably still create connections everywhere, and operators will then have
> to scramble to resolve all kinds of problems...
> So I propose adding back managed connection and connection caching support.
> IMHO it is a good feature that already existed in our implementation, so
> let's bring it back and save operators the extra work when they decide to
> upgrade from 1.x to 2.x.
> Thoughts?
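
The managed-connection behaviour described in the quoted issue (one shared
instance per configuration key, tracked with a reference count) can be
sketched generically as below. This is only a minimal illustration under
stated assumptions, not HBase code: SharedResourceCache, acquire, release and
size are hypothetical names, and any java.io.Closeable stands in for an HBase
Connection. Closing the resource once the count drops to zero is a choice made
for this sketch.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Callable;

/** Hypothetical helper (not part of HBase): shares one closeable resource per key. */
final class SharedResourceCache<K, V extends Closeable> {

  /** A cached resource plus the number of callers currently holding it. */
  private static final class Entry<V> {
    final V resource;
    int refCount;

    Entry(V resource) {
      this.resource = resource;
    }
  }

  private final Map<K, Entry<V>> entries = new HashMap<>();

  /** Returns the resource cached under key, creating it via factory on first use. */
  synchronized V acquire(K key, Callable<V> factory) throws Exception {
    Entry<V> entry = entries.get(key);
    if (entry == null) {
      entry = new Entry<>(factory.call());
      entries.put(key, entry);
    }
    entry.refCount++; // every acquire must be paired with one release
    return entry.resource;
  }

  /** Drops one reference; closes and evicts the resource when none remain. */
  synchronized void release(K key) throws IOException {
    Entry<V> entry = entries.get(key);
    if (entry == null) {
      return; // unknown key or already fully released: nothing to do
    }
    if (--entry.refCount <= 0) {
      entries.remove(key);
      entry.resource.close();
    }
  }

  /** Number of distinct resources currently cached. */
  synchronized int size() {
    return entries.size();
  }
}
```

In a multi-threaded client, each thread would call acquire with the same key
and release when it is done, so only the last release closes the underlying
resource and callers never need to decide who owns it.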
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)