Sailesh Mukil has posted comments on this change. Change subject: IMPALA-2982: lazy initialization of HBase connection ......................................................................
Patch Set 1: (5 comments) > (2 comments) > > The single long-lived application-wide connection is actually the > recommended mode of use for HBase connections: > https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Connection.html > > It has a bunch of logic to handle HBase servers coming and going, > so I think initialising the connection at impalad startup is > essentially the same as initialising it at the first HBase query. Sounds good to me. But we should run it on a HBase cluster just to be sure and if we have the time, try out things like cycle the Hbase service while impala is running etc. I can help in this regard if you don't have the time. http://gerrit.cloudera.org:8080/#/c/2345/1/be/src/runtime/hbase-table-factory.cc File be/src/runtime/hbase-table-factory.cc: Line 40: connection_cl_ = : env->FindClass("org/apache/hadoop/hbase/client/Connection"); Single line. Line 52: lock_guard<mutex> lock(connection_lock_); > How can the connection_ member get modified by the caller? We copy the poin You're right. I was just thinking along the lines of a concurrent GetConnection() and destruction. Line 91: RETURN_IF_ERROR( : JniUtil::LocalToGlobalRef(env, local_connection, &connection_)); Can fit in a single line. Line 104: if (connection_ != NULL) { > I'd rather use locks here because it's easier to reason about, and because It wouldn't matter if we're holding a lock and concurrent initialization happens. It would still continue with the initialization after this releases the lock would have the same effect. But as you said since this isn't a frequently visited code path, we can just leave it as a lock. Which leads me to think, should we find some way of signaling GetConnection() the the connection_ has been closed? Or NULL out connection_ if we need it to be reinitialized again. It also seems odd that if there are 2 HBaseTableFactory objects and one gets destroyed, both of them would lose the connection, since conenction_ is a static member. http://gerrit.cloudera.org:8080/#/c/2345/1/be/src/runtime/hbase-table-factory.h File be/src/runtime/hbase-table-factory.h: Line 41: /// Opens the connections to HBase and Zookeeper on the first call. Add "Returns existing connection on subsequent calls". -- To view, visit http://gerrit.cloudera.org:8080/2345 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I405f5cbfc62dd4a4aeb6f4019cac358376478884 Gerrit-PatchSet: 1 Gerrit-Project: Impala Gerrit-Branch: cdh5-2.5.0_5.7.0 Gerrit-Owner: Tim Armstrong <[email protected]> Gerrit-Reviewer: Sailesh Mukil <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-HasComments: Yes
