Tim Armstrong has posted comments on this change. Change subject: IMPALA-2982: lazy initialization of HBase connection ......................................................................
Patch Set 1: (2 comments) The single long-lived application-wide connection is actually the recommended mode of use for HBase connections: https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Connection.html It has a bunch of logic to handle HBase servers coming and going, so I think initialising the connection at impalad startup is essentially the same as initialising it at the first HBase query. http://gerrit.cloudera.org:8080/#/c/2345/1/be/src/runtime/hbase-table-factory.cc File be/src/runtime/hbase-table-factory.cc: Line 52: lock_guard<mutex> lock(connection_lock_); > It doesn't seem to me like a mutex is really of too much help here. It woul How can the connection_ member get modified by the caller? We copy the pointer. The lock protects against concurrent initialisation also: we need a lock for that since we need to block any threads that arrive while the first thread is initialising the connection. Line 104: if (connection_ != NULL) { > And do an atomic swap() to NULL out connection here, and store it's value l I'd rather use locks here because it's easier to reason about, and because it's much easier to avoid concurrent initialisation of the connection. We could do some kind of double-checked locking to avoid acquiring the lock in the common case, but I don't think this code path is called that often: it looked like this would be called at most every scan range. -- To view, visit http://gerrit.cloudera.org:8080/2345 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I405f5cbfc62dd4a4aeb6f4019cac358376478884 Gerrit-PatchSet: 1 Gerrit-Project: Impala Gerrit-Branch: cdh5-2.5.0_5.7.0 Gerrit-Owner: Tim Armstrong <[email protected]> Gerrit-Reviewer: Sailesh Mukil <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-HasComments: Yes
