Dimitris Tsirogiannis has posted comments on this change. ( http://gerrit.cloudera.org:8080/8545 )
Change subject: [PREVIEW] IMPALA-5058: Improve concurrency of DDL/DML operations ...................................................................... Patch Set 2: (21 comments) http://gerrit.cloudera.org:8080/#/c/8545/2/be/src/service/client-request-state.cc File be/src/service/client-request-state.cc: http://gerrit.cloudera.org:8080/#/c/8545/2/be/src/service/client-request-state.cc@187 PS2, Line 187: reset_req.reset_metadata_params.__set_sync_ddl( > Why do we need to set sync_ddl twice? It's because we're not always consistent in how we process different requests. TCatalogOpRequest::sync_ddl is now a required field. At the same time some operations process the *params field and don't have access to the parent TCatalogOpRequest, so the value of the sync_ddl flag is lost. http://gerrit.cloudera.org:8080/#/c/8545/2/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/8545/2/be/src/service/impala-server.cc@1430 PS2, Line 1430: LOG(INFO) << "Update applied with version: " << new_catalog_version > Catalog update applied... Done http://gerrit.cloudera.org:8080/#/c/8545/2/fe/src/main/java/org/apache/impala/catalog/CatalogObjectVersionManager.java File fe/src/main/java/org/apache/impala/catalog/CatalogObjectVersionManager.java: http://gerrit.cloudera.org:8080/#/c/8545/2/fe/src/main/java/org/apache/impala/catalog/CatalogObjectVersionManager.java@29 PS2, Line 29: public class CatalogObjectVersionManager { > Name is not very telling. Some alternatives: Went with option #1. http://gerrit.cloudera.org:8080/#/c/8545/2/fe/src/main/java/org/apache/impala/catalog/CatalogObjectVersionManager.java@37 PS2, Line 37: public void updateCatalogObjectVersions(long oldVersion, long newVersion) { > Do you think it's worth checking the return codes of the calls to objectVer Yes, added check on the remove path. The version should correspond to an existing object that is about to be removed, so the version should exist in the queue. http://gerrit.cloudera.org:8080/#/c/8545/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/8545/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@86 PS2, Line 86: * and processing of DDL requests. For each DDL request, the CatalogServiceCatalog > This description of the versioning scheme is not accurate anymore. Done http://gerrit.cloudera.org:8080/#/c/8545/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@96 PS2, Line 96: * - Delete log. The delete log records catalog objects that have been removed from the > Let's give an intuition for the purpose of this log, e.g.: Done http://gerrit.cloudera.org:8080/#/c/8545/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@97 PS2, Line 97: * catalog. An entry is added to this log every time an object is removed (e.g. > An entry with a new version is added ... Done http://gerrit.cloudera.org:8080/#/c/8545/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@99 PS2, Line 99: * be perfomed atomically. An entry is removed from this log from the topic update > removed from this log by the topic update thread Done http://gerrit.cloudera.org:8080/#/c/8545/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@102 PS2, Line 102: * been processed by the topic update thread. The topic update thread is the only one > that have been included in a topic update. Done http://gerrit.cloudera.org:8080/#/c/8545/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@105 PS2, Line 105: * Information per catalog object includes the number of times a catalog object has > Each entry includes the number of times a catalog object has been omitted i Done http://gerrit.cloudera.org:8080/#/c/8545/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@111 PS2, Line 111: * version that the issueing impalad must wait for in order to ensure that the effects > typo: issuing Done http://gerrit.cloudera.org:8080/#/c/8545/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@112 PS2, Line 112: * of this operation have been broadcast to all the coordinators. The time-based cleanup > Why did we chose a time-based cleanup then? I think the time-based cleanup Done http://gerrit.cloudera.org:8080/#/c/8545/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@129 PS2, Line 129: * that should be employed if both the catalog lock and table locks need to be held at > catalog lock -> version lock? Done http://gerrit.cloudera.org:8080/#/c/8545/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@183 PS2, Line 183: // Version of the last topic update sent to the statestore. > sent -> returned (to make the control flow clearer) Done http://gerrit.cloudera.org:8080/#/c/8545/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@192 PS2, Line 192: private final static int TOPIC_UPDATE_LOG_GC_FREQUENCY = 50; > Mention how this affects the lifetime of an individual entry. My understand Done http://gerrit.cloudera.org:8080/#/c/8545/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@195 PS2, Line 195: p > Let's create a TopicUpdateLog class that encapsulates these counters and th Done http://gerrit.cloudera.org:8080/#/c/8545/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@258 PS2, Line 258: private final Object topicUpdateEventNotifier_ = new Object(); > Maybe we should use the topicUpdateLog_ for this purpose once we have a sep Done http://gerrit.cloudera.org:8080/#/c/8545/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1228 PS2, Line 1228: private void removeDbHelper(Db db) { > updateDeleteLog(Db db)? Done http://gerrit.cloudera.org:8080/#/c/8545/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1938 PS2, Line 1938: topicUpdateLog_.get(CatalogDeltaLog.toCatalogObjectKey(tCatalogObject)); > Move toCatalogObjectKey() to Catalog, referring to the delta log here is a Done http://gerrit.cloudera.org:8080/#/c/8545/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1976 PS2, Line 1976: if (topicUpdateLog_.remove(entry.getKey(), entry.getValue())) { > Are uou sure it's ok to remove() while iterating over the map? Might be cle It's a ConcurrentHashMap, so it doesn't care for concurrent modifications. Besides, I don't want to just use remove on an iterator because I need to check if the value of the to-be-removed object has changed and abort if that is the case. The last two operations need to be performed atomically and iterator.remove does not provide that functionality. http://gerrit.cloudera.org:8080/#/c/8545/2/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java File fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java: http://gerrit.cloudera.org:8080/#/c/8545/2/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java@179 PS2, Line 179: // functions and priviles) and then the top-level objects (databases and roles). > typo: privlieges Done -- To view, visit http://gerrit.cloudera.org:8080/8545 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3032437f83d39bcc8cff14d897c7c106a4ab62d3 Gerrit-Change-Number: 8545 Gerrit-PatchSet: 2 Gerrit-Owner: Dimitris Tsirogiannis <[email protected]> Gerrit-Reviewer: Alex Behm <[email protected]> Gerrit-Reviewer: Bharath Vissapragada <[email protected]> Gerrit-Reviewer: Dimitris Tsirogiannis <[email protected]> Gerrit-Comment-Date: Fri, 01 Dec 2017 23:49:00 +0000 Gerrit-HasComments: Yes
