Dimitris Tsirogiannis has posted comments on this change. ( http://gerrit.cloudera.org:8080/8752 )
Change subject: IMPALA-5058: Improve the concurrency of DDL/DML operations ...................................................................... Patch Set 1: (27 comments) http://gerrit.cloudera.org:8080/#/c/8752/1/be/src/service/impala-server.h File be/src/service/impala-server.h: http://gerrit.cloudera.org:8080/#/c/8752/1/be/src/service/impala-server.h@328 PS1, Line 328: /// waits all other catalog topic subscribers to process this update by checking the > waits for Done http://gerrit.cloudera.org:8080/#/c/8752/1/be/src/service/impala-server.h@339 PS1, Line 339: void WaitForMinCatalogUpdate(const int64_t min_catalog_update_version, > Add comment Done http://gerrit.cloudera.org:8080/#/c/8752/1/be/src/service/impala-server.h@344 PS1, Line 344: void WaitForCatalogUpdateTopicPropagation(const TUniqueId& catalog_service_id); > Let's pass in the version to keep these Wait() APIs consistent. The version comes from the catalog_update_info_ which should be protected under the catalog_version_lock_. If we pass it as a param we'll have to move the lock outside that function which is not ideal. http://gerrit.cloudera.org:8080/#/c/8752/1/common/thrift/CatalogInternalService.thrift File common/thrift/CatalogInternalService.thrift: http://gerrit.cloudera.org:8080/#/c/8752/1/common/thrift/CatalogInternalService.thrift@39 PS1, Line 39: // List of updated (new and modified) catalog objects for which the catalog verion is > catalog objects whose version is larger than ... Done http://gerrit.cloudera.org:8080/#/c/8752/1/common/thrift/CatalogInternalService.thrift@43 PS1, Line 43: // List of deleted catalog objects for which the catalog version is larger than > catalog objects whose version is larger than ... Done http://gerrit.cloudera.org:8080/#/c/8752/1/common/thrift/StatestoreService.thrift File common/thrift/StatestoreService.thrift: http://gerrit.cloudera.org:8080/#/c/8752/1/common/thrift/StatestoreService.thrift@83 PS1, Line 83: // subscribers need additional information in order to process the deleted topics that > This is needed when subscribers require additional information not captured Done http://gerrit.cloudera.org:8080/#/c/8752/1/common/thrift/StatestoreService.thrift@88 PS1, Line 88: // non-delta TTopicDelta's (since the latest version of every still-present topic will > * Remove parenthesis since the explanation is important No, an optimization was added at some point to avoid sending deletions for the case non-delta updates. It's "still-present topic item". http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/AuthorizationPolicy.java File fe/src/main/java/org/apache/impala/catalog/AuthorizationPolicy.java: http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/AuthorizationPolicy.java@91 PS1, Line 91: for (RolePrivilege priv: existingRole.getPrivileges()) { > might simplify some places to have bulk versions of the add/remove calls li Done http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/Catalog.java File fe/src/main/java/org/apache/impala/catalog/Catalog.java: http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/Catalog.java@544 PS1, Line 544: throws IllegalStateException { > remove throws since it's a RuntimeException Done http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/Catalog.java@577 PS1, Line 577: * TODO: Use global object IDs everywhere instead of tracking catalog objects by > Seems like a pretty big TODO unlikely to get addressed. I suggest removing Done http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/Catalog.java@580 PS1, Line 580: public static boolean objectNamesMatch(TCatalogObject first, TCatalogObject second) { > keyEquals() or similar since we are actually checking for key equality and Done http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogDeltaLog.java File fe/src/main/java/org/apache/impala/catalog/CatalogDeltaLog.java: http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogDeltaLog.java@37 PS1, Line 37: * direct ("fast") update that applies the result of a catalog operation to the cache > remove ("fast") Done http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogDeltaLog.java@53 PS1, Line 53: * The catalogd uses this log to identify deleted catalog objects. Deleted > ... to identify objects that have been deleted since the last catalog topic Done http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogDeltaLog.java@55 PS1, Line 55: * them (e.g. dropTable()). While constructing a catalog update topic, we use the log to > Sentence in the middle with "While" can be removed imo. The first sentence Done http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogDeltaLog.java@57 PS1, Line 57: * Once the catalog topic update is constructed, the old deleted catalog objects are > the old deleted catalog objects -> the old entries in this log Done http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogObjectCache.java File fe/src/main/java/org/apache/impala/catalog/CatalogObjectCache.java: http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogObjectCache.java@75 PS1, Line 75: CatalogObjectVersionQueue.INSTANCE.addCatalogObjectVersion( > catalogObject.getCatalogVersion() is used here See my response on CatalogServiceCatalog. http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogObjectImpl.java File fe/src/main/java/org/apache/impala/catalog/CatalogObjectImpl.java: http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogObjectImpl.java@28 PS1, Line 28: private AtomicLong catalogVersion_ = new AtomicLong(Catalog.INITIAL_CATALOG_VERSION); > Not clear why this has to be atomic, but let's think about that outside of I need to be able to access this without holding any other lock, in particular the table lock. http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogObjectVersionQueue.java File fe/src/main/java/org/apache/impala/catalog/CatalogObjectVersionQueue.java: http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogObjectVersionQueue.java@39 PS1, Line 39: public void updateCatalogObjectVersions(long oldVersion, long newVersion) { > Method names seem verbose, how about just add/remove/update Done http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogObjectVersionQueue.java@53 PS1, Line 53: public long getMinimumObjectVersion() { > getMinimumVersion() Done http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@95 PS1, Line 95: * CatalogServiceCatalog guarantees that very catalog object that has a version in the > typo: that every We don't skip elements whose version is in that range. We skip elements that are modified when the topic is collected and will have a version that is larger than toVersion. http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@97 PS1, Line 97: * does not prevent DDL requests from making progress while the catalog topic update is > Let's phrase this positively: Concurrent DDL requests are allowed while a t Done http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@98 PS1, Line 98: * created. Hence, there is a non-zero probability that frequently modified catalog > Concept of skipping has not been introduced, need to discuss that above whe Done http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@103 PS1, Line 103: * higher than the 'to' version of the topic update. > As a result, the same version of an object might be sent in two subsequent Done http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@127 PS1, Line 127: * The time-based cleanup process of the topic update log entries may cause metadata > Give this section a title like "Known Anomalies with SYNC_DDL" Done http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@138 PS1, Line 138: * waitFroSyncDdlVersion() for more details. > typo: waitForSyncDdlVersion() Done http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@224 PS1, Line 224: private class TopicUpdateLog { > Can you move this into a separate .java file. This one is already getting h Done http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1467 PS1, Line 1467: dataSource.setCatalogVersion(incrementAndGetCatalogVersion()); > Here add() is called before setCatalogVersion(), so the version could be un It doesn't really matter for the CatalogServiceCatalog as it doesn't rely on the catalog versions when adding/removing objects. This logic is needed in the ImpaladCatalog.java. -- To view, visit http://gerrit.cloudera.org:8080/8752 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If12467a83acaeca6a127491d89291dedba91a35a Gerrit-Change-Number: 8752 Gerrit-PatchSet: 1 Gerrit-Owner: Dimitris Tsirogiannis <dtsirogian...@cloudera.com> Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com> Gerrit-Reviewer: Dimitris Tsirogiannis <dtsirogian...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tianyi Wang <tw...@cloudera.com> Gerrit-Comment-Date: Tue, 12 Dec 2017 00:32:06 +0000 Gerrit-HasComments: Yes