Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/8752 )
Change subject: IMPALA-5058: Improve the concurrency of DDL/DML operations ...................................................................... Patch Set 1: (25 comments) Here's a first wave of comments. Still thinking about some places in more detail. http://gerrit.cloudera.org:8080/#/c/8752/1/be/src/service/impala-server.h File be/src/service/impala-server.h: http://gerrit.cloudera.org:8080/#/c/8752/1/be/src/service/impala-server.h@328 PS1, Line 328: /// waits all other catalog topic subscribers to process this update by checking the waits for http://gerrit.cloudera.org:8080/#/c/8752/1/be/src/service/impala-server.h@339 PS1, Line 339: void WaitForMinCatalogUpdate(const int64_t min_catalog_update_version, Add comment http://gerrit.cloudera.org:8080/#/c/8752/1/be/src/service/impala-server.h@344 PS1, Line 344: void WaitForCatalogUpdateTopicPropagation(const TUniqueId& catalog_service_id); Let's pass in the version to keep these Wait() APIs consistent. http://gerrit.cloudera.org:8080/#/c/8752/1/common/thrift/CatalogInternalService.thrift File common/thrift/CatalogInternalService.thrift: http://gerrit.cloudera.org:8080/#/c/8752/1/common/thrift/CatalogInternalService.thrift@39 PS1, Line 39: // List of updated (new and modified) catalog objects for which the catalog verion is catalog objects whose version is larger than ... http://gerrit.cloudera.org:8080/#/c/8752/1/common/thrift/CatalogInternalService.thrift@43 PS1, Line 43: // List of deleted catalog objects for which the catalog version is larger than catalog objects whose version is larger than ... http://gerrit.cloudera.org:8080/#/c/8752/1/common/thrift/StatestoreService.thrift File common/thrift/StatestoreService.thrift: http://gerrit.cloudera.org:8080/#/c/8752/1/common/thrift/StatestoreService.thrift@83 PS1, Line 83: // subscribers need additional information in order to process the deleted topics that This is needed when subscribers require additional information not captured in the item key to process the deleted item (e.g., catalog version of deleted catalog object). http://gerrit.cloudera.org:8080/#/c/8752/1/common/thrift/StatestoreService.thrift@88 PS1, Line 88: // non-delta TTopicDelta's (since the latest version of every still-present topic will * Remove parenthesis since the explanation is important * Do you mean "still-present item" instead of "still-present topic"? The sentence is somewhat confusing to me, maybe I'm misunderstanding. A non-delta update must include all items so it should also include this one, right? http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/AuthorizationPolicy.java File fe/src/main/java/org/apache/impala/catalog/AuthorizationPolicy.java: http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/AuthorizationPolicy.java@91 PS1, Line 91: for (RolePrivilege priv: existingRole.getPrivileges()) { might simplify some places to have bulk versions of the add/remove calls like addAll and removeAll that take a collection of CatalogObjects http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/Catalog.java File fe/src/main/java/org/apache/impala/catalog/Catalog.java: http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/Catalog.java@544 PS1, Line 544: throws IllegalStateException { remove throws since it's a RuntimeException http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/Catalog.java@577 PS1, Line 577: * TODO: Use global object IDs everywhere instead of tracking catalog objects by Seems like a pretty big TODO unlikely to get addressed. I suggest removing it. http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/Catalog.java@580 PS1, Line 580: public static boolean objectNamesMatch(TCatalogObject first, TCatalogObject second) { keyEquals() or similar since we are actually checking for key equality and not "object name" equality http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogDeltaLog.java File fe/src/main/java/org/apache/impala/catalog/CatalogDeltaLog.java: http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogDeltaLog.java@37 PS1, Line 37: * direct ("fast") update that applies the result of a catalog operation to the cache remove ("fast") http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogDeltaLog.java@53 PS1, Line 53: * The catalogd uses this log to identify deleted catalog objects. Deleted ... to identify objects that have been deleted since the last catalog topic update. http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogDeltaLog.java@55 PS1, Line 55: * them (e.g. dropTable()). While constructing a catalog update topic, we use the log to Sentence in the middle with "While" can be removed imo. The first sentence already gives this info. http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogDeltaLog.java@57 PS1, Line 57: * Once the catalog topic update is constructed, the old deleted catalog objects are the old deleted catalog objects -> the old entries in this log http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogObjectImpl.java File fe/src/main/java/org/apache/impala/catalog/CatalogObjectImpl.java: http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogObjectImpl.java@28 PS1, Line 28: private AtomicLong catalogVersion_ = new AtomicLong(Catalog.INITIAL_CATALOG_VERSION); Not clear why this has to be atomic, but let's think about that outside of this patch. Commenting here so we don't forget. http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogObjectVersionQueue.java File fe/src/main/java/org/apache/impala/catalog/CatalogObjectVersionQueue.java: http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogObjectVersionQueue.java@39 PS1, Line 39: public void updateCatalogObjectVersions(long oldVersion, long newVersion) { Method names seem verbose, how about just add/remove/update http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogObjectVersionQueue.java@53 PS1, Line 53: public long getMinimumObjectVersion() { getMinimumVersion() http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@95 PS1, Line 95: * CatalogServiceCatalog guarantees that very catalog object that has a version in the typo: that every Sentence does not seem accurate. I think we guarantee that all objects included in the update are within that version range - but we don't necessarily include *all* such objects. Some objects are skipped to allow the topic update thread to make progress. We should also mention that the topic update itself is assigned a version (the "to" version). http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@97 PS1, Line 97: * does not prevent DDL requests from making progress while the catalog topic update is Let's phrase this positively: Concurrent DDL requests are allowed while a topic update is in progress. http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@98 PS1, Line 98: * created. Hence, there is a non-zero probability that frequently modified catalog Concept of skipping has not been introduced, need to discuss that above when talking about the nuances of version range, etc. http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@103 PS1, Line 103: * higher than the 'to' version of the topic update. As a result, the same version of an object might be sent in two subsequent topic updates. http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@127 PS1, Line 127: * The time-based cleanup process of the topic update log entries may cause metadata Give this section a title like "Known Anomalies with SYNC_DDL" http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@138 PS1, Line 138: * waitFroSyncDdlVersion() for more details. typo: waitForSyncDdlVersion() http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@224 PS1, Line 224: private class TopicUpdateLog { Can you move this into a separate .java file. This one is already getting huge. -- To view, visit http://gerrit.cloudera.org:8080/8752 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If12467a83acaeca6a127491d89291dedba91a35a Gerrit-Change-Number: 8752 Gerrit-PatchSet: 1 Gerrit-Owner: Dimitris Tsirogiannis <[email protected]> Gerrit-Reviewer: Alex Behm <[email protected]> Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 06 Dec 2017 00:28:49 +0000 Gerrit-HasComments: Yes
