Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/8752 )
Change subject: IMPALA-5058: Improve the concurrency of DDL/DML operations ...................................................................... Patch Set 2: Code-Review+1 (10 comments) I've spent some time on this patch. The logic and code structure LGTM. I agree with Alex that we should get this committed and then keep making improvements. Given the number of edge cases around SYNC_DDL / invalidate/ statestore broadcast failures we should probably run a stress test too with some injected failures in those paths. http://gerrit.cloudera.org:8080/#/c/8752/2/be/src/catalog/catalog-server.cc File be/src/catalog/catalog-server.cc: http://gerrit.cloudera.org:8080/#/c/8752/2/be/src/catalog/catalog-server.cc@331 PS2, Line 331: VLOG(1) << "Publishing " << (topic_deletions ? "deletion " : "update ") This could probably be verbose if we print for every catalog_object? I agree it helps to debug in builds, but just wanted to confirm if that was intentional. (Since we have dynamic log levels, we can move it to VLOG(3) as well if you think that is ok). http://gerrit.cloudera.org:8080/#/c/8752/2/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/8752/2/be/src/service/impala-server.cc@1474 PS2, Line 1474: VLOG_QUERY << "Done waiting for catalog version: " << catalog_update_version; nit: We should probably not log this if we detect a catalog_service_id change and indicate in the logs that it had been restarted or something? (same in other cases too) http://gerrit.cloudera.org:8080/#/c/8752/2/common/thrift/CatalogService.thrift File common/thrift/CatalogService.thrift: http://gerrit.cloudera.org:8080/#/c/8752/2/common/thrift/CatalogService.thrift@57 PS2, Line 57: / True if this is a result of an INVALIDATE METADATA operation. : 4: required bool is_invalidate Just curious, wouldn't we know this information from the ClientRequestState calling ProcessCatalogUpdateResult()? Wondering why the Catalog server has to send this back to the coordinator. http://gerrit.cloudera.org:8080/#/c/8752/2/fe/src/main/java/org/apache/impala/catalog/AuthorizationPolicy.java File fe/src/main/java/org/apache/impala/catalog/AuthorizationPolicy.java: http://gerrit.cloudera.org:8080/#/c/8752/2/fe/src/main/java/org/apache/impala/catalog/AuthorizationPolicy.java@89 PS2, Line 89: CatalogObjectVersionQueue.INSTANCE.removeVersion( : existingRole.getCatalogVersion()); Wouldn't removeRole() -> roleCache_.remove(roleName) already do this? Also curious, why this is being done for a very specific case? http://gerrit.cloudera.org:8080/#/c/8752/2/fe/src/main/java/org/apache/impala/catalog/CatalogObjectVersionQueue.java File fe/src/main/java/org/apache/impala/catalog/CatalogObjectVersionQueue.java: http://gerrit.cloudera.org:8080/#/c/8752/2/fe/src/main/java/org/apache/impala/catalog/CatalogObjectVersionQueue.java@30 PS2, Line 30: * versions. Not thread-safe. nit: I think it helps to define the use case of this class here. (like described in CatalogDeltaLog) http://gerrit.cloudera.org:8080/#/c/8752/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/8752/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@585 PS2, Line 585: = nit: maybe make >= ? just to be on the safer side. http://gerrit.cloudera.org:8080/#/c/8752/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@628 PS2, Line 628: if (LOG.isDebugEnabled()) { : LOG.debug(String.format("Error calling toThrift() on table %s: %s", : tbl.getFullName(), e.getMessage()), e); : } : return; I think that is a valid LOG.error(), given we are missing an update of a version. http://gerrit.cloudera.org:8080/#/c/8752/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1304 PS2, Line 1304: versionLock_.writeLock().lock(); qq: I see incrementAndGetCatalogVersion() already takes versionLock_. Whats the reason behind wrapping these calls (at multiple places) with this lock again? Is that maintain some additional consistency across the duration of the calls? http://gerrit.cloudera.org:8080/#/c/8752/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1868 PS2, Line 1868: LOG.info("Operation using SYNC_DDL is waiting for catalog topic version: " + Wondering if we can make it easy someway to map this log to the corresponding DDL (supportability) http://gerrit.cloudera.org:8080/#/c/8752/2/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java File fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java: http://gerrit.cloudera.org:8080/#/c/8752/2/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java@381 PS2, Line 381: CatalogObjectVersionQueue.INSTANCE.updateVersions( : existingDb.getCatalogVersion(), catalogVersion); Same as I commented in another place, wouldn't addDb() do that already? addDb() -> CatalogObjectCache ? -- To view, visit http://gerrit.cloudera.org:8080/8752 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If12467a83acaeca6a127491d89291dedba91a35a Gerrit-Change-Number: 8752 Gerrit-PatchSet: 2 Gerrit-Owner: Dimitris Tsirogiannis <[email protected]> Gerrit-Reviewer: Alex Behm <[email protected]> Gerrit-Reviewer: Bharath Vissapragada <[email protected]> Gerrit-Reviewer: Dimitris Tsirogiannis <[email protected]> Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tianyi Wang <[email protected]> Gerrit-Comment-Date: Sun, 17 Dec 2017 00:10:15 +0000 Gerrit-HasComments: Yes
