Dimitris Tsirogiannis has posted comments on this change. ( http://gerrit.cloudera.org:8080/8752 )
Change subject: IMPALA-5058: Improve the concurrency of DDL/DML operations ...................................................................... Patch Set 2: Code-Review+2 (14 comments) Thanks for the reviews. Waiting on the results from the next concurrency testing from Mostafa and if everything looks good, I'll merge the change. http://gerrit.cloudera.org:8080/#/c/8752/2/be/src/catalog/catalog-server.cc File be/src/catalog/catalog-server.cc: http://gerrit.cloudera.org:8080/#/c/8752/2/be/src/catalog/catalog-server.cc@331 PS2, Line 331: VLOG(1) << "Publishing " << (topic_deletions ? "deletion " : "update ") > This could probably be verbose if we print for every catalog_object? I agr Yes that was intentional. The rational is the following. We already log the catalog objects sent in the catalog so it makes sense to me to log the received items at the impalads. I believe it can help debug some nasty issues (missing updates, etc). For now I suggest we leave it as is. http://gerrit.cloudera.org:8080/#/c/8752/2/be/src/service/impala-server.h File be/src/service/impala-server.h: http://gerrit.cloudera.org:8080/#/c/8752/2/be/src/service/impala-server.h@339 PS2, Line 339: // Wait until the minimum catalog object version in the local cache exceeds > /// style Done http://gerrit.cloudera.org:8080/#/c/8752/2/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/8752/2/be/src/service/impala-server.cc@1474 PS2, Line 1474: VLOG_QUERY << "Done waiting for catalog version: " << catalog_update_version; > nit: We should probably not log this if we detect a catalog_service_id chan Good point. Done http://gerrit.cloudera.org:8080/#/c/8752/2/common/thrift/CatalogService.thrift File common/thrift/CatalogService.thrift: http://gerrit.cloudera.org:8080/#/c/8752/2/common/thrift/CatalogService.thrift@57 PS2, Line 57: / True if this is a result of an INVALIDATE METADATA operation. : 4: required bool is_invalidate > Just curious, wouldn't we know this information from the ClientRequestState Unfortunately, that's not always the case so for these cases, the simplest thing to do was to add that information as part of the catalog response. http://gerrit.cloudera.org:8080/#/c/8752/2/fe/src/main/java/org/apache/impala/catalog/AuthorizationPolicy.java File fe/src/main/java/org/apache/impala/catalog/AuthorizationPolicy.java: http://gerrit.cloudera.org:8080/#/c/8752/2/fe/src/main/java/org/apache/impala/catalog/AuthorizationPolicy.java@89 PS2, Line 89: CatalogObjectVersionQueue.INSTANCE.removeVersion( : existingRole.getCatalogVersion()); > Wouldn't removeRole() -> roleCache_.remove(roleName) already do this? You're right. It's not needed here. Removed. http://gerrit.cloudera.org:8080/#/c/8752/2/fe/src/main/java/org/apache/impala/catalog/CatalogObjectVersionQueue.java File fe/src/main/java/org/apache/impala/catalog/CatalogObjectVersionQueue.java: http://gerrit.cloudera.org:8080/#/c/8752/2/fe/src/main/java/org/apache/impala/catalog/CatalogObjectVersionQueue.java@30 PS2, Line 30: * versions. Not thread-safe. > nit: I think it helps to define the use case of this class here. (like desc Done http://gerrit.cloudera.org:8080/#/c/8752/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/8752/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@93 PS2, Line 93: * Periodically, the CatalogServiceCatalog collects a delta of catalog updates (based on a > Thanks! This is really clear to me now. Done http://gerrit.cloudera.org:8080/#/c/8752/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@95 PS2, Line 95: * Each catalog topic update is defined by a range of catalog versions [from, to) and the > (from, to] Done http://gerrit.cloudera.org:8080/#/c/8752/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@96 PS2, Line 96: * CatalogServiceCatalog guarantees that very catalog object that has a version in the > typo: every Done http://gerrit.cloudera.org:8080/#/c/8752/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@585 PS2, Line 585: = > nit: maybe make >= ? just to be on the safer side. Done http://gerrit.cloudera.org:8080/#/c/8752/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@628 PS2, Line 628: if (LOG.isDebugEnabled()) { : LOG.debug(String.format("Error calling toThrift() on table %s: %s", : tbl.getFullName(), e.getMessage()), e); : } : return; > I think that is a valid LOG.error(), given we are missing an update of a ve Done http://gerrit.cloudera.org:8080/#/c/8752/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1304 PS2, Line 1304: versionLock_.writeLock().lock(); > qq: I see incrementAndGetCatalogVersion() already takes versionLock_. Whats Exactly. Removing the function, assigning the new version and adding it to the deleteLog should all happen atomically and it's the same for all the remove* functions. http://gerrit.cloudera.org:8080/#/c/8752/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1868 PS2, Line 1868: LOG.info("Operation using SYNC_DDL is waiting for catalog topic version: " + > Wondering if we can make it easy someway to map this log to the correspondi Yeah, that's a good point. I believe at some point we will have to be able to uniquely identify catalog operations (for tracking/monitoring purposes). Then it will be easy to expand these log messages. http://gerrit.cloudera.org:8080/#/c/8752/2/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java File fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java: http://gerrit.cloudera.org:8080/#/c/8752/2/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java@381 PS2, Line 381: CatalogObjectVersionQueue.INSTANCE.updateVersions( : existingDb.getCatalogVersion(), catalogVersion); > Same as I commented in another place, wouldn't addDb() do that already? add Unfortunately not. dbCache_ is not backed by CatalogObjectCache but by an AtomicReference to a ConcurrentHashMap (don't ask why). -- To view, visit http://gerrit.cloudera.org:8080/8752 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If12467a83acaeca6a127491d89291dedba91a35a Gerrit-Change-Number: 8752 Gerrit-PatchSet: 2 Gerrit-Owner: Dimitris Tsirogiannis <dtsirogian...@cloudera.com> Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com> Gerrit-Reviewer: Bharath Vissapragada <bhara...@cloudera.com> Gerrit-Reviewer: Dimitris Tsirogiannis <dtsirogian...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tianyi Wang <tw...@cloudera.com> Gerrit-Comment-Date: Thu, 04 Jan 2018 19:55:04 +0000 Gerrit-HasComments: Yes