Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/11280 )
Change subject: IMPALA-7469. Invalidate LocalCatalog cache based on topic updates ...................................................................... Patch Set 2: (21 comments) http://gerrit.cloudera.org:8080/#/c/11280/2/common/thrift/CatalogObjects.thrift File common/thrift/CatalogObjects.thrift: http://gerrit.cloudera.org:8080/#/c/11280/2/common/thrift/CatalogObjects.thrift@621 PS2, Line 621: > perhaps it makes more sense to put these in CatalogService.thrift? Done http://gerrit.cloudera.org:8080/#/c/11280/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/11280/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@429 PS2, Line 429: delete > is delete ignored for this case? sort of. The impalad doesn't care whether it's deleted or not, because it just invalidates in either case. But, it is important to send the update for deleted items. Figured may as well be accurate instead of lying and saying true or false, right? http://gerrit.cloudera.org:8080/#/c/11280/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@422 PS2, Line 422: : // Serialize a minimal version of the object that can be used by impalads : // that are running in 'local-catalog' mode. This is used by those impalads : // to invalidate their local cache. : TCatalogObject minimalObject = getMinimalObjectForV2(obj); : String v2Key = CatalogObjectsConstants.CATALOG_TOPIC_V2_PREFIX + key; : if (!FeSupport.NativeAddPendingTopicItem(nativeCatalogServerPtr, v2Key, : obj.catalog_version, serializer.serialize(minimalObject), delete)) { : LOG.error("NativeAddPendingTopicItem failed in BE. key=" + v2Key + ", delete=" : + delete + ", data_size=" + data.length); : } > Can we avoid these if there are no subscribers subscribed to this prefix? ( I think it's close to impossible to plumb through so the catalog knows whether anyone is subscribed, because it might actually publish this info even before impalads have started. That said, to eliminate risks of statestore memory regressions and such, I'll add a flag on the catalog to publish v1-only, v2-only, or both (allowing for mixed clusters). I'll make the default v1 only. That makes an extra config flag to flip to try out LocalCatalog but maybe worth avoiding the potential regression for people who don't try it. http://gerrit.cloudera.org:8080/#/c/11280/2/fe/src/main/java/org/apache/impala/catalog/Table.java File fe/src/main/java/org/apache/impala/catalog/Table.java: http://gerrit.cloudera.org:8080/#/c/11280/2/fe/src/main/java/org/apache/impala/catalog/Table.java@411 PS2, Line 411: // TODO(todd): this breaks test_ddl.test_alter_set_column_stats. > Is this the fix here to make that test work, or is that test in a currently It's failing only in local-catalog mode. I think it never worked in this mode, I just happened to discover it while working on this patch. http://gerrit.cloudera.org:8080/#/c/11280/2/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java: http://gerrit.cloudera.org:8080/#/c/11280/2/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@255 PS2, Line 255: public Database loadDb(final String dbName) throws TException { > nit: Add method docs in general. Also worth mentioning that it tries to get Given that these override the interface and the interface is docced, I'm not sure about duplicating method docs. I'll add a short implementation comment here and elaborate the top-level class doc a bit more to talk about the caching and invalidation design. http://gerrit.cloudera.org:8080/#/c/11280/2/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@263 PS2, Line 263: TGetPartialCatalogObjectRequest req = newReqForDb(dbName); > hit.setRef(false) (below too), wondering if can make this whole logic of ca yea, I managed to factor out a method, thanks http://gerrit.cloudera.org:8080/#/c/11280/2/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@267 PS2, Line 267: ); > dbname the exception thrown by checkResponse already includes the request stringified, so should be in there already http://gerrit.cloudera.org:8080/#/c/11280/2/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@275 PS2, Line 275: LOG.trace("Request for DB metadata for {}: {}", dbName, hit.getRef() ? "hit":"miss"); > line too long (91 > 90) Done http://gerrit.cloudera.org:8080/#/c/11280/2/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@290 PS2, Line 290: TGetPartialCatalogObjectRequest req = newReqForDb(dbName); > hit.setRef(false) refactored http://gerrit.cloudera.org:8080/#/c/11280/2/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@294 PS2, Line 294: expected HMS table" > table names? Done http://gerrit.cloudera.org:8080/#/c/11280/2/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@344 PS2, Line 344: "); > table name see above http://gerrit.cloudera.org:8080/#/c/11280/2/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@346 PS2, Line 346: dbName, tableName, resp.table_info.hms_table, resp.object_version_number); > line too long (92 > 90) Done http://gerrit.cloudera.org:8080/#/c/11280/2/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@619 PS2, Line 619: LOG.trace("loadFunctionNames({}): not cached yet", dbName); > This reminds me of https://github.com/apache/impala/commit/352833b8cfcf5e02 using isTraceEnabled, etc, makes sense if you are using + to concatenate strings, because it saves the concatenation. But, in this case we're using the {} placeholder support, so it doesn't really make much of a difference. slf4j includes overloads for up to two placeholder values before it ends up allocating an Object[] for varargs, see https://www.slf4j.org/api/org/slf4j/Logger.html#trace(java.lang.String,%20java.lang.Object,%20java.lang.Object) http://gerrit.cloudera.org:8080/#/c/11280/2/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@647 PS2, Line 647: catalogServiceId_ = req.catalog_service_id; > I think we should detect restarts here and log Done http://gerrit.cloudera.org:8080/#/c/11280/2/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@688 PS2, Line 688: anyuthing > nit: typo Done http://gerrit.cloudera.org:8080/#/c/11280/2/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@699 PS2, Line 699: ineffective > how is MAX_VALUE handled? is it ignored as a special case? from the descrip I ended up filing a JIRA to handle this, and for now I made INVALIDATE METADATA throw an error on local-catalog mode. It turns out even non-SYNC_DDL has some weird semantics currently http://gerrit.cloudera.org:8080/#/c/11280/2/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@735 PS2, Line 735: debug > debug is the default, you mean trace? I found it useful to actually see the invalidations, whereas the trace level stuff is super noisy http://gerrit.cloudera.org:8080/#/c/11280/2/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@747 PS2, Line 747: asMap().remove > isn't invalidate() the right method? Are you using this for return val? yea, invalidate() doesn't indicate whether it did anything http://gerrit.cloudera.org:8080/#/c/11280/2/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@896 PS2, Line 896: VersionedTableCacheKey > Where is this used? It looks like loadTableNames() uses the base TableCache the other cache keys below inherit from this http://gerrit.cloudera.org:8080/#/c/11280/2/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@970 PS2, Line 970: TODO(todd): cu > this change looks like it covers this todo. not quite. The TODO is now inaccurate because the parent class changed names, but the actual TODO is still relevant. It would be nice to _NOT_ use the table version number in the key, but instead assume that partitions are immutable given an ID, so that we dont need to reload them all when only one changes. That requires some catalogd side changes though. http://gerrit.cloudera.org:8080/#/c/11280/2/tests/metadata/test_ddl.py File tests/metadata/test_ddl.py: http://gerrit.cloudera.org:8080/#/c/11280/2/tests/metadata/test_ddl.py@572 PS2, Line 572: ) > flake8: E501 line too long (91 > 90 characters) Done -- To view, visit http://gerrit.cloudera.org:8080/11280 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I615f9e6bd167b36cd8d93da59426dd6813ae4984 Gerrit-Change-Number: 11280 Gerrit-PatchSet: 2 Gerrit-Owner: Todd Lipcon <[email protected]> Gerrit-Reviewer: Bharath Vissapragada <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Tianyi Wang <[email protected]> Gerrit-Reviewer: Todd Lipcon <[email protected]> Gerrit-Reviewer: Vuk Ercegovac <[email protected]> Gerrit-Comment-Date: Wed, 29 Aug 2018 08:24:32 +0000 Gerrit-HasComments: Yes
