Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11280 )

Change subject: IMPALA-7469. Invalidate LocalCatalog cache based on topic 
updates
......................................................................


Patch Set 2:

(21 comments)

http://gerrit.cloudera.org:8080/#/c/11280/2/common/thrift/CatalogObjects.thrift
File common/thrift/CatalogObjects.thrift:

http://gerrit.cloudera.org:8080/#/c/11280/2/common/thrift/CatalogObjects.thrift@621
PS2, Line 621:
> perhaps it makes more sense to put these in CatalogService.thrift?
Done


http://gerrit.cloudera.org:8080/#/c/11280/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/11280/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@429
PS2, Line 429: delete
> is delete ignored for this case?
sort of. The impalad doesn't care whether it's deleted or not, because it just 
invalidates in either case. But, it is important to send the update for deleted 
items. Figured may as well be accurate instead of lying and saying true or 
false, right?


http://gerrit.cloudera.org:8080/#/c/11280/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@422
PS2, Line 422:
             :       // Serialize a minimal version of the object that can be 
used by impalads
             :       // that are running in 'local-catalog' mode. This is used 
by those impalads
             :       // to invalidate their local cache.
             :       TCatalogObject minimalObject = getMinimalObjectForV2(obj);
             :       String v2Key = 
CatalogObjectsConstants.CATALOG_TOPIC_V2_PREFIX + key;
             :       if 
(!FeSupport.NativeAddPendingTopicItem(nativeCatalogServerPtr, v2Key,
             :           obj.catalog_version, 
serializer.serialize(minimalObject), delete)) {
             :         LOG.error("NativeAddPendingTopicItem failed in BE. key=" 
+ v2Key + ", delete="
             :             + delete + ", data_size=" + data.length);
             :       }
> Can we avoid these if there are no subscribers subscribed to this prefix? (
I think it's close to impossible to plumb through so the catalog knows whether 
anyone is subscribed, because it might actually publish this info even before 
impalads have started. That said, to eliminate risks of statestore memory 
regressions and such, I'll add a flag on the catalog to publish v1-only, 
v2-only, or both (allowing for mixed clusters). I'll make the default v1 only. 
That makes an extra config flag to flip to try out LocalCatalog but maybe worth 
avoiding the potential regression for people who don't try it.


http://gerrit.cloudera.org:8080/#/c/11280/2/fe/src/main/java/org/apache/impala/catalog/Table.java
File fe/src/main/java/org/apache/impala/catalog/Table.java:

http://gerrit.cloudera.org:8080/#/c/11280/2/fe/src/main/java/org/apache/impala/catalog/Table.java@411
PS2, Line 411:         // TODO(todd): this breaks 
test_ddl.test_alter_set_column_stats.
> Is this the fix here to make that test work, or is that test in a currently
It's failing only in local-catalog mode. I think it never worked in this mode, 
I just happened to discover it while working on this patch.


http://gerrit.cloudera.org:8080/#/c/11280/2/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java:

http://gerrit.cloudera.org:8080/#/c/11280/2/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@255
PS2, Line 255:   public Database loadDb(final String dbName) throws TException {
> nit: Add method docs in general. Also worth mentioning that it tries to get
Given that these override the interface and the interface is docced, I'm not 
sure about duplicating method docs. I'll add a short implementation comment 
here and elaborate the top-level class doc a bit more to talk about the caching 
and invalidation design.


http://gerrit.cloudera.org:8080/#/c/11280/2/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@263
PS2, Line 263:               TGetPartialCatalogObjectRequest req = 
newReqForDb(dbName);
> hit.setRef(false) (below too), wondering if can make this whole logic of ca
yea, I managed to factor out a method, thanks


http://gerrit.cloudera.org:8080/#/c/11280/2/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@267
PS2, Line 267: );
> dbname
the exception thrown by checkResponse already includes the request stringified, 
so should be in there already


http://gerrit.cloudera.org:8080/#/c/11280/2/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@275
PS2, Line 275:       LOG.trace("Request for DB metadata for {}: {}", dbName, 
hit.getRef() ? "hit":"miss");
> line too long (91 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/11280/2/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@290
PS2, Line 290:               TGetPartialCatalogObjectRequest req = 
newReqForDb(dbName);
> hit.setRef(false)
refactored


http://gerrit.cloudera.org:8080/#/c/11280/2/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@294
PS2, Line 294: expected HMS table"
> table names?
Done


http://gerrit.cloudera.org:8080/#/c/11280/2/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@344
PS2, Line 344: ");
> table name
see above


http://gerrit.cloudera.org:8080/#/c/11280/2/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@346
PS2, Line 346:                   dbName, tableName, resp.table_info.hms_table, 
resp.object_version_number);
> line too long (92 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/11280/2/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@619
PS2, Line 619:     LOG.trace("loadFunctionNames({}): not cached yet", dbName);
> This reminds me of https://github.com/apache/impala/commit/352833b8cfcf5e02
using isTraceEnabled, etc, makes sense if you are using + to concatenate 
strings, because it saves the concatenation. But, in this case we're using the 
{} placeholder support, so it doesn't really make much of a difference. slf4j  
includes overloads for up to two placeholder values before it ends up 
allocating an Object[] for varargs, see 
https://www.slf4j.org/api/org/slf4j/Logger.html#trace(java.lang.String,%20java.lang.Object,%20java.lang.Object)


http://gerrit.cloudera.org:8080/#/c/11280/2/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@647
PS2, Line 647:           catalogServiceId_ = req.catalog_service_id;
> I think we should detect restarts here and log
Done


http://gerrit.cloudera.org:8080/#/c/11280/2/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@688
PS2, Line 688: anyuthing
> nit: typo
Done


http://gerrit.cloudera.org:8080/#/c/11280/2/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@699
PS2, Line 699: ineffective
> how is MAX_VALUE handled? is it ignored as a special case? from the descrip
I ended up filing a JIRA to handle this, and for now I made INVALIDATE METADATA 
throw an error on local-catalog mode. It turns out even non-SYNC_DDL has some 
weird semantics currently


http://gerrit.cloudera.org:8080/#/c/11280/2/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@735
PS2, Line 735: debug
> debug is the default, you mean trace?
I found it useful to actually see the invalidations, whereas the trace level 
stuff is super noisy


http://gerrit.cloudera.org:8080/#/c/11280/2/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@747
PS2, Line 747: asMap().remove
> isn't invalidate() the right method? Are you using this for return val?
yea, invalidate() doesn't indicate whether it did anything


http://gerrit.cloudera.org:8080/#/c/11280/2/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@896
PS2, Line 896: VersionedTableCacheKey
> Where is this used? It looks like loadTableNames() uses the base TableCache
the other cache keys below inherit from this


http://gerrit.cloudera.org:8080/#/c/11280/2/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@970
PS2, Line 970: TODO(todd): cu
> this change looks like it covers this todo.
not quite. The TODO is now inaccurate because the parent class changed names, 
but the actual TODO is still relevant. It would be nice to _NOT_ use the table 
version number in the key, but instead assume that partitions are immutable 
given an ID, so that we dont need to reload them all when only one changes. 
That requires some catalogd side changes though.


http://gerrit.cloudera.org:8080/#/c/11280/2/tests/metadata/test_ddl.py
File tests/metadata/test_ddl.py:

http://gerrit.cloudera.org:8080/#/c/11280/2/tests/metadata/test_ddl.py@572
PS2, Line 572: )
> flake8: E501 line too long (91 > 90 characters)
Done



--
To view, visit http://gerrit.cloudera.org:8080/11280
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I615f9e6bd167b36cd8d93da59426dd6813ae4984
Gerrit-Change-Number: 11280
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon <[email protected]>
Gerrit-Reviewer: Bharath Vissapragada <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Tianyi Wang <[email protected]>
Gerrit-Reviewer: Todd Lipcon <[email protected]>
Gerrit-Reviewer: Vuk Ercegovac <[email protected]>
Gerrit-Comment-Date: Wed, 29 Aug 2018 08:24:32 +0000
Gerrit-HasComments: Yes

Reply via email to