Hello Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/14307

to look at the new patch set (#2).

Change subject: IMPALA-7506: support global INVALIDATE METADATA in local 
catalog mode
......................................................................

IMPALA-7506: support global INVALIDATE METADATA in local catalog mode

In local catalog mode, the coordinator does not cache all the metadata.
Instead, it caches metadata on demand (based on query requests) and
evicts it according to the Guava cache configuration (e.g. size or TTL).
We use the catalog version as part of the cache key for fine-grained
metadata, e.g. partition metadata. When invalidating a table, we simply
invalidate the top-level table entry and allow other information to
remain in the cache. Guava lazily evicts the old metadata since it is
never touched again. As a result, the cache holds a lot of stale
metadata, so we can't efficiently track the minimal catalog version of
valid catalog objects.
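
As an illustration only, the versioned cache key described above can be
sketched in plain Java. The class and method names here are hypothetical
and are not taken from CatalogdMetaProvider, and a HashMap stands in for
the Guava cache, so no size or TTL eviction is modelled.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical sketch of a version-keyed metadata cache; not Impala code. */
final class VersionedCacheSketch {
  /** Key for fine-grained metadata: the table's catalog version is part of it. */
  static final class PartitionKey {
    final String table;
    final long tableVersion;
    final String partition;

    PartitionKey(String table, long tableVersion, String partition) {
      this.table = table;
      this.tableVersion = tableVersion;
      this.partition = partition;
    }

    @Override
    public boolean equals(Object o) {
      if (!(o instanceof PartitionKey)) return false;
      PartitionKey k = (PartitionKey) o;
      return tableVersion == k.tableVersion
          && table.equals(k.table)
          && partition.equals(k.partition);
    }

    @Override
    public int hashCode() {
      return Objects.hash(table, tableVersion, partition);
    }
  }

  // Top-level table entries: table name -> catalog version of the loaded table.
  private final Map<String, Long> tableVersions = new HashMap<>();
  // Fine-grained entries; a real cache would evict these by size or TTL.
  private final Map<PartitionKey, String> partitionCache = new HashMap<>();

  void loadTable(String table, long version) {
    tableVersions.put(table, version);
  }

  void cachePartition(String table, String partition, String meta) {
    Long version = tableVersions.get(table);
    if (version == null) throw new IllegalStateException("table not loaded: " + table);
    partitionCache.put(new PartitionKey(table, version, partition), meta);
  }

  /** Returns the cached metadata for the current table version, or null. */
  String lookupPartition(String table, String partition) {
    Long version = tableVersions.get(table);
    if (version == null) return null;
    return partitionCache.get(new PartitionKey(table, version, partition));
  }

  /** Invalidation removes only the top-level entry; old partition entries stay. */
  void invalidateTable(String table) {
    tableVersions.remove(table);
  }

  int cachedPartitionEntries() {
    return partitionCache.size();
  }
}
```

After invalidateTable() and a reload at a higher version, the old
partition entry is unreachable but still occupies the cache, which is
why the smallest version present in the cache says nothing about the
smallest valid version.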

The minimal catalog version of valid catalog objects is used to
implement global INVALIDATE METADATA. In legacy catalog mode, all cached
catalog objects are in fact valid. The coordinator gets the expected min
catalog version in the RPC response from catalogd. It's the version at
which catalogd starts to reset the entire catalog, which means that when
the reset is done, all valid catalog objects should be associated with a
catalog version larger than it. The coordinator waits until its min
catalog version exceeds this value, which means it has processed all the
updates of the reset propagated from catalogd via statestored. If
SYNC_DDL is set, the coordinator also waits until the other coordinators
reach the same catalog version as it has, so they also see the latest
updates of the reset.
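
The wait described above can be sketched as a condition wait on the
coordinator's min catalog version. This is a minimal illustration under
assumed names (MinVersionWaiterSketch, onTopicUpdateProcessed,
awaitMinVersionAbove), not the actual coordinator code, and the timeout
parameter is an addition for the sake of the example.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical sketch of waiting for the min catalog version; not Impala code. */
final class MinVersionWaiterSketch {
  private final ReentrantLock lock = new ReentrantLock();
  private final Condition advanced = lock.newCondition();
  private long minCatalogVersion = 0;

  /** Called after a topic update has been fully applied. */
  void onTopicUpdateProcessed(long newMinCatalogVersion) {
    lock.lock();
    try {
      if (newMinCatalogVersion > minCatalogVersion) {
        minCatalogVersion = newMinCatalogVersion;
        advanced.signalAll();
      }
    } finally {
      lock.unlock();
    }
  }

  /**
   * Blocks until the min catalog version exceeds expectedMinVersion (the
   * value returned by the reset RPC). Returns false on timeout or interrupt.
   */
  boolean awaitMinVersionAbove(long expectedMinVersion, long timeoutMillis) {
    long nanosLeft = TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
    lock.lock();
    try {
      while (minCatalogVersion <= expectedMinVersion) {
        if (nanosLeft <= 0L) return false;
        nanosLeft = advanced.awaitNanos(nanosLeft);
      }
      return true;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    } finally {
      lock.unlock();
    }
  }
}
```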

This patch adds a new field (lastResetCatalogVersion) to TCatalog to
keep the catalog version at which catalogd starts to reset the entire
metadata. Each time catalogd generates a new update for the catalog
topic, it generates a TCatalogObject of type CATALOG containing the
state of the catalog, which includes this new field.

When the coordinator receives a new value of lastResetCatalogVersion in
a topic update, it means catalogd has reset the entire catalog and all
the related updates are included in either the same or previous topic
updates. This is guaranteed by the fact that the write lock of
versionLock is held while catalogd is resetting the entire catalog, and
we update lastResetCatalogVersion last, just before releasing the write
lock. So the update thread, which must hold the read lock of
versionLock, has no chance to propagate partial results with the new
lastResetCatalogVersion value. Thus, all metadata with catalog version
<= lastResetCatalogVersion can be considered stale after the coordinator
finishes processing the topic update. lastResetCatalogVersion + 1 is the
inclusive lower bound of the min catalog version of a coordinator.
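
The locking argument above can be illustrated with a small sketch. Only
versionLock and lastResetCatalogVersion are names from this change; the
class, the TopicUpdate holder, and the helper methods are hypothetical
and greatly simplified compared to CatalogServiceCatalog.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical sketch of the reset/update ordering; not Impala code. */
final class ResetVersionSketch {
  /** Simplified stand-in for the content of one catalog topic update. */
  static final class TopicUpdate {
    final long lastResetCatalogVersion;
    final Map<String, Long> objectVersions;

    TopicUpdate(long lastResetCatalogVersion, Map<String, Long> objectVersions) {
      this.lastResetCatalogVersion = lastResetCatalogVersion;
      this.objectVersions = objectVersions;
    }
  }

  private final ReentrantReadWriteLock versionLock = new ReentrantReadWriteLock();
  private final Map<String, Long> objectVersions = new HashMap<>();
  private long catalogVersion = 0;
  private long lastResetCatalogVersion = 0;

  void addObject(String name) {
    versionLock.writeLock().lock();
    try {
      objectVersions.put(name, ++catalogVersion);
    } finally {
      versionLock.writeLock().unlock();
    }
  }

  /** Global reset: every object gets a version above the reset start version. */
  void reset() {
    versionLock.writeLock().lock();
    try {
      long resetStartVersion = catalogVersion;
      for (Map.Entry<String, Long> e : objectVersions.entrySet()) {
        e.setValue(++catalogVersion);
      }
      // Published last, while the write lock is still held.
      lastResetCatalogVersion = resetStartVersion;
    } finally {
      versionLock.writeLock().unlock();
    }
  }

  /** The update thread takes the read lock, so it never sees a partial reset. */
  TopicUpdate buildTopicUpdate() {
    versionLock.readLock().lock();
    try {
      return new TopicUpdate(lastResetCatalogVersion, new HashMap<>(objectVersions));
    } finally {
      versionLock.readLock().unlock();
    }
  }

  /** Coordinator side: anything at or below the reset version is stale. */
  static boolean isStale(long objectVersion, long lastResetCatalogVersion) {
    return objectVersion <= lastResetCatalogVersion;
  }

  /** Inclusive lower bound of the coordinator's min catalog version. */
  static long minCatalogVersionLowerBound(long lastResetCatalogVersion) {
    return lastResetCatalogVersion + 1;
  }
}
```

Because buildTopicUpdate() cannot run between the first version bump and
the assignment of lastResetCatalogVersion, any update carrying the new
value also carries (or was preceded by) every object reloaded by that
reset.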

Tests:
 - Re-enable all existing tests that were disabled due to this
   missing feature

Change-Id: Ib61a7ab1ffa062620ffbc2dadc34bd7a8ca9e549
---
M common/thrift/CatalogObjects.thrift
M fe/src/main/java/org/apache/impala/analysis/ResetMetadataStmt.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
M tests/authorization/test_ranger.py
M tests/common/skip.py
M tests/custom_cluster/test_local_catalog.py
M tests/metadata/test_hms_integration.py
M tests/metadata/test_metadata_query_statements.py
9 files changed, 51 insertions(+), 75 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/07/14307/2
--
To view, visit http://gerrit.cloudera.org:8080/14307
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib61a7ab1ffa062620ffbc2dadc34bd7a8ca9e549
Gerrit-Change-Number: 14307
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
