Hello Bharath Vissapragada, Tianyi Wang, Impala Public Jenkins, Vuk Ercegovac,
I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/11280
to look at the new patch set (#3).
Change subject: IMPALA-7469. Invalidate LocalCatalog cache based on topic
updates
......................................................................
IMPALA-7469. Invalidate LocalCatalog cache based on topic updates
This implements cache invalidation inside CatalogdMetaProvider. The
design is as follows:
- when the catalogd collects updates into the statestore topic, it now
  adds an additional entry for each table and database. These additional
  entries are minimal: they include only the object's name and no
  metadata. This new behavior is conditional on a new flag
  --catalog_topic_mode. The default mode keeps the old style, but
  it can be set to mixed (support both v1 and v2) or v2-only.
- the old-style topic entries are prefixed with a '1:' whereas the new
  minimal entries are prefixed with a '2:'. The impalad subscribes
  to one or the other prefix depending on whether it is running with
  --use_local_catalog. Thus, old impalads will not be confused by the
  new entries and vice versa.
- when the impalad gets these topic updates, it forwards them through to
  the catalog implementation. The LocalCatalog implementation forwards
  them to the CatalogdMetaProvider, which uses them to invalidate
  cached metadata as appropriate.
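
For reviewers, the prefix scheme and invalidation flow described above can be
sketched roughly as follows. This is an illustrative Python model, not the
code in this patch: the mode names ("v1", "mixed", "v2"), the object key
format ("TABLE:db.t"), and the names build_topic_entries, subscribed_prefix
and MetaCache are all assumptions made for the example. Only the '1:' and
'2:' prefixes and the two flags come from the description above.

```python
# Hypothetical model of the versioned topic keys; not Impala's actual API.
V1_PREFIX = "1:"  # old-style entries carrying full metadata
V2_PREFIX = "2:"  # new minimal entries carrying only the object name


def build_topic_entries(obj_key, metadata, mode):
    """Return the {topic_key: payload} entries published for one object.

    mode stands in for --catalog_topic_mode; the values "v1", "mixed" and
    "v2" are placeholders, since the real flag values are not listed here.
    """
    if mode not in ("v1", "mixed", "v2"):
        raise ValueError("unknown topic mode: %s" % mode)
    entries = {}
    if mode in ("v1", "mixed"):
        entries[V1_PREFIX + obj_key] = metadata
    if mode in ("mixed", "v2"):
        entries[V2_PREFIX + obj_key] = None  # name only, no metadata
    return entries


def subscribed_prefix(use_local_catalog):
    """An impalad subscribes to exactly one prefix."""
    return V2_PREFIX if use_local_catalog else V1_PREFIX


class MetaCache:
    """Stand-in for the metadata cache held by CatalogdMetaProvider."""

    def __init__(self):
        self._cache = {}

    def put(self, obj_key, value):
        self._cache[obj_key] = value

    def get(self, obj_key):
        return self._cache.get(obj_key)

    def apply_topic_update(self, entries, use_local_catalog):
        """Drop every cached object named by an entry with our prefix.

        Returns the list of object keys that were actually invalidated.
        """
        prefix = subscribed_prefix(use_local_catalog)
        invalidated = []
        for topic_key in entries:
            if topic_key.startswith(prefix):
                obj_key = topic_key[len(prefix):]
                if self._cache.pop(obj_key, None) is not None:
                    invalidated.append(obj_key)
        return invalidated
```

In this model an impalad running without --use_local_catalog ignores every
'2:' key, which is the property that keeps old and new impalads from being
confused by each other's entries.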
This patch includes some basic unit tests. I also did some manual
testing by connecting to different impalads and verifying that a session
connected to impalad #1 saw the effects of DDLs made by impalad #2
within a short period of time (the statestore topic update frequency).
Existing end-to-end tests cover these code paths pretty thoroughly:
- if we didn't automatically invalidate the cache on a coordinator
  in response to DDL operations, then any test which expects to
  "read its own writes" (e.g. access a table after creating one)
  would fail.
- if we didn't propagate invalidations via the statestore, then
  all of the tests that use sync_ddl would fail.
I verified the test coverage above using some of the tests in
test_ddl.py -- I selectively commented out a few of the invalidation
code paths in the new code and verified that tests failed until I
re-introduced them. Along the way I also improved test_ddl so that, when
this code is broken, it properly fails with a timeout. It also has a bit
of expanded coverage for both the SYNC_DDL and non-SYNC cases.
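
As an illustration of the "fails with a timeout" behavior, a polling helper
along these lines would make a test fail cleanly instead of hanging when an
invalidation never arrives. This is a generic sketch, not the helper used in
test_ddl.py; the name wait_until and its parameters are hypothetical.

```python
import time


def wait_until(predicate, timeout_s=10.0, interval_s=0.1):
    """Poll predicate() until it returns True or timeout_s elapses.

    Returns the elapsed time in seconds on success and raises TimeoutError
    otherwise, so a broken invalidation path shows up as a test failure.
    """
    start = time.monotonic()
    while True:
        if predicate():
            return time.monotonic() - start
        if time.monotonic() - start >= timeout_s:
            raise TimeoutError("condition not met within %.1fs" % timeout_s)
        time.sleep(interval_s)
```

A non-SYNC_DDL test could wrap its "is the new table visible on the other
impalad yet" check in such a helper, while a SYNC_DDL test would expect the
check to pass on the first attempt.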
I also wrote a new custom-cluster test for LocalCatalog that verifies
a few of the specific edge cases like detecting catalogd restart.
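
One plausible way to detect a catalogd restart, shown here only as a sketch:
assume each topic update carries an identifier for the catalogd instance that
produced it, and flush the whole cache when that identifier changes. The
class RestartAwareCache and the service_id argument are hypothetical names
for this example, and the patch may detect restarts differently.

```python
class RestartAwareCache:
    """Toy cache that is flushed when the catalogd instance ID changes."""

    def __init__(self):
        self._service_id = None  # ID seen in the most recent topic update
        self._cache = {}

    def put(self, obj_key, value):
        self._cache[obj_key] = value

    def size(self):
        return len(self._cache)

    def on_topic_update(self, service_id):
        """Record the sender's ID; return True if a restart was detected.

        The first update only records the ID. A later update with a
        different ID is treated as a restart, and all cached entries are
        dropped because they may describe state the new catalogd never saw.
        """
        restarted = (self._service_id is not None
                     and service_id != self._service_id)
        if restarted:
            self._cache.clear()
        self._service_id = service_id
        return restarted
```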
One notable exception here is the implementation of INVALIDATE METADATA.
This turned out to be complex to implement, so I left a lengthy TODO
describing the issue and filed a JIRA.
Change-Id: I615f9e6bd167b36cd8d93da59426dd6813ae4984
---
M be/src/catalog/catalog-server.cc
M be/src/service/impala-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M common/thrift/CatalogService.thrift
M fe/src/main/java/org/apache/impala/analysis/ResetMetadataStmt.java
M fe/src/main/java/org/apache/impala/catalog/Catalog.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java
M fe/src/main/java/org/apache/impala/common/Pair.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/FeCatalogManager.java
M fe/src/test/java/org/apache/impala/catalog/local/CatalogdMetaProviderTest.java
M fe/src/test/java/org/apache/impala/catalog/local/LocalCatalogTest.java
M tests/common/custom_cluster_test_suite.py
A tests/custom_cluster/test_local_catalog.py
M tests/metadata/test_ddl.py
20 files changed, 762 insertions(+), 103 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/80/11280/3
--
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I615f9e6bd167b36cd8d93da59426dd6813ae4984
Gerrit-Change-Number: 11280
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon <[email protected]>
Gerrit-Reviewer: Bharath Vissapragada <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Tianyi Wang <[email protected]>
Gerrit-Reviewer: Todd Lipcon <[email protected]>
Gerrit-Reviewer: Vuk Ercegovac <[email protected]>