Hello Bharath Vissapragada, Tianyi Wang, Impala Public Jenkins, Vuk Ercegovac,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/11280

to look at the new patch set (#2).

Change subject: IMPALA-7469. Invalidate LocalCatalog cache based on topic updates
......................................................................

IMPALA-7469. Invalidate LocalCatalog cache based on topic updates

This implements cache invalidation inside CatalogdMetaProvider. The
design is as follows:

- when the catalogd collects updates into the statestore topic, it now
  adds an additional entry for each table and database. These additional
  entries are minimal - they only include the object's name, but no
  metadata.

- the old-style topic entries are prefixed with a '1:' whereas the new
  minimal entries are prefixed with a '2:'. The impalad will subscribe to
  one or the other prefix depending on whether it is running with
  --use_local_catalog. Thus, old impalads will not be confused by the new
  entries and vice versa.

- when the impalad gets these topic updates, it forwards them through to
  the catalog implementation. The LocalCatalog implementation forwards
  them to the CatalogdMetaProvider, which uses them to invalidate cached
  metadata as appropriate.

This patch includes some basic unit tests. I also did some manual testing
by connecting to different impalads and verifying that a session
connected to impalad #1 saw the effects of DDLs made by impalad #2 within
a short period of time (the statestore topic update frequency).

Existing end-to-end tests cover these code paths pretty thoroughly:

- if we didn't automatically invalidate the cache on a coordinator in
  response to DDL operations, then any test which expects to "read its
  own writes" (eg access a table after creating one) would fail

- if we didn't propagate invalidations via the statestore, then all of
  the tests that use sync_ddl would fail.
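
As a rough illustration of the prefix scheme and invalidation flow in the
design above, here is a minimal Python sketch. It is not the code in this
patch: the key layout ("TABLE:db1.t1"), the helper names, and the MetaCache
class are all hypothetical, and only the '1:' / '2:' prefixes and the
--use_local_catalog switch come from the commit message.

```python
# Illustrative sketch only; names and key layout are assumptions,
# not the actual Impala implementation.

FULL_PREFIX = "1:"     # old-style entries carrying full metadata
MINIMAL_PREFIX = "2:"  # new minimal entries carrying only the object name


def topic_entries(kind, name, metadata):
    """Return both topic entries published for one table or database."""
    key = f"{kind}:{name}"  # hypothetical key layout
    return [
        (FULL_PREFIX + key, metadata),  # full entry for legacy impalads
        (MINIMAL_PREFIX + key, None),   # minimal entry: name, no metadata
    ]


def filter_for_subscriber(entries, use_local_catalog):
    """Keep only the entries under the prefix this impalad subscribes to."""
    prefix = MINIMAL_PREFIX if use_local_catalog else FULL_PREFIX
    return [(k, v) for k, v in entries if k.startswith(prefix)]


class MetaCache:
    """Toy stand-in for a metadata cache keyed by object name."""

    def __init__(self):
        self._cache = {}

    def put(self, key, value):
        self._cache[key] = value

    def get(self, key):
        return self._cache.get(key)

    def apply_topic_update(self, entries):
        """Drop cached metadata for every object named in a minimal entry."""
        for topic_key, _ in entries:
            if topic_key.startswith(MINIMAL_PREFIX):
                self._cache.pop(topic_key[len(MINIMAL_PREFIX):], None)
```

In this toy model, an impalad running with --use_local_catalog would only
ever see the '2:' entries, and each one evicts the matching cache key so the
next access refetches fresh metadata.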

I verified the test coverage above using some of the tests in
test_ddl.py -- I selectively commented out a few of the invalidation code
paths in the new code and verified that tests failed until I
re-introduced them.

Along the way I also improved test_ddl so that, when this code is broken,
it properly fails with a timeout. It also has a bit of expanded coverage
for both the SYNC_DDL and non-SYNC cases.

One notable exception here is the implementation of SYNC_DDL for
INVALIDATE METADATA. This turned out to be complex to implement, so I
left a lengthy TODO describing the issue.

Change-Id: I615f9e6bd167b36cd8d93da59426dd6813ae4984
---
M be/src/service/impala-server.cc
M common/thrift/CatalogObjects.thrift
M fe/src/main/java/org/apache/impala/catalog/Catalog.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java
M fe/src/main/java/org/apache/impala/common/Pair.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/FeCatalogManager.java
M fe/src/test/java/org/apache/impala/catalog/local/CatalogdMetaProviderTest.java
M fe/src/test/java/org/apache/impala/catalog/local/LocalCatalogTest.java
M tests/metadata/test_ddl.py
14 files changed, 525 insertions(+), 88 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/80/11280/2
--
To view, visit http://gerrit.cloudera.org:8080/11280
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I615f9e6bd167b36cd8d93da59426dd6813ae4984
Gerrit-Change-Number: 11280
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon <t...@apache.org>
Gerrit-Reviewer: Bharath Vissapragada <bhara...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Tianyi Wang <tw...@cloudera.com>
Gerrit-Reviewer: Vuk Ercegovac <vercego...@cloudera.com>