Hello Bharath Vissapragada, Tianyi Wang, Impala Public Jenkins, Vuk Ercegovac,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/11280

to look at the new patch set (#2).

Change subject: IMPALA-7469. Invalidate LocalCatalog cache based on topic 
updates
......................................................................

IMPALA-7469. Invalidate LocalCatalog cache based on topic updates

This implements cache invalidation inside CatalogdMetaProvider. The
design is as follows:

- when the catalogd collects updates into the statestore topic, it now
  adds an additional entry for each table and database. These additional
  entries are minimal - they only include the object's name, but no
  metadata.

- the old-style topic entries are prefixed with a '1:' whereas the new
  minimal entries are prefixed with a '2:'. The impalad will subscribe
  to one or the other prefix depending on whether it is running with
  --use_local_catalog. Thus, old impalads will not be confused by the
  new entries and vice versa.

- when the impalad gets these topic updates, it forwards them through to
  the catalog implementation. The LocalCatalog implementation forwards
  them to the CatalogdMetaProvider, which uses them to invalidate
  cached metadata as appropriate.

This patch includes some basic unit tests. I also did some manual
testing by connecting to different impalads and verifying that a session
connected to impalad #1 saw the effects of DDLs made by impalad #2
within a short period of time (the statestore topic update frequency).

Existing end-to-end tests cover these code paths pretty thoroughly:

- if we didn't automatically invalidate the cache on a coordinator
  in response to DDL operations, then any test which expects to
  "read its own writes" (eg access a table after creating one)
  would fail
- if we didn't propagate invalidations via the statestore, then
  all of the tests that use sync_ddl would fail.

I verified the test coverage above using some of the tests in
test_ddl.py -- I selectively commented out a few of the invalidation
code paths in the new code and verified that tests failed until I
re-introduced them. Along the way I also improved test_ddl so that, when
this code is broken, it properly fails with a timeout. It also has a bit
of expanded coverage for both the SYNC_DDL and non-SYNC cases.

One notable exception here is the implementation of SYNC_DDL for
INVALIDATE METADATA. This turned out to be complex to implement, so I
left a lengthy TODO describing the issue.

Change-Id: I615f9e6bd167b36cd8d93da59426dd6813ae4984
---
M be/src/service/impala-server.cc
M common/thrift/CatalogObjects.thrift
M fe/src/main/java/org/apache/impala/catalog/Catalog.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java
M fe/src/main/java/org/apache/impala/common/Pair.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/FeCatalogManager.java
M fe/src/test/java/org/apache/impala/catalog/local/CatalogdMetaProviderTest.java
M fe/src/test/java/org/apache/impala/catalog/local/LocalCatalogTest.java
M tests/metadata/test_ddl.py
14 files changed, 525 insertions(+), 88 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/80/11280/2
--
To view, visit http://gerrit.cloudera.org:8080/11280
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I615f9e6bd167b36cd8d93da59426dd6813ae4984
Gerrit-Change-Number: 11280
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon <t...@apache.org>
Gerrit-Reviewer: Bharath Vissapragada <bhara...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Tianyi Wang <tw...@cloudera.com>
Gerrit-Reviewer: Vuk Ercegovac <vercego...@cloudera.com>

Reply via email to